File writing: display Unicode characters in finished file? [duplicate] - c#

I'm trying to create flash cards to memorise Japanese kanji characters, and for that I'm crawling Jitenon, a website containing tons of kanji definitions, pronunciations and meanings. I've coded up the classes that would hold the relevant information that can be found on each kanji page, and I'm currently trying to save my list of kanji as a json file.
For testing purposes, I'm trying to serialize individual kanji objects like this:
...
var kanji = scraper.GetKanjiDefinition(kanjiUrl);
var jsonOptions = new JsonSerializerOptions() { WriteIndented = true };
var kanjiJson = JsonSerializer.Serialize(kanji, jsonOptions);
File.WriteAllText("kanji_json.json", kanjiJson, Encoding.UTF8);
Here's an example page that I've crawled with its corresponding json serialization:
{
"Character": "\u697D",
"MainRadical": "\u6728",
"Strokes": 13,
"KankenLevel": "\uFF19\u7D1A",
"Education": "\u5C0F\u5B66\u6821\uFF12\u5E74\u751F",
"Meanings": [
{
"Indices": [
"\u301C"
],
"Meaning": "\u304A\u3093\u304C\u304F\u3002",
"JapanTypical": false
},
{
"Indices": [
"\u301C"
],
"Meaning": "\u304B\u306A\u3067\u308B\u3002\u97F3\u3092\u304B\u306A\u3067\u308B\u3002\u6F14\u594F\u3059\u308B\u3002",
"JapanTypical": false
},
{
"Indices": [
"\u301C"
],
"Meaning": "\u305F\u306E\u3057\u3044\u3002\u305F\u306E\u3057\u3080\u3002\u3088\u308D\u3053\u3076\u3002",
"JapanTypical": false
},
{
"Indices": [
"\u301C"
],
"Meaning": "\u3053\u306E\u3080\u3002\u611B\u3059\u308B\u3002\u306D\u304C\u3046\u3002\u6C42\u3081\u308B\u3002",
"JapanTypical": false
},
{
"Indices": [],
"Meaning": "\u65E5\u672C\u3089\u304F\u3002\u305F\u3084\u3059\u3044\u3002\u5FC3\u8EAB\u306B\u82E6\u75DB\u304C\u306A\u304F\u3001\u306E\u3073\u306E\u3073\u3059\u308B\u3002",
"JapanTypical": true
}
],
"Readings": [
{
"Reading": "\u30AC\u30AF",
"Yomi": 0,
"MeaningIndices": [
1
],
"Education": "\u5C0F"
},
{
"Reading": "\u30E9\u30AF",
"Yomi": 0,
"MeaningIndices": [
2
],
"Education": "\u5C0F"
},
{
"Reading": "\u30AE\u30E7\u30A6",
"Yomi": 0,
"MeaningIndices": [
3
],
"Education": "\u5C0F"
},
{
"Reading": "\u30B4\u30A6",
"Yomi": 0,
"MeaningIndices": [
3
],
"Education": "\u5C0F"
},
{
"Reading": "\u305F\u306E\uFF08\u3057\u3044\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
},
{
"Reading": "\u305F\u306E\uFF08\u3057\u3080\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
},
{
"Reading": "\u304B\u306A\uFF08\u3067\u308B\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
},
{
"Reading": "\u3053\u306E\uFF08\u3080\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
}
]
}
I'd like to have the actual Japanese text included in the json file, for example "Character": "楽" and "MainRadical": "木" and "KankenLevel": "9級" instead of the escaped Unicode characters like \u697D. How could I achieve this in .NET?
If it makes any difference, I'm on Ubuntu 20.04 LTS and I open my json files in VS Code 1.56.2.

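Set the Encoder in your jsonOptions (JavaScriptEncoder lives in System.Text.Encodings.Web and UnicodeRanges in System.Text.Unicode):
var jsonOptions = new JsonSerializerOptions() {
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All),
    WriteIndented = true
};
UnicodeRanges.All allows every range defined in UnicodeRanges, so the Japanese text is written as-is instead of as \uXXXX escapes.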
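To make the difference visible, here is a minimal sketch that serializes the same data with the default options and with the relaxed encoder. The dictionary is a stand-in for the asker's kanji class, which is not shown in the question, so its shape is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

// Stand-in for the kanji object from the question (hypothetical shape).
var kanji = new Dictionary<string, string>
{
    ["Character"] = "楽",
    ["MainRadical"] = "木",
};

// Default encoder: non-ASCII characters are written as \uXXXX escapes.
string defaultJson = JsonSerializer.Serialize(kanji);

// Encoder that allows all ranges in UnicodeRanges: kanji are written literally.
var options = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All),
};
string readableJson = JsonSerializer.Serialize(kanji, options);

Console.WriteLine(defaultJson);
Console.WriteLine(readableJson);
```

Even with this encoder, a few characters such as the double quote, <, > and & are still escaped, which is harmless for kanji flash cards but worth knowing when diffing output.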

Related

Deserialising JSON with arrays in arrays

I've spent a while trying to work out how to deserialise this JSON file and get each product's ID, name, collection cost, and whether it's on offer.
I'm currently attempting this with Newtonsoft.Json in a C# class.
Could someone please point me in the right direction?
Many thanks.
For the JSON text, visit:
https://pastebin.com/bkQwpsAy
{
"_embedded": {
"products": [{
"uuid": "74f1501f-4a52-490a-b2b9-668f16e2db95",
"createdAt": "2020-04-20T13:44:22.000+00:00",
"itemId": "DRN543",
"altItemGroupId": "",
"popularityScore": 101.22,
"itemName": "Coca-Cola Bottles (GB) 6x1.5L",
"jsonFriendlyItemName": "Coca-Cola Bottles (GB) 6x1.5L",
"seoFriendlyItemName": "coca-cola-bottles-gb-6x1-5l",
"brand": "Coca Cola",
"imageLinks": ["https://jjproducts.global.ssl.fastly.net/jjfoodservice/image/upload/v1588074299/products/DRN543/_S/ggktoxjzbplky4uovec4.jpg"],
"price": 9.99,
"depth": 29.0,
"height": 32.0,
"itemNetWeight": 9.0,
"jadMobileItemName": "Coca Cola (GB) 6x1.5L",
"category1Id": "100005",
"category2Id": "200261",
"category3Id": "300194",
"category4Id": "400084",
"category5Id": "",
"category1Name": "Drinks",
"category2Name": "Soft Drinks",
"category3Name": "Fizzy Drinks",
"category4Name": "Cola",
"category5Name": "",
"origin": "United Kingdom",
"catchOrigin": "",
"productDescription": "",
"sellingPoints": "Coca Cola\nCocaCola",
"shelfLife": 135,
"sizeOrCut": "6x1.5l",
"qtyPerLayer": 20.0,
"standardPalletQty": 80.0,
"unitVolume": 17632.0,
"width": 19.0,
"allergensDeclaration": "",
"storageCondition": "Store cool and dry.",
"storedAt": "2020-11-03T23:52:28.789210Z",
"branches": [{
"locationId": "EN-MW",
"locationName": "Enfield Branch",
"warehouseArea": "DA",
"warehouseZone": "GZ"
}, {
"locationId": "LS-MW",
"locationName": "Leicester Branch",
"warehouseArea": "DA",
"warehouseZone": "GZ"
}
],
"branchesBeforeLastUpdate": [{
"locationId": "EN-MW",
"locationName": "Enfield Branch",
"warehouseArea": "DA",
"warehouseZone": "GZ"
}, {
"locationId": "LS-MW",
"locationName": "Leicester Branch",
"warehouseArea": "DA",
"warehouseZone": "GZ"
}
],
"video": [],
"categoryList": "[{\"id\":\"100005\",\"name\":\"Drinks\"},{\"id\":\"200261\",\"name\":\"Soft Drinks\"},{\"id\":\"300194\",\"name\":\"Fizzy Drinks\"}]",
"categoryId": "100005,200261,300194",
"categoryName": "Drinks,Soft Drinks,Fizzy Drinks",
"categoryNormalised": "[Cola Drinks Fizzy Soft]",
"productFeatures": ["Popular", "Ambient", "Vegan", "Vegetarian"],
"unitSize": "1.5L",
"unitPriceDivider": 0.16666,
"unitPriceTypeDisplayText": "each",
"offer": {
"itemId": "DRN543",
"promoForCc": false,
"promoTagId": "Monthly",
"promoTag": "Monthly Special Promotions",
"promoEnd": "31/12/2020",
"promoDisAmt": 0,
"promoDisPct": 0,
"promoDiscountText": [],
"id": "DRN543"
},
"delivery": {
"price": 8.29,
"priceInc": 8.29,
"unitPriceDisplay": "£1.38 each",
"step": 1.0,
"max": 15.0,
"collection": false
},
"collection": {
"price": 7.29,
"priceInc": 7.29,
"unitPriceDisplay": "£1.21 each",
"step": 1.0,
"max": 15.0,
"collection": true
},
"previouslyPurchased": false,
"favourite": false,
"available": true,
"new": false,
"popular": true,
"popularOnCategory1": true,
"popularOnCategory2": true,
"popularOnCategory3": true,
"ageRestriction": false,
"halal": false,
"vegan": true,
"vegeterian": true,
"numberOfPackage": 6,
"numberOfUnitsInPackage": 1.5,
"unitType": "litre",
"CCMAltItemGroup": "",
"JJeBrand": "Coca Cola",
"JadConsumableDepth": 0.0,
"JadConsumableHeight": 0.0,
"JadConsumableWidth": 0.0,
"JJeCategory1Id": "100005",
"JJeCategory2Id": "200261",
"JJeCategory3Id": "300194",
"JJeCategory4Id": "400084",
"JJeCategory5Id": "",
"JJeCategory1": "Drinks",
"JJeCategory2": "Soft Drinks",
"JJeCategory3": "Fizzy Drinks",
"JJeCategory4": "Cola",
"JJeCategory5": "",
"JJeCookingInstruction": "Best served chilled.",
"JJeIngredients": "Carbonated Water, Sugar, Colour (Caramel E150d), Phosphoric Acid, Natural Flavourings including Caffeine.",
"JadIngredientsHTML": "Carbonated Water, Sugar, Colour (Caramel E150d), Phosphoric Acid, Natural Flavourings including Caffeine.",
"JJeOrigin": "United Kingdom",
"JadCatchOrigin": "",
"JJeProductDescription": "",
"JJeSellingPoints": "Coca Cola\nCocaCola",
"JJeShelfLife": 135,
"JJeSizeOrCut": "6x1.5L",
"JadAllergensDeclaration": "",
"JadStorageCondition": "Store cool and dry.",
"JJeEnergyKJ": 180.0,
"JJeEnergyKCAL": 42.0,
"JJeFatG": "0",
"JadFatSaturatesG": "0",
"JJeCarbohydrateG": "10.6",
"JadCarbohydrateSugarsG": "10.6",
"JJeProteinG": "0",
"JadSodiumG": "0",
"IsAgeRestriction": false,
"IsHalal": false,
"IsVegan": true,
"IsVegeterian": true
}
]
},
"_links": {
"maintenance-message": {
"href": "[]"
},
"announcement-message": {
"href": "[]"
}
},
"page": {
"size": 12,
"totalElements": 18,
"totalPages": 2,
"number": 0
}
}
public static List<JJs.ITEMS> JJSGetProductHTML(String Term)
{
    string url = "https://www.website.com/api/product-search-agg/api/v1/product/websearch?b=DG-MW&page=0&q=" + Term + "&size=12&sortType=search&format=json";
    WebClient WC = new WebClient();
    string JSON = WC.DownloadString(url);
    // Deserialise to a dynamic object so only the needed fields have to be touched.
    var onject = JsonConvert.DeserializeObject<dynamic>(JSON);
    string s = onject._embedded.products[0].ToString();
    List<JJs.ITEMS> products = new List<JJs.ITEMS>();
    foreach (var m in onject._embedded.products)
    {
        JJs.ITEMS newitem = new JJs.ITEMS();
        newitem.Name = m.jsonFriendlyItemName.ToString();
        newitem.itemId = m.itemId.ToString();
        newitem.price = m.collection.price.ToString();
        try
        {
            // m.offer is null when a product has no offer, so this throws
            // and the catch block marks the item as not on promotion.
            m.offer.ToString();
            newitem.Promoend = m.offer.promoEnd.ToString();
            newitem.Promo = true;
        }
        catch
        {
            newitem.Promo = false;
        }
        products.Add(newitem);
    }
    return products;
}
As I didn't want all the fields, I deserialised it to a dynamic object and then used ".jsonFriendlyItemName", ".itemId", etc. to get only the values I was looking for.
Thanks @JaromandaX, @Peter B and @dbc for your help.
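The same fields can also be read without dynamic and without using exceptions to detect a missing offer. The following is only a sketch using Json.NET's LINQ to JSON API: the JSON is a trimmed-down copy of the response above, and the second product (itemId "X1") is a made-up entry added to show the no-offer case.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

// Trimmed-down response; the "X1" product is hypothetical and has no offer.
string json = @"{
  ""_embedded"": { ""products"": [
    { ""itemId"": ""DRN543"", ""itemName"": ""Coca-Cola Bottles (GB) 6x1.5L"",
      ""offer"": { ""promoEnd"": ""31/12/2020"" },
      ""collection"": { ""price"": 7.29 } },
    { ""itemId"": ""X1"", ""itemName"": ""No offer item"",
      ""collection"": { ""price"": 1.5 } }
  ] }
}";

JObject root = JObject.Parse(json);

var products = root["_embedded"]["products"]
    .Select(p => new
    {
        Id = (string)p["itemId"],
        Name = (string)p["itemName"],
        CollectionPrice = (decimal)p["collection"]["price"],
        // A product counts as "on offer" when it has a non-null offer object.
        OnOffer = p["offer"] != null && p["offer"].Type != JTokenType.Null,
        // SelectToken returns null when any part of the path is missing.
        PromoEnd = (string)p.SelectToken("offer.promoEnd"),
    })
    .ToList();

foreach (var p in products)
{
    Console.WriteLine($"{p.Id} | {p.Name} | {p.CollectionPrice} | {p.OnOffer} | {p.PromoEnd}");
}
```

Checking for the offer token explicitly keeps the loop free of try/catch, which also avoids hiding unrelated errors.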

How do I get UNIQUE categories from all documents in CosmosDB?

I have millions of documents in CosmosDB using SQL API, and I need to find the unique categories from all documents.
The documents look like the one below; the categories array sits just under the description. I don't care what order the categories come back in, I just need to know every unique one across all documents in the collection. Later on I want to build queries on the categories, but that is a separate question: first I need to extract them all so I know what the possible options are. I am unable to figure out a query that returns only the category names.
{
"id": "56d934d3-90bf-4f5a-b602-e515fefa599f",
"_id": "5bf6705f9568cf00013cd13c",
"vendor": "XXX",
"updatedAt": "2018-11-23T03:55:30.044Z",
"locales": [
{
"title": "Cold shoulder t-shirt",
"description": "Because collar bones. Trending cold shoulder t-shirt in 100% organic cotton. Classic, wide and boxy t-shirt fit with cut-out details. In black, because black tees and fashion are like this (insert friendly hand gesture). This style is online exclusive.",
"categories": [
"Women",
"clothing",
"tops"
],
"brand": null,
"images": [
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_102],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_203],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_301],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[02_0659881_001_101],type[PRODUCT],device[hdpi],quality[80],ImageVersion[1.0]&call=url[file:/product/main]"
],
"country": "SE",
"currency": "SEK",
"language": "en",
"variants": [
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XXS",
"color": "Black magic"
}
},
{
"artno": "xxx",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XS",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XL",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "S",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 1,
"attributes": {
"size": "M",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "L",
"color": "Black magic"
}
}
]
}
],
"_rid": "QEwcALNbIz8GAAAAAAAAAA==",
"_self": "dbs/QEwcAA==/colls/QEwcALNbIz8=/docs/QEwcALNbIz8GAAAAAAAAAA==/",
"_etag": "\"6a0003c6-0000-0000-0000-5bf7958c0000\"",
"_attachments": "attachments/",
"_ts": 1542952332
}
Here is a test of mine; it returns all the unique category names.
Sample documents:
[
{
"id": "1",
"locales": [
{
"categories": [
"Women",
"clothing",
"tops"
]
}
]
},
{
"id": "2",
"locales": [
{
"categories": [
"Men",
"test",
"tops"
]
}
]
}
]
SQL:
SELECT distinct cat FROM c
join l in c.locales
join cat in l.categories
Output:
[
{
"cat": "Women"
},
{
"cat": "clothing"
},
{
"cat": "tops"
},
{
"cat": "Men"
},
{
"cat": "test"
}
]
If you don't want the result to be case-sensitive, just use the LOWER function in the SQL:
SELECT distinct Lower(cat) FROM c
join l in c.locales
join cat in l.categories
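If you want to get a flat array such as ["Women","clothing","tops","Men","test"], the query above does not return it in that shape. Adding the VALUE keyword (SELECT DISTINCT VALUE cat FROM c ...) returns the bare values; alternatively, you could use a stored procedure to reshape the output array.
For example, add the code below to the stored procedure.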
// "array" is assumed to hold the query output shown above: [{ "cat": "Women" }, ...]
var returnArray = [];
for (var i = 0; i < array.length; i++) {
    returnArray.push(array[i].cat);
}
return returnArray;
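The same reshaping can also be done on the client once the query has run. A minimal C# sketch, assuming the query output shown earlier has already been received as a JSON string (queryOutput is a hypothetical variable, not part of any SDK):

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Output of the DISTINCT query above, as received from the service.
string queryOutput =
    @"[{""cat"":""Women""},{""cat"":""clothing""},{""cat"":""tops""},{""cat"":""Men""},{""cat"":""test""}]";

using var doc = JsonDocument.Parse(queryOutput);

// Pull the "cat" value out of every row to get a flat list of names.
var categories = doc.RootElement
    .EnumerateArray()
    .Select(row => row.GetProperty("cat").GetString())
    .ToArray();

Console.WriteLine(string.Join(", ", categories));
// prints: Women, clothing, tops, Men, test
```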

Selenium - json - c#

var pre1 = driver.FindElementByTagName("pre").Text.Replace(@"\", "").Trim();
dynamic root = JsonConvert.DeserializeObject(pre1);
I have this JSON response:
{
"success": true,
"message": null,
"outright": false,
"eventId": 0,
"si": 111,
"leonard": [{
"catalog":[0,0,0,0,0,0],
"edit": 25965112,
"mkilo": {
"888;315;2;3;0": {
"id": 1000,
"description": "Car"
},
"888;316;2;4;0": {
"id": 1001,
"description": "Train"
},
"888;317;2;5;0": {
"id": 1002,
"description": "Airplane"
}
},
"ti": "008000",
"checkin": 254,
"searchCar": {
"id": 1000,
"description": "Car"
}
}],
"ti": 149498
}
The JSON was verified with jsonlint.
root.leonard[0].catalog.Count = 6 (OK)
but
root.leonard[0].mkilo.Count = null. Why?
I want to read the contents of mkilo.
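A likely explanation, offered as a sketch rather than a confirmed answer: mkilo is a JSON object, not an array. With dynamic, Json.NET appears to treat .Count on a JObject as a lookup of a JSON property named "Count", which does not exist here, so the result is null. Working with the typed JObject avoids that ambiguity. The JSON below is a trimmed copy of the response above.

```csharp
using System;
using Newtonsoft.Json.Linq;

// Trimmed copy of the response; only catalog and mkilo are kept.
string json = @"{ ""leonard"": [ { ""catalog"": [0,0,0,0,0,0], ""mkilo"": {
  ""888;315;2;3;0"": { ""id"": 1000, ""description"": ""Car"" },
  ""888;316;2;4;0"": { ""id"": 1001, ""description"": ""Train"" },
  ""888;317;2;5;0"": { ""id"": 1002, ""description"": ""Airplane"" } } } ] }";

JObject root = JObject.Parse(json);

// mkilo is an object keyed by strings such as "888;315;2;3;0".
var mkilo = (JObject)root["leonard"][0]["mkilo"];
int entryCount = mkilo.Count; // number of properties in the object

// Enumerate the properties to read each key and its nested values.
foreach (JProperty entry in mkilo.Properties())
{
    int id = (int)entry.Value["id"];
    string description = (string)entry.Value["description"];
    Console.WriteLine($"{entry.Name}: {id} {description}");
}
```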

JsonSerializer Isn't serializing all data

Here's some simple code that deserializes a .json file then serializes it again, making no changes to the data.
JObject json = JObject.Parse(File.ReadAllText("fileIn.json"));
JsonWriter writer = new JsonTextWriter(new StreamWriter("fileOut.json", false));
writer.Formatting = Formatting.Indented;
JsonSerializer serializer = new JsonSerializer();
serializer.Serialize(writer, json);
Everything seems to be deserialized just fine as the json JObject contains all the data but strangely not everything is being serialized.
If this is fileIn.json:
{
"metadata":{
"vertices":56
},
"influencesPerVertex":2,
"bones":[{
"parent":-1,
"name":"torso",
"scl":[1,1,1],
"pos":[-2.42144e-08,0.720174,-0.00499988],
"rotq":[0.707107,0,-0,0.707107]
},{
"parent":0,
"name":"head",
"scl":[1,1,1],
"pos":[0,0,-0.904725],
"rotq":[0,0,-0,1]
},{
"parent":0,
"name":"leftLeg",
"scl":[1,1,1],
"pos":[0.173333,-4.05163e-05,-0],
"rotq":[1,-4.37114e-08,-0,0]
}],
"skinIndices":[1,2,3],
"vertices":[1,2,3],
"skinWeights":[1,2,3],
"faces":[1,2,3],
"normals":[1,2,3],
"uvs":[]
}
Then fileOut.json will look like this:
{
"metadata": {
"vertices": 56
},
"influencesPerVertex": 2,
"bones": [
{
"parent": -1,
"name": "torso",
"scl": [
1,
1,
1
],
"pos": [
-2.42144E-08,
0.720174,
-0.00499988
],
"rotq": [
0.707107,
0,
0,
0.707107
]
},
{
"parent": 0,
"name": "head",
"scl": [
1,
1,
1
],
"pos": [
0,
0,
-0.904725
],
"rotq": [
0,
0,
0,
1
]
},
{
"parent": 0,
"name": "leftLeg",
"scl": [
1,
1,
1
],
"pos": [
0.173333,
-4.05163E-05,
0
],
"rotq": [
1,
-4.37114E-08,
0,
0
]
}
],
"skinIndices": [
1,
2,
3
],
"vertices": [
1,
2,
3
As you can see the output file is missing data towards the end. Why is this happening and how can I fix it? Thanks
You never close the output file (the StreamWriter created by new StreamWriter("fileOut.json", false)), so its buffer is never flushed to disk. That is why you don't see the whole file.
A simpler way for writing indented json back to file would be
JObject json = JObject.Parse(File.ReadAllText("fileIn.json"));
File.WriteAllText("fileOut.json", json.ToString(Newtonsoft.Json.Formatting.Indented));
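If the JsonTextWriter approach from the question is preferred, wrapping both writers in using blocks should be enough to flush and close the file. A minimal sketch, assuming a small inline JSON document and a temporary output path in place of the asker's files:

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Small stand-in for fileIn.json.
JObject json = JObject.Parse(@"{ ""skinIndices"": [1, 2, 3], ""uvs"": [] }");
string path = Path.Combine(Path.GetTempPath(), "fileOut.json");

// Disposing the writers flushes their buffers and closes the file.
using (var streamWriter = new StreamWriter(path, false))
using (var writer = new JsonTextWriter(streamWriter) { Formatting = Formatting.Indented })
{
    new JsonSerializer().Serialize(writer, json);
}

// Read the file back to confirm nothing was cut off.
string written = File.ReadAllText(path);
Console.WriteLine(written);
```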

How to count number of JSON elements in an array in C#

I'm working with the Instagram API, and when I request recent media for a hashtag using this template:
https://api.instagram.com/v1/tags/{hashtag}/media/recent
I'm receiving data like this:
{
"pagination": {
"next_max_tag_id": "any_number",
"deprecation_warning": "next_max_id and min_id are deprecated for this endpoint; use min_tag_id and max_tag_id instead",
"next_max_id": "any_number",
"next_min_id": "any_number",
"min_tag_id": "any_number",
"next_url": "https://api.instagram.com/v1/tags/{hashtag}/media/recent?access_token={my_personal_access-token}"
},
"meta": {
"code": 200
},
"data": [
{
"attribution": null,
"tags": [
"any_tag",
"any_tag1",
"any_tag2",
"any_tag3"
],
"type": "image",
"location": null,
"comments": {
"count": 0,
"data": []
},
"filter": "Normal",
"created_time": "any_number",
"link": "any_url",
"likes": {
"count": 0,
"data": []
},
"images": {
"low_resolution": {
"url": "any_url",
"width": 320,
"height": 320
},
"thumbnail": {
"url": "any_url",
"width": 150,
"height": 150
},
"standard_resolution": {
"url": "any_url",
"width": 640,
"height": 640
}
},
"users_in_photo": [],
"caption": {
"created_time": "any_number",
"text": "any_content",
"from": {
"username": "any_username",
"profile_picture": "any_url",
"id": "any_number",
"full_name": "any_full_name"
},
"id": "any_number"
},
"user_has_liked": false,
"id": "any_number",
"user": {
"username": "any_username",
"profile_picture": "any_url",
"id": "any_number",
"full_name": "any_full_name"
}
},
and so on.
As you can see, "data" is an array, and inside each of its elements there is "tags", which is also an array. How can I check the number of elements in an array nested inside another array in C#? I tried this:
JArray items = (JArray)jsonData["data[0].tags"];
int length = items.Count;
but it doesn't work. I parse JSON like this:
dynamic jsonData = JsonConvert.DeserializeObject<dynamic>(JSON_string);
var token = JToken.Parse(str);
var data = token.Value<JArray>("data");
var tags = data[0].Value<JArray>("tags");
var count = tags.Count;
You can also use a JsonPath:
var token = JToken.Parse(str);
var count = token.SelectTokens("$.data[0].tags[*]").Count();
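The original attempt fails because the indexer takes a single property name, not a path, so "data[0].tags" is looked up as one literal key and comes back null. A minimal sketch comparing the approaches on a trimmed copy of the response (note that the Count() extension method needs System.Linq):

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

// Trimmed copy of the response; only data[0].tags is kept.
string str = @"{ ""data"": [ { ""tags"": [""any_tag"", ""any_tag1"", ""any_tag2"", ""any_tag3""] } ] }";
JToken token = JToken.Parse(str);

// The indexer looks for a property literally named "data[0].tags" and finds none.
JToken viaIndexer = token["data[0].tags"];

// SelectToken understands paths, so it reaches the nested array.
var tags = (JArray)token.SelectToken("data[0].tags");
int count = tags.Count;

// Equivalent count using a JSONPath wildcard.
int countViaJsonPath = token.SelectTokens("$.data[0].tags[*]").Count();

Console.WriteLine($"{viaIndexer == null} {count} {countViaJsonPath}");
```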
