I have a large number of documents (~2 million). I need to query for the highest num for a given dictionary key.
The expected output from the two documents below would be 3 for the key "85642768".
Doc1:
{
"id": "fb41ecd9-2761-41f4-87ca-aa4bf59f9d83",
"Key": "Key123",
"Dict": {
"85642766": {
"num": 1,
"str": "str1"
},
"85642767": {
"num": 1,
"str": "str2"
},
"85642768": {
"num": 2,
"str": "str3"
}
}
}
Doc2:
{
"id": "821cf017-421a-422a-b082-45e77228dca3",
"Key": "Key456",
"Dict": {
"85642766": {
"num": 1,
"str": "str1"
},
"85642767": {
"num": 2,
"str": "str2"
},
"85642768": {
"num": 3,
"str": "str3"
}
}
}
Currently I have 250K unique documents. I am downloading the documents for a list of Key values in batches of 8000. It takes 14 minutes with 10000 RUs in the same location.
I also have an index on Key. Is there a better way to reduce the search time?
As the comment above says, you can do this with SQL.
Try a query like this:
SELECT VALUE MAX(c.Dict["85642768"].num) FROM c
Hope this helps.
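To make clear what that aggregate computes, here is a minimal Python sketch that reproduces it in memory over trimmed copies of the two sample documents. It only illustrates the semantics and is not database client code; the helper name max_num_for_key is hypothetical.

```python
from typing import Iterable, Optional


def max_num_for_key(docs: Iterable[dict], key: str) -> Optional[int]:
    """Return the largest Dict[key].num across docs, or None if no doc has the key."""
    nums = [
        doc["Dict"][key]["num"]
        for doc in docs
        if key in doc.get("Dict", {}) and "num" in doc["Dict"][key]
    ]
    return max(nums) if nums else None


# Trimmed copies of Doc1 and Doc2 from the question.
docs = [
    {"Key": "Key123", "Dict": {"85642767": {"num": 1}, "85642768": {"num": 2}}},
    {"Key": "Key456", "Dict": {"85642767": {"num": 2}, "85642768": {"num": 3}}},
]

print(max_num_for_key(docs, "85642768"))  # 3
```

Running the aggregate on the server returns a single value instead of whole documents, which is presumably where most of the time in the batch download goes.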
Related
I'm fairly new to C# and a lot of the libraries, but I'm trying to deserialize a JSON file that has a bunch of non-dynamic fields followed by a bunch of dynamically named structures with more dynamically sized/named structures underneath them.
Example of json:
"Header": {
"date": "Aug 30 2020",
"time": "14:00:00"
},
...
Other fields that will remain static
...
// Start of fields that can grow/shrink and be added to/deleted from
"Birthday": {
"MessageIndex": 0,
"Info Array Size": 3,
"Info": {
"Date": 30,
"Month": 8,
"Year": 2020
},
},
"Time of Day": {
"MessageIndex": 1,
"Info Array Size": 3,
"Info": {
"Hour": 2,
"Minute": 0,
"Second": 0
},
},
"Height": {
"MessageIndex": 2,
"Info Array Size": 2,
"Info": {
"Feet": 6,
"Inches": 2
},
},
// Continues on for any number of messages
I'm looking to create a tool that can parse and document this information without having to add a JsonConverter for each new thing that gets added. Instead, I'm looking for something that can parse any number of items (like "Birthday", "Time of Day", "Height") that have an unknown name and a field named "Info" underneath, where "Info" has an unknown number of fields whose names are also unknown when the tool runs.
I was looking to use JSON.net (or any other available library really), but if that doesn't work I'm fine parsing the file manually. I'd just rather avoid that if possible. Thanks!
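The shape described above can be read generically by treating every top-level entry as a name-to-object map and only looking for "Info" inside it. The sketch below shows that idea in Python rather than C#, purely as an illustration. It assumes the file is valid JSON (the trailing commas in the sample would have to go), and both STATIC_FIELDS and collect_messages are hypothetical names.

```python
import json

# Assumed set of known, fixed top-level names; everything else is treated as a message.
STATIC_FIELDS = {"Header"}


def collect_messages(document: dict) -> dict:
    """Map each dynamically named entry to a copy of its Info fields."""
    messages = {}
    for name, body in document.items():
        if name in STATIC_FIELDS:
            continue
        if isinstance(body, dict) and isinstance(body.get("Info"), dict):
            messages[name] = dict(body["Info"])
    return messages


raw = """
{
  "Header": {"date": "Aug 30 2020", "time": "14:00:00"},
  "Birthday": {"MessageIndex": 0, "Info Array Size": 3,
               "Info": {"Date": 30, "Month": 8, "Year": 2020}},
  "Height": {"MessageIndex": 2, "Info Array Size": 2,
             "Info": {"Feet": 6, "Inches": 2}}
}
"""

messages = collect_messages(json.loads(raw))
for name, info in messages.items():
    print(name, info)
```

No per-message type is needed, because neither the entry names nor the field names under "Info" are known to the code in advance.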
After a bunch of research I haven't been able to find any good way to read a JSON file, store its values, then append a new object/array to it.
The JSON looks like
{
"Skywars": [
{
"Solo Normal": [
{
"000001": [
{
"Kills": 213,
"Deaths": 117
}
]
}
],
"Solo Insane": [
{
"000001": [
{
"Kills": 10790,
"Deaths": 7184
}
]
}
]
}
],"Bedwars": [
{
"Solo": [
{
"000001": [
{
"Kills": 0,
"Deaths": 0
}
]
}
],
"Duos": [
{
"000001": [
{
"Kills": 0,
"Deaths": 0
}
]
}
]
}
]
}
As an example, I intend for it to go through "Skywars.Solo Normal", "Skywars.Solo Insane", "Bedwars.Solo", "Bedwars.Duos" and then append "000002" with new kills and deaths values.
For some reason, even after hours of searching, I can't find out how to read the kills and deaths (I've gotten close, using public Skywars[] Skywars { get; set; }). The problem is that most of the examples use JSON files that look like {"user":[{"id":1,"logins":0}]}, with very few arrays and sub-arrays.
To anyone who is kind enough to answer, please don't spoonfeed me code, explain how it would be done (would I need to create my own parser, etc), or if there are already any posts/links that answer my question (even though I failed to find how).
Notes -
"000001" and "000002" will be dynamic, so each time you start the program those values will be different. I just want to append after the last instance of stats saved.
Also, sorry, I am still learning C#. I know most of the basics and some more complex concepts, but I've just never been good at storing data and using JSON. If you need anything else to help, just add a comment and I'll add it.
Please see Json.NET's Modifying JSON.
This sample loads JSON, modifies JObject and JArray instances and then writes the JSON back out again.
Sample
string json = @"{
'channel': {
'title': 'Star Wars',
'link': 'http://www.starwars.com',
'description': 'Star Wars blog.',
'obsolete': 'Obsolete value',
'item': []
}
}";
JObject rss = JObject.Parse(json);
JObject channel = (JObject)rss["channel"];
channel["title"] = ((string)channel["title"]).ToUpper();
channel["description"] = ((string)channel["description"]).ToUpper();
channel.Property("obsolete").Remove();
channel.Property("description").AddAfterSelf(new JProperty("new", "New value"));
JArray item = (JArray)channel["item"];
item.Add("Item 1");
item.Add("Item 2");
Console.WriteLine(rss.ToString());
// {
// "channel": {
// "title": "STAR WARS",
// "link": "http://www.starwars.com",
// "description": "STAR WARS BLOG.",
// "new": "New value",
// "item": [
// "Item 1",
// "Item 2"
// ]
// }
// }
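The same load, modify, write-back pattern can be applied to the nested stats layout from the question. The sketch below uses Python rather than C#, only to show the navigation steps. append_stats is a hypothetical helper, and the new entry ID is passed in by the caller because the question says those IDs are dynamic.

```python
import json


def append_stats(data: dict, game: str, mode: str,
                 entry_id: str, kills: int, deaths: int) -> None:
    """Append {entry_id: [{Kills, Deaths}]} to the array at data[game][0][mode]."""
    # Each game is a one-element array holding an object keyed by mode name.
    mode_entries = data[game][0][mode]
    mode_entries.append({entry_id: [{"Kills": kills, "Deaths": deaths}]})


raw = '{"Skywars": [{"Solo Normal": [{"000001": [{"Kills": 213, "Deaths": 117}]}]}]}'
data = json.loads(raw)

append_stats(data, "Skywars", "Solo Normal", "000002", kills=5, deaths=3)
print(json.dumps(data, indent=2))
```

The key point is that every level alternates between an array and an object, so reading Kills means indexing the array first ([0]) and then the named property.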
I'm using the NEST library in .NET for querying. I have mapped a property, CurrentProductStatus, as a string. A sample value of CurrentProductStatus in a document looks like this:
"OldStatus|Scanned: [PURCHASE] Recieved|0|#f6f6f6"
So I have to filter the results by the second pipe-separated segment (e.g. "Scanned: [PURCHASE] Recieved"). I tried the Standard Analyzer, but it didn't work for me.
QueryContainer query = null;
query &= Query<SearchProduct>.Match(m => m.Field(f => f.CurrentProductStatus).Query(searchParameters.ProductStatus).Analyzer("standard"));
This is the part of my code where I run the query.
Any ideas on how to do the searching and filtering?
Using an ingest node with a pipeline looks like a good fit here: it can parse separate fields out of the CurrentProductStatus field and add them to the document that is indexed. You can then query on these fields.
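As a rough illustration of that parsing step (plain Python, not an Elasticsearch pipeline definition), the sketch below splits the stored value on | and exposes the second segment for an exact comparison. The field names old_status, status, flag, and color are hypothetical, since the question does not name the segments.

```python
# Hypothetical names for the four pipe-separated segments.
SEGMENT_NAMES = ("old_status", "status", "flag", "color")


def split_status(raw: str) -> dict:
    """Split a pipe-separated CurrentProductStatus value into named parts."""
    return dict(zip(SEGMENT_NAMES, raw.split("|")))


doc = split_status("OldStatus|Scanned: [PURCHASE] Recieved|0|#f6f6f6")
print(doc["status"])  # Scanned: [PURCHASE] Recieved
```

Once the segment lives in its own field, it can be filtered by exact value instead of through an analyzed full-text match.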
The alternative is to define an analyzer that tokenizes the input in a way conducive to your use case. For example, the Standard Analyzer tokenizes the value in the following way:
GET _analyze
{
"text": ["OldStatus|Scanned: [PURCHASE] Recieved|0|#f6f6f6"]
}
---
{
"tokens": [
{
"token": "oldstatus",
"start_offset": 0,
"end_offset": 9,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "scanned",
"start_offset": 10,
"end_offset": 17,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "purchase",
"start_offset": 20,
"end_offset": 28,
"type": "<ALPHANUM>",
"position": 2
},
{
"token": "recieved",
"start_offset": 30,
"end_offset": 38,
"type": "<ALPHANUM>",
"position": 3
},
{
"token": "0",
"start_offset": 39,
"end_offset": 40,
"type": "<NUM>",
"position": 4
},
{
"token": "f6f6f6",
"start_offset": 42,
"end_offset": 48,
"type": "<ALPHANUM>",
"position": 5
}
]
}
With the Standard Analyzer applied at query time to Scanned: [PURCHASE] Recieved,
GET _analyze
{
"text": ["Scanned: [PURCHASE] Recieved"]
}
----
{
"tokens": [
{
"token": "scanned",
"start_offset": 0,
"end_offset": 7,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "purchase",
"start_offset": 10,
"end_offset": 18,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "recieved",
"start_offset": 20,
"end_offset": 28,
"type": "<ALPHANUM>",
"position": 2
}
]
}
All three query tokens appear among the indexed tokens, so a match query will match this document.
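The overlap between the two token lists can be checked with a crude stand-in for the analyzer. The sketch below lowercases the text and splits on non-alphanumeric characters, which happens to reproduce the tokens shown above for these two inputs. It is an approximation for illustration only, not the Standard Analyzer's actual algorithm, and rough_tokens is a hypothetical name.

```python
import re


def rough_tokens(text: str) -> list:
    """Lowercase and split on non-alphanumerics; a crude approximation only."""
    return re.findall(r"[a-z0-9]+", text.lower())


indexed = rough_tokens("OldStatus|Scanned: [PURCHASE] Recieved|0|#f6f6f6")
query = rough_tokens("Scanned: [PURCHASE] Recieved")

print(indexed)
print(query)
print(set(query) <= set(indexed))  # True: every query token is in the indexed value
```

Because a match query combines its tokens with OR by default, a document containing only one of these tokens in any segment would match as well, which is why splitting the value into separate fields gives more precise filtering.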
I have used "/revisions" to get all the versions of a file, but the revision numbers I get back are not sequential (I get values like 1, 4, 11, 15).
Please see the response below for a file with 2 versions.
For this I expected 1, 2.
[
{
"revision": 4,
"rev": "40000000d",
"thumb_exists": false,
"bytes": 0,
"modified": "Wed, 20 Jul 2011 22:41:09 +0000",
"path": "/hi2",
"is_dir": false,
"icon": "page_white",
"root": "app_folder",
"mime_type": "application/octet-stream",
"size": "0 bytes"
},
{
"revision": 1,
"rev": "10000000d",
"thumb_exists": false,
"bytes": 3,
"modified": "Wed, 20 Jul 2011 22:40:43 +0000",
"path": "/hi2",
"is_dir": false,
"icon": "page_white",
"root": "app_folder",
"mime_type": "application/octet-stream",
"size": "3 bytes"
}
]
Here is my sample code:
OAuthUtility.GetAsync
(
"https://api.dropboxapi.com/1/revisions/auto/",
new HttpParameterCollection
{
{ "path", CurrentPath },
{ "access_token",accessToken },
{ "rev_limit", 1000 }
},
callback: GetFilesRevisions_Results
);
Can you please help me ?
Thanks in advance!
The revision field is deprecated and shouldn't be used. You should use the rev field instead. The rev is not a number and should be treated as opaque.
When you call /revisions, you get the revisions in reverse chronological order, so the first one is the most recent revision.
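If sequential version numbers are needed for display, one option is to derive them from the order of the response rather than from revision. The Python sketch below assumes, as stated above, that the list arrives newest first. It keeps rev as an opaque string, and number_versions is a hypothetical helper.

```python
def number_versions(revisions: list) -> list:
    """Return (version_number, rev) pairs, oldest first, numbered from 1."""
    oldest_first = list(reversed(revisions))  # the response is newest first
    return [(i, entry["rev"]) for i, entry in enumerate(oldest_first, start=1)]


# Trimmed copy of the response from the question.
response = [
    {"rev": "40000000d", "modified": "Wed, 20 Jul 2011 22:41:09 +0000"},
    {"rev": "10000000d", "modified": "Wed, 20 Jul 2011 22:40:43 +0000"},
]

print(number_versions(response))  # [(1, '10000000d'), (2, '40000000d')]
```

The display number is purely local; any later API call that refers to a specific version should still pass the original rev value.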
I have two JSON strings, both posted here. The first is converted from a C# DataTable using the Newtonsoft DLL. The second is a manually written string. If I use the second string, the chart displays fine; with the first, the chart is not displayed. I found that the problem is that "value" and "y" are strings in the first JSON string. Kindly help me change the first one to look like the second.
1)
[
{
"name": "CHE-CORPORATE",
"value": "42",
"y": "11.8"
},
{
"name": "CHE-TELUGU",
"value": "123",
"y": "10.8"
},
{
"name": "CHE-MALAYALAM",
"value": "13",
"y": "23.8"
}
]
2)
[
{ "name": "CHE-TELUGU",
"value": 123,
"y": 10.8
},
{
"name": "CHE-CORPORATE",
"value": 45,
"y": 40.8
},
{
"name": "CHE-MALAYALAM",
"value": 155,
"y": 12.8
}
]
Just convert the strings to numbers:
$.each(data,function(key,val){
val.value=+val.value; // convert the string to number
val.y=+val.y;
});
console.log(data);