I am new to MongoDB and I am trying to use it in a C# context. Let's say I have documents like this:
[
{
"Number": "2140007529",
"Name": "ABC",
"IsInactive": true,
"EntryList": [
{
"Timestamp": "2022-06-01T14:00:00.000+00:00",
"Value": 21564.0
},
{
"Timestamp": "2022-07-01T21:31:00.000+00:00",
"Value": 21568.0
},
{
"Timestamp": "2022-08-02T21:21:00.000+00:00",
"Value": 21581.642
},
{
"Timestamp": "2022-09-02T15:42:00.000+00:00",
"Value": 21593.551
},
{
"Timestamp": "2022-09-26T13:00:00.000+00:00",
"Value": 21603
}
]
},
{
"Number": "2220000784",
"Name": "XYZ",
"IsInactive": false,
"EntryList": [
{
"Timestamp": "2022-09-26T13:00:00.000+00:00",
"Value": 0.0
},
{
"Timestamp": "2022-10-01T08:49:00.000+00:00",
"Value": 5.274
},
{
"Timestamp": "2022-11-01T09:56:00.000+00:00",
"Value": 76.753
},
{
"Timestamp": "2022-12-01T19:43:00.000+00:00",
"Value": 244.877
},
{
"Timestamp": "2023-01-01T11:54:00.000+00:00",
"Value": 528.56
},
{
"Timestamp": "2023-02-01T17:21:00.000+00:00",
"Value": 802.264
}
]
}
]
I want to get the document where the IsInactive flag is false, but the EntryList should only contain entries with a Timestamp greater than 2022-12-31. It should look like this:
{
"Number": "2220000784",
"Name": "XYZ",
"IsInactive": false,
"EntryList": [
{
"Timestamp": "2023-01-01T11:54:00.000+00:00",
"Value": 528.56
},
{
"Timestamp": "2023-02-01T17:21:00.000+00:00",
"Value": 802.264
}
]
}
So, here is my question: how can I filter nested arrays in the return value with C#? Thanks for your help!
I tried to get the result with MongoDB's aggregate function in MongoDB Compass. I got it working there, but not in C#.
I think you are looking for a query similar to this one.
So you can try something like this code:
var desiredTimestamp = new DateTime(2022, 12, 31);
var results = collection.AsQueryable()
    .Where(x => x.IsInactive == false && x.EntryList.Any(e => e.Timestamp >= desiredTimestamp))
    .Select(obj => new
    {
        Number = obj.Number,
        Name = obj.Name,
        IsInactive = obj.IsInactive,
        EntryList = obj.EntryList
            .Where(e => e.Timestamp >= desiredTimestamp)
            .ToList()
    }).ToList();
Note that I'm assuming your Timestamp is stored as a date type; otherwise you can't compare a date with a string.
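For reference, here is a minimal sketch of the POCO model that the query above assumes (the class names, the collection name and the Id property are my own placeholders, not taken from your code); it needs MongoDB.Bson and MongoDB.Driver:
public class Entry
{
    public DateTime Timestamp { get; set; }
    public double Value { get; set; }
}

public class Account // hypothetical name for the top-level document
{
    public ObjectId Id { get; set; }
    public string Number { get; set; }
    public string Name { get; set; }
    public bool IsInactive { get; set; }
    public List<Entry> EntryList { get; set; }
}

// "accounts" is a placeholder collection name
var collection = database.GetCollection<Account>("accounts");
Depending on the driver version, the Select above should be translated server-side into a $filter projection, so only the matching EntryList entries come back over the wire; if your driver doesn't support that translation, the equivalent aggregation pipeline with $match and a $project/$filter stage is the fallback.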
I have millions of documents in CosmosDB using SQL API, and I need to find the unique categories from all documents.
The documents look like the following; you can see the categories array just under the description. I don't care what order they are in, I just need to know all the unique ones across all documents in the collection. I need this so that later on I can create queries on the categories, but that's a later question; first I need to get them all out so I know what all the possible options are. However, I am unable to figure out the query to do this so that I get only the category names.
{
"id": "56d934d3-90bf-4f5a-b602-e515fefa599f",
"_id": "5bf6705f9568cf00013cd13c",
"vendor": "XXX",
"updatedAt": "2018-11-23T03:55:30.044Z",
"locales": [
{
"title": "Cold shoulder t-shirt",
"description": "Because collar bones. Trending cold shoulder t-shirt in 100% organic cotton. Classic, wide and boxy t-shirt fit with cut-out details. In black, because black tees and fashion are like this (insert friendly hand gesture). This style is online exclusive.",
"categories": [
"Women",
"clothing",
"tops"
],
"brand": null,
"images": [
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_102],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_203],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_301],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[02_0659881_001_101],type[PRODUCT],device[hdpi],quality[80],ImageVersion[1.0]&call=url[file:/product/main]"
],
"country": "SE",
"currency": "SEK",
"language": "en",
"variants": [
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XXS",
"color": "Black magic"
}
},
{
"artno": "xxx",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XS",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XL",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "S",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 1,
"attributes": {
"size": "M",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "L",
"color": "Black magic"
}
}
]
}
],
"_rid": "QEwcALNbIz8GAAAAAAAAAA==",
"_self": "dbs/QEwcAA==/colls/QEwcALNbIz8=/docs/QEwcALNbIz8GAAAAAAAAAA==/",
"_etag": "\"6a0003c6-0000-0000-0000-5bf7958c0000\"",
"_attachments": "attachments/",
"_ts": 1542952332
}
Please see my test; it gets all the unique category names.
Sample documents:
[
{
"id": "1",
"locales": [
{
"categories": [
"Women",
"clothing",
"tops"
]
}
]
},
{
"id": "2",
"locales": [
{
"categories": [
"Men",
"test",
"tops"
]
}
]
}
]
SQL:
SELECT DISTINCT cat FROM c
JOIN l IN c.locales
JOIN cat IN l.categories
Output:
[
{
"cat": "Women"
},
{
"cat": "clothing"
},
{
"cat": "tops"
},
{
"cat": "Men"
},
{
"cat": "test"
}
]
If you don't want it to be case sensitive, just use the LOWER function in SQL.
SELECT DISTINCT LOWER(cat) FROM c
JOIN l IN c.locales
JOIN cat IN l.categories
If you want to get ["Women","clothing","tops","Men","test"] directly, it can't be returned as a plain array by a single SQL query; you could use a stored procedure to parse the output into an array.
For example, add the code below to a stored procedure.
// "array" is assumed to hold the result of the DISTINCT query above,
// i.e. items of the form { "cat": "..." }
var returnArray = [];
for (var i = 0; i < array.length; i++) {
    returnArray.push(array[i].cat);
}
return returnArray;
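If you would rather collect the unique values from C# instead of a stored procedure, a minimal sketch with the Microsoft.Azure.Cosmos v3 SDK could look like the following; the database/container names and the CatRow class are placeholders, and it should run inside an async method:
public class CatRow { public string cat { get; set; } }

var container = cosmosClient.GetContainer("mydb", "products"); // placeholder names
var query = new QueryDefinition(
    "SELECT DISTINCT cat FROM c JOIN l IN c.locales JOIN cat IN l.categories");

var categories = new HashSet<string>();
var iterator = container.GetItemQueryIterator<CatRow>(query);
while (iterator.HasMoreResults)
{
    // each page contains items of the shape { "cat": "..." }
    foreach (var row in await iterator.ReadNextAsync())
    {
        categories.Add(row.cat); // the HashSet keeps only unique names
    }
}
After the loop, categories holds the five unique names from the sample documents above.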
I have a search page with two search result types: a summary result and a concrete result.
The summary result page contains the top 3 results per category (top hits).
The concrete result page contains all results for the selected category.
To obtain the summary page I use this request:
var searchDescriptor = new SearchDescriptor<ElasticType>();
searchDescriptor.Index("index_name")
.Query(q =>
q.MultiMatch(m => m
.Fields(fs => fs
.Field(f => f.Content1, 3)
.Field(f => f.Content2, 2)
.Field(f => f.Content3, 1))
.Fuzziness(Fuzziness.EditDistance(1))
.Query(query)
.Boost(1.1)
.Slop(2)
.PrefixLength(1)
.MaxExpansions(100)
.Operator(Operator.Or)
.MinimumShouldMatch(2)
.FuzzyRewrite(RewriteMultiTerm.ConstantScoreBoolean)
.TieBreaker(1.0)
.CutoffFrequency(0.5)
.Lenient()
.ZeroTermsQuery(ZeroTermsQuery.All))
&& (q.Terms(t => t.Field(f => f.LanguageId).Terms(1)) || q.Terms(t => t.Field(f => f.LanguageId).Terms(0))))
.Aggregations(a => a
.Terms("category", tagd => tagd
.Field(f => f.Category)
.Size(10)
.Aggregations(aggs => aggs.TopHits("top_tag_hits", t => t.Size(3)))))
.FielddataFields(fs => fs
.Field(p => p.Content1, 3)
.Field(p => p.Content2, 2)
.Field(p => p.Content3, 1));
var elasticResult = _elasticClient.Search<ElasticType>(_ => searchDescriptor);
And I get a result, for example:
{
"aggregations": {
"category": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [{
"key": "category1",
"doc_count": 40,
"top_tag_hits": {
"hits": {
"total": 40,
"max_score": 5.4,
"hits": [{
"_index": "...",
"_type": "...",
"_id": "...",
"_score": 5.4,
"_source": {
"id": 1
}
},
{
"_index": "...",
"_type": "...",
"_id": "...",
"_score": 4.3,
"_source": {
"id": 3 // FAIL!
}
},
{
"_index": "...",
"_type": "...",
"_id": "...",
"_score": 4.3,
"_source": {
"id": 2
}
}]
}
}
}]
}
}
}
So I get a few hits with the same _score.
To obtain the concrete result page (by category) I use this request:
var searchDescriptor = new SearchDescriptor<ElasticType>();
searchDescriptor.Index("index_name")
.Size(perPage <= 0 ? 100 : perPage)
.From(page * perPage)
.Query(q => q
.MultiMatch(m => m
.Fields(fs => fs
.Field(f => f.Content1, 3)
.Field(f => f.Content2, 2)
.Field(f => f.Content3, 1)
.Field(f => f.Category))
.Fuzziness(Fuzziness.EditDistance(1))
.Query(searchRequest.Query)
.Boost(1.1)
.Slop(2)
.PrefixLength(1)
.MaxExpansions(100)
.Operator(Operator.Or)
.MinimumShouldMatch(2)
.FuzzyRewrite(RewriteMultiTerm.ConstantScoreBoolean)
.TieBreaker(1.0)
.CutoffFrequency(0.5)
.Lenient()
.ZeroTermsQuery(ZeroTermsQuery.All))
&& q.Term(t => t.Field(f => f.Category).Value(searchRequest.Category))
&& (q.Terms(t => t.Field(f => f.LanguageId).Terms(1)) || q.Terms(t => t.Field(f => f.LanguageId).Terms(0))))
.FielddataFields(fs => fs
.Field(p => p.Content1, 3)
.Field(p => p.Content2, 2)
.Field(p => p.Content3, 1))
.Aggregations(a => a
.Terms("category", tagd => tagd
.Field(f => f.Category)));
And the result is something like this:
{
"hits": {
"total": 40,
"max_score": 7.816723,
"hits": [{
"_index": "...",
"_type": "...",
"_id": "...",
"_score": 7.816723,
"_source": {
"id": 1
}
},
{
"_index": "...",
"_type": "...",
"_id": "...",
"_score": 6.514713,
"_source": {
"id": 2
}
},
{
"_index": "...",
"_type": "...",
"_id": "...",
"_score": 6.514709,
"_source": {
"id": 3
}
}]
}
}
So in the second case, for a specific category, I get the _score with high precision and Elasticsearch can easily sort the results correctly. But in the aggregation case there are results with the same _score, and it is not clear how the sorting works then.
Can someone point me in the right direction to solve this problem? Or how can I achieve the same order in the results? Maybe I can increase the accuracy of the aggregated results?
I use Elasticsearch server version 5.3.0 and NEST library version 5.0.0.
Update:
Native query for aggregation request:
{
"fielddata_fields": [
"content1^3",
"content2^2",
"content3^1"
],
"aggs": {
"category": {
"terms": {
"field": "category",
"size": 10
},
"aggs": {
"top_tag_hits": {
"top_hits": {
"size": 3
}
}
}
}
},
"query": {
"bool": {
"must": [
{
"multi_match": {
"boost": 1.1,
"query": "sparta",
"fuzzy_rewrite": "constant_score_boolean",
"fuzziness": 1,
"cutoff_frequency": 0.5,
"prefix_length": 1,
"max_expansions": 100,
"slop": 2,
"lenient": true,
"tie_breaker": 1.0,
"minimum_should_match": 2,
"operator": "or",
"fields": [
"content1^3",
"content2^2",
"content3^1"
],
"zero_terms_query": "all"
}
},
{
"bool": {
"should": [
{
"terms": {
"languageId": [
1
]
}
},
{
"terms": {
"languageId": [
0
]
}
}
]
}
}
]
}
}
}
Native query for concrete request:
{
"from": 0,
"size": 100,
"fielddata_fields": [
"content1^3",
"content2^2",
"content3^1"
],
"aggs": {
"category": {
"terms": {
"field": "category"
}
}
},
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"multi_match": {
"boost": 1.1,
"query": ".....",
"fuzzy_rewrite": "constant_score_boolean",
"fuzziness": 1,
"cutoff_frequency": 0.5,
"prefix_length": 1,
"max_expansions": 100,
"slop": 2,
"lenient": true,
"tie_breaker": 1.0,
"minimum_should_match": 2,
"operator": "or",
"fields": [
"content1^3",
"content2^2",
"content3^1",
"category"
],
"zero_terms_query": "all"
}
},
{
"term": {
"category": {
"value": "category1"
}
}
}
]
}
},
{
"bool": {
"should": [
{
"terms": {
"languageId": [
1
]
}
},
{
"terms": {
"languageId": [
0
]
}
}
]
}
}
]
}
}
}
Also, I use the following mapping when creating the index:
var descriptor = new CreateIndexDescriptor(indexName)
.Mappings(ms => ms
.Map<ElasticType>(m => m
.Properties(ps => ps
.Keyword(s => s.Name(ecp => ecp.Title))
.Text(s => s.Name(ecp => ecp.Content1))
.Text(s => s.Name(ecp => ecp.Content2))
.Text(s => s.Name(ecp => ecp.Content3))
.Date(s => s.Name(ecp => ecp.Date))
.Number(s => s.Name(ecp => ecp.LanguageId).Type(NumberType.Integer))
.Keyword(s => s.Name(ecp => ecp.Category))
.Text(s => s.Name(ecp => ecp.PreviewImageUrl).Index(false))
.Text(s => s.Name(ecp => ecp.OptionalContent).Index(false))
.Text(s => s.Name(ecp => ecp.Url).Index(false)))));
_elasticClient.CreateIndex(indexName, _ => descriptor);
Your query has problems.
What you are using is a combination of must and should clauses inside a must, as part of a bool query.
If you read more at this link, you can see that for must:
The clause (query) must appear in matching documents and will contribute to the score.
So it will give equal scoring to all documents that match the condition, and any document that doesn't match won't even be in the results to be scored.
What you should do is use the should query outside of the must query, so that Elasticsearch will be able to score your documents correctly.
Regarding this part of the question:
Can someone point me in the right direction to solve this problem?
you should pass 'explain': true in the query. You can read more about explain query and how to interpret results in this link.
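With NEST, which the question uses, explain is just a flag on the search descriptor; a one-line sketch against the searchDescriptor variable from the question:
searchDescriptor.Explain();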
The answer to this question:
how can I achieve the same order in the results?
As every score is the same, Elasticsearch can sort the results in whatever order it receives the responses from its nodes.
Possible Solution:
You should reorganize your query to make real use of the should query and its boosting capabilities. You can read more about boosting here.
I tried two queries similar to yours, but with correct usage of should, and they gave me the same order as expected. Both of your queries should be constructed as below:
{
"from": 0,
"size": 10,
"_source": [
"content1^3",
"content2^2",
"content3^1"
],
"query": {
"bool": {
"should": [
{
"match": {
"languageId": 1
}
},
{
"match": {
"languageId": 0
}
}
],
"must": [
{
"multi_match": {
"boost": 1.1,
"query": ".....",
"fuzzy_rewrite": "constant_score_boolean",
"fuzziness": 1,
"cutoff_frequency": 0.5,
"prefix_length": 1,
"max_expansions": 100,
"slop": 2,
"lenient": true,
"tie_breaker": 1,
"minimum_should_match": 2,
"operator": "or",
"fields": [
"content1^3",
"content2^2",
"content3^1",
"category"
],
"zero_terms_query": "all"
}
}
]
}
}
}
and the second query as:
{
"size": 0,
"query": {
"bool": {
"should": [
{
"match": {
"languageId": 1
}
},
{
"match": {
"languageId": 0
}
}
],
"must": [
{
"multi_match": {
"boost": 1.1,
"query": ".....",
"fuzzy_rewrite": "constant_score_boolean",
"fuzziness": 1,
"cutoff_frequency": 0.5,
"prefix_length": 1,
"max_expansions": 100,
"slop": 2,
"lenient": true,
"tie_breaker": 1,
"minimum_should_match": 2,
"operator": "or",
"fields": [
"content1^3",
"content2^2",
"content3^1",
"category"
],
"zero_terms_query": "all"
}
}
]
}
},
"aggs": {
"categories": {
"terms": {
"field": "category",
"size": 10
},
"aggs": {
"produdtcs": {
"top_hits": {
"_source": [
"content1^3",
"content2^2",
"content3^1"
],
"size": 3
}
}
}
}
}
}
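For the NEST side of the question, here is a rough sketch of how the aggregation query above could be built in C# (NEST 5.x, the same ElasticType and field names as in the question; the extra multi_match options are dropped for brevity, so treat it as a starting point rather than the exact request):
var searchDescriptor = new SearchDescriptor<ElasticType>();
searchDescriptor.Index("index_name")
    .Size(0)
    .Query(q => q
        .Bool(b => b
            .Should(
                sh => sh.Match(m => m.Field(f => f.LanguageId).Query("1")),
                sh => sh.Match(m => m.Field(f => f.LanguageId).Query("0")))
            .Must(mu => mu
                .MultiMatch(m => m
                    .Fields(fs => fs
                        .Field(f => f.Content1, 3)
                        .Field(f => f.Content2, 2)
                        .Field(f => f.Content3, 1)
                        .Field(f => f.Category))
                    .Query(query)))))
    .Aggregations(a => a
        .Terms("categories", t => t
            .Field(f => f.Category)
            .Size(10)
            .Aggregations(aa => aa
                .TopHits("top_tag_hits", th => th.Size(3)))));
The key difference from the original C# code is that the language filter sits in Should at the top level of the bool query instead of being &&-combined inside must, which is the restructuring described above.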
I created an Elasticsearch index, and the result of a simple search looks like this:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 11,
"max_score": 1,
"hits": [
{
"_index": "shop-bestellung",
"_type": "bestellung",
"_id": "dc144b04-8e73-4ea5-9f73-95c01768fd26",
"_score": 1,
"_source": {
"id": "dc144b04-8e73-4ea5-9f73-95c01768fd26",
"bestellnummer": "B-20170302-026",
"shopid": "0143d767-8986-432a-a15d-00e1c4862b24",
"shopname": "DeeDa",
"erstelltVon": "5663bb4b-fc44-46ca-b875-a3487b588b24",
"bestellername": "Max Mann",
"bestelldatum": "2017-01-30T23:00:00Z",
"bestellpositionen": []
}
}
]
}
}
I tried to create a filter which should consist of the following three restrictions:
Query text
Date range
Filter on a specific field: "erstelltVon"
My filter currently only consists of the query text and the date range:
{
"query":{
"query_string":{
"fields":[
"bestellnummer",
"bestellername",
"bestelldatum",
"erstelltVon",
"bestellpositionen.artikelname",
"bestellpositionen.artikelnummer",
"bestellpositionen.referenznummer"
],
"query":"*"
}
},
"filter": {
"range" : {
"bestelldatum" : {
"gte": "2017-02-04T23:00:00Z",
"lte": "now",
"time_zone": "+01:00"
}
}
}
}
I would like to add the third filter:
"erstelltVon": "5663bb4b-fc44-46ca-b875-a3487b588b24"
How can I do that?
You need to use a boolean filter.
Here is how to use it:
"filter": {
"bool" : {
"must": [
// FIRST FILTER
{
"range" : {
"bestelldatum" : {
"gte": "2017-02-04T23:00:00Z",
"lte": "now",
"time_zone": "+01:00"
}
}
},
{
// YOUR OTHER FILTER HERE
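// for example, the term filter asked for in the question ("erstelltVon" is
// assumed to be a not_analyzed / keyword field, so the exact value matches):
"term": {
"erstelltVon": "5663bb4b-fc44-46ca-b875-a3487b588b24"
}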
}
]
}
}
Change "must" to "should" if you want an OR instead of an AND.
I have been searching here and I didn't find anything similar... However, I apologize in advance if it has escaped me, and I hope you can help me find the right direction.
I was looking for a way to implement the following in NEST C#:
"aggs": {
"sys_created_on_max": {
"max": {
"field": "sys_created_on"
}
},
"sys_created_on_min":{
"min": {
"field": "sys_created_on"
}
},
"sys_updated_on_max": {
"max": {
"field": "sys_updated_on"
}
},
"sys_updated_on_min":{
"min": {
"field": "sys_updated_on"
}
}
}
Meaning that I want to get, in the same statement:
the max and min aggregated values for the "sys_created_on" field
and also
the max and min aggregated values for the "sys_updated_on" field
Thanks!
What you want is the Stats aggregation, which returns the min, max, avg, sum and count in one go.
Here is an example input/output.
INPUT
GET devdev/redemption/_search
{
"size": 0,
"aggs": {
"a1": {
"stats": {
"field": "reporting.campaign.endDate"
}
}
}
}
Result
{
"took": 97,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 146,
"max_score": 0,
"hits": []
},
"aggregations": {
"a1": {
"count": 11,
"min": 1443675599999,
"max": 1446353999999,
"avg": 1445607818180.818,
"sum": 15901685999989,
"min_as_string": "1443675599999",
"max_as_string": "1446353999999",
"avg_as_string": "1445607818180",
"sum_as_string": "15901685999989"
}
}
}
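Since the question is about NEST, the same stats aggregations can be expressed roughly like this (a sketch assuming NEST 2.x/5.x and a POCO whose SysCreatedOn/SysUpdatedOn properties map to the two date fields; the type and aggregation names are placeholders):
var response = client.Search<MyDoc>(s => s
    .Size(0)
    .Aggregations(a => a
        .Stats("sys_created_on_stats", st => st.Field(f => f.SysCreatedOn))
        .Stats("sys_updated_on_stats", st => st.Field(f => f.SysUpdatedOn))));

// min and max for each field come back on the same response
var createdStats = response.Aggs.Stats("sys_created_on_stats");
var min = createdStats.Min;
var max = createdStats.Max;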
I've figured it out. In case someone else has the same doubt:
1) Create an aggregations selector (a Func over AggregationContainerDescriptor):
Func<AggregationContainerDescriptor<dynamic>, IAggregationContainer> aggregationsSelector = null;
2) Fill it up:
foreach (var field in requestList)
{
aggregationsSelector += ms => ms.Max(field.MaxAggregationAlias, mx => mx.Field(field.Name))
.Min(field.MinAggregationAlias, mx => mx.Field(field.Name));
}
3) Query it:
var esResponse = _esClient.Raw.Search<dynamic>(indexName.ToLower(), new PostData<dynamic>(jsonStr), null);
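If you prefer to stay on the typed client instead of the raw endpoint, the same selector can be passed straight to the fluent search; a sketch, assuming a NEST version (2.x/5.x) where SearchDescriptor.Aggregations accepts this selector type:
var typedResponse = _esClient.Search<dynamic>(s => s
    .Index(indexName.ToLower())
    .Size(0)
    .Aggregations(aggregationsSelector));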
Cheers!