How to boost a nested object with NEST (C#)


I have mapped my index as shown below. It is a product index with the properties id, number, manufactureNumber, shortDescription and name, plus a nested object, subProduct.
"mappings": {
"Product": {
"properties": {
"id": { "index": "no","store": true,"type": "integer"},
"name": { "store": true,"type": "string"},
"image": { "properties": { "fileName": { "index": "no","store": true,"type": "string"},"virtualPath": { "index": "no","store": true,"type": "string"}}},
"number": { "index": "not_analyzed","store": true,"type": "string"},
"manufactureNumber": { "index": "not_analyzed","store": true,"type": "string"},
"subProduct": { "type": "nested","properties": { "name": { "store": true,"type": "string"},"number": { "index": "not_analyzed", "store": true,"type": "string"},"Id": { "index": "no","store": true,"type": "integer"}}}
}
}
}
What I want is to search for a keyword in name, number, manufactureNumber, shortDescription, subProduct.name and subProduct.number. If the keyword is found in any of these fields, the document should be returned, scored with the following priorities (boosts):
manufactureNumber = 5.0
number = 4.0
name = 2.0
subProduct.number = 2.0
subProduct.name = 1.0
Based on these requirements and my research, I thought a multi_match query with boosting was my only option. Am I correct, or can a query_string query do this as well?
This is what I tried, but I got stuck on the nested object: I don't know how to boost its fields. The code below fails with an error saying that number and name are not properties of subproduct.
var results = Client.Search<Product> (body => body.Query(query => query.MultiMatch(qs =>
qs.OnFieldsWithBoost(d => d.Add(entry => entry.manufactureNumber , 5.0)
.Add(entry => entry.number , 4.0)
.Add(entry => entry.name,3.0)
.Add(entry => entry.subproduct.number,2.0)
.Add(entry => entry.subproduct.name,1.0)
).Type(TextQueryType.BestFields).Query(key))));

You need to use a nested query when querying nested fields; in your case these are subProduct.number and subProduct.name. The query you are probably after looks like this:
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"type": "best_fields",
"query": "key",
"fields": [
"manufactureNumber^5",
"number^4",
"name^3"
]
}
},
{
"nested": {
"query": {
"multi_match": {
"type": "best_fields",
"query": "key",
"fields": [
"subProduct.number^2",
"subProduct.name^1"
]
}
},
"path": "subProduct"
}
}
]
}
}
}
The corresponding NEST query is:
var results = client.Search<Product>(s => s
.Query(q => q
.Bool(b => b
.Should(
sh => sh.MultiMatch(qs => qs
.OnFieldsWithBoost(d => d
.Add("manufactureNumber", 5.0)
.Add("number", 4.0)
.Add("name", 3.0))
.Type(TextQueryType.BestFields)
.Query(key)),
sh => sh.Nested(n => n
.Path("subProduct")
.Query(nq => nq
.MultiMatch(qs => qs
.OnFieldsWithBoost(d => d
.Add("subProduct.number", 2.0)
.Add("subProduct.name", 1.0))
.Type(TextQueryType.BestFields)
.Query(key))))))));
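The fluent calls above (OnFieldsWithBoost in particular) belong to the older NEST API used in this answer and may not exist under the same names in later client versions. As a version-independent cross-check, here is a minimal sketch that builds the same JSON body with System.Text.Json only. The class name NestedBoostQuery and the method Build are hypothetical names introduced for illustration; the field names and boosts are copied from the JSON query above.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper (not part of NEST): builds the bool/should body shown above.
public static class NestedBoostQuery
{
    public static string Build(string key)
    {
        var body = new
        {
            query = new
            {
                // "bool" is a C# keyword, so the property needs the @ prefix;
                // it is still serialized as "bool".
                @bool = new
                {
                    should = new object[]
                    {
                        // Top-level fields: an ordinary multi_match with per-field boosts.
                        new
                        {
                            multi_match = new
                            {
                                type = "best_fields",
                                query = key,
                                fields = new[] { "manufactureNumber^5", "number^4", "name^3" }
                            }
                        },
                        // Nested fields: the multi_match must be wrapped in a nested query
                        // whose path points at the nested object.
                        new
                        {
                            nested = new
                            {
                                path = "subProduct",
                                query = new
                                {
                                    multi_match = new
                                    {
                                        type = "best_fields",
                                        query = key,
                                        fields = new[] { "subProduct.number^2", "subProduct.name^1" }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        };

        return JsonSerializer.Serialize(body, new JsonSerializerOptions { WriteIndented = true });
    }
}
```

Calling NestedBoostQuery.Build("some keyword") returns a string that can be compared with the request NEST actually sends, or posted to the index's _search endpoint directly.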

Related

Applying Term, Filter and Aggregation in one Es Query using Nest

I'm using Elasticsearch's NEST 7.17.x and trying to apply a term restriction, a filter and an aggregation in a single query.
The term is a document id restriction that should constrain all following filters and aggregations. The filter restricts the price to a specific range. The aggregation builds price buckets.
In the example below, it looks like neither the price filter nor the term restriction is being applied: I'm getting back documents with ids outside the restriction and with prices above the filter's upper bound.
var orderChangedArgFilter = client.Search<Product>(s => s
.Query(q => +q.Terms(p => p.Field("id").Terms(productIds)))
.Aggregations(aggs => aggs
.Filter("user filter with aggs", f => f
.Filter(q => q.Range(rf => rf.Field("price").GreaterThanOrEquals(0.01).LessThan(50.0)))
.Aggregations(childAggs => childAggs
.Range("0 to 50 price agg", r => r.Field("price").Ranges(rs => rs.From(0.0).To(50.0)))
.Range("50 to 100 price agg", r => r.Field("price").Ranges(rs => rs.From(50.0).To(100.0)))
.Range("100 to 150 price agg", r => r.Field("price").Ranges(rs => rs.From(100.0).To(150.0)))
)
)
)
);
How do I correct the query to have the Term restriction apply first and then the filter on top of it?
Edit 1:
It looks like the document id term restriction is working as expected, but the filter is not.
My engine was created via App Search. The datatype of the price field in the underlying Elasticsearch index is actually text instead of number, even though I specified number on the App Search engine. This seems to be the cause of the problem.
The raw index mappings:
{
".ent-search-engine-documents-luke-test": {
"mappings": {
"dynamic": "true",
"properties": {
"price": {
"fields": {
"prefix": {
"search_analyzer": "q_prefix",
"type": "text",
"analyzer": "i_prefix",
"index_options": "docs"
},
"enum": {
"ignore_above": 2048,
"type": "keyword"
},
"float": {
"ignore_malformed": true,
"type": "double"
},
"joined": {
"search_analyzer": "q_text_bigram",
"type": "text",
"analyzer": "i_text_bigram",
"index_options": "freqs"
},
"stem": {
"type": "text",
"analyzer": "iq_text_stem"
},
"delimiter": {
"type": "text",
"analyzer": "iq_text_delimiter",
"index_options": "freqs"
},
"location": {
"ignore_malformed": true,
"type": "geo_point",
"ignore_z_value": false
},
"date": {
"ignore_malformed": true,
"type": "date",
"format": "strict_date_time||strict_date"
}
},
"type": "text",
"analyzer": "iq_text_base",
"index_options": "freqs"
},
"id": {
"type": "keyword"
},
"title": {
"fields": {
"prefix": {
"search_analyzer": "q_prefix",
"type": "text",
"analyzer": "i_prefix",
"index_options": "docs"
},
"enum": {
"ignore_above": 2048,
"type": "keyword"
},
"float": {
"ignore_malformed": true,
"type": "double"
},
"joined": {
"search_analyzer": "q_text_bigram",
"type": "text",
"analyzer": "i_text_bigram",
"index_options": "freqs"
},
"stem": {
"type": "text",
"analyzer": "iq_text_stem"
},
"delimiter": {
"type": "text",
"analyzer": "iq_text_delimiter",
"index_options": "freqs"
},
"location": {
"ignore_malformed": true,
"type": "geo_point",
"ignore_z_value": false
},
"date": {
"ignore_malformed": true,
"type": "date",
"format": "strict_date_time||strict_date"
}
},
"type": "text",
"analyzer": "iq_text_base",
"index_options": "freqs"
}
},
"dynamic_templates": [
{
"permissions": {
"mapping": {
"type": "keyword"
},
"match": "_*_permissions"
}
},
{
"thumbnails": {
"mapping": {
"type": "binary"
},
"match": "_thumbnail_*"
}
},
{
"data": {
"match_mapping_type": "*",
"mapping": {
"fields": {
"enum": {
"ignore_above": 2048,
"type": "keyword"
},
"float": {
"ignore_malformed": true,
"type": "double"
},
"delimiter": {
"type": "text",
"index_options": "freqs",
"analyzer": "iq_text_delimiter"
},
"joined": {
"search_analyzer": "q_text_bigram",
"type": "text",
"index_options": "freqs",
"analyzer": "i_text_bigram"
},
"prefix": {
"search_analyzer": "q_prefix",
"type": "text",
"index_options": "docs",
"analyzer": "i_prefix"
},
"location": {
"ignore_malformed": true,
"type": "geo_point",
"ignore_z_value": false
},
"date": {
"ignore_malformed": true,
"type": "date",
"format": "strict_date_time||strict_date"
},
"stem": {
"type": "text",
"analyzer": "iq_text_stem"
}
},
"type": "text",
"index_options": "freqs",
"analyzer": "iq_text_base"
}
}
}
]
}
}
}
Is it still possible to use the NEST client against an engine created via App Search?

My query had a couple of issues. First, I should have been using the field name price.float, which is the numeric representation of the price field in Elasticsearch (as opposed to App Search). You can figure this out from the Explain endpoint offered by App Search.
The next issue was the structure of my query: I shouldn't have been using a filter aggregation. My intent was to apply the filter to the query result, not specifically to the aggregation.
Lastly, retrieving the results from the aggregations was not that straightforward. Dumping the contents to the console misleadingly shows an empty aggregation; however, after inspecting the debugging info enabled via .EnableDebugMode(), I could see that I was indeed receiving results. Setting a breakpoint and inspecting the response object allowed me to find the object I wanted.
var orderChangedArgFilter = client.Search<Product>(s => s
.Query(q => +q.Range(rf => rf.Field("price.float").GreaterThanOrEquals(0.01).LessThan(50.0)) && +q.Terms(p => p.Field("id").Terms(productIds)))
.Aggregations(a => a
.Range("Price aggs", r => r
.Field("price.float")
.Ranges(rs =>
rs.From(0.01).To(50.0).Key("0 to 50 price agg"),
rs => rs.From(50.01).To(100).Key("50 to 100 price agg"),
rs => rs.From(100.01).Key("From 100 price agg")))
)
);
orderChangedArgFilter.Aggregations.Dump();
// vs
var agg = (Nest.BucketAggregate)orderChangedArgFilter.Aggregations.GetValueOrDefault("Price aggs");
var buckets = agg.Items.Select(b => (RangeBucket)b).ToList();
Console.WriteLine(buckets[0].Key + ", " + buckets[0].DocCount);
Note that the last line prints the Key and DocCount of the first bucket.
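One detail worth checking in the ranges above: in an Elasticsearch range aggregation each bucket includes its from value and excludes its to value. With boundaries such as To(50.0) followed by From(50.01), a price of exactly 50.00 lands in no bucket. The following is a minimal in-memory sketch of that rule, not NEST code; RangeBuckets, BucketFor, AnswerRanges and ContiguousRanges are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical local model of range-aggregation bucketing (from inclusive, to exclusive).
public static class RangeBuckets
{
    // The boundaries used in the query above.
    public static readonly (string Key, double From, double To)[] AnswerRanges =
    {
        ("0 to 50 price agg", 0.01, 50.0),
        ("50 to 100 price agg", 50.01, 100.0),
        ("From 100 price agg", 100.01, double.PositiveInfinity),
    };

    // Contiguous alternative: each bucket starts where the previous one ends.
    public static readonly (string Key, double From, double To)[] ContiguousRanges =
    {
        ("0 to 50 price agg", 0.01, 50.0),
        ("50 to 100 price agg", 50.0, 100.0),
        ("From 100 price agg", 100.0, double.PositiveInfinity),
    };

    // Returns the key of the first bucket containing value, or null if none does.
    public static string BucketFor(double value, IEnumerable<(string Key, double From, double To)> ranges)
    {
        foreach (var r in ranges)
        {
            if (value >= r.From && value < r.To)
            {
                return r.Key;
            }
        }
        return null;
    }
}
```

With these definitions, RangeBuckets.BucketFor(50.0, RangeBuckets.AnswerRanges) returns null, while the same call with ContiguousRanges returns "50 to 100 price agg". Whether the gap matters depends on whether such prices can occur in the data.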

Flatten Json & Ignore array index Using ChoETL

I have one JSON file and the following code:
using (var r = new ChoJSONReader("data.json")
.Configure(c => c.ThrowAndStopOnMissingField = true)
.Configure(c => c.DefaultArrayHandling = true)
.Configure(c => c.FlattenNode = true)
.Configure(c => c.IgnoreArrayIndex = false)
.Configure(c => c.NestedKeySeparator = '.')
.Configure(c => c.NestedColumnSeparator = '.')
)
{
var dt = r.AsDataTable();
Console.WriteLine(dt.DumpAsJson());
}
My data.json file:
{
"BrandId": "998877665544332211",
"Categories": [
"112233445566778899"
],
"Contact": {
"Phone": [
{
"Value": "12346789",
"Description": {
"vi": "Phone"
},
"Type": 1
},
{
"Value": "987654321",
"Description": {
"vi": "Phone"
},
"Type": 1
}
]
}
}
After running this code, I get output like this:
[
{
"BrandId": "998877665544332211",
"Contact.Phone.0.Value": "12346789",
"Contact.Phone.0.Description.vi": "Phone",
"Contact.Phone.0.Type": 1,
"Contact.Phone.1.Value": "987654321",
"Contact.Phone.1.Description.vi": "Phone",
"Contact.Phone.1.Type": 1,
"Category0": "112233445566778899"
}
]
The question is: how can I get JSON output without the array index ("0", "1") in the flattened keys, with one row per array element?
Expected output:
[
{
"BrandId": "998877665544332211",
"Contact.Phone.Value": "12346789",
"Contact.Phone.Description.vi": "Phone",
"Contact.Phone.Type": 1,
"Category": "112233445566778899"
},
{
"BrandId": "998877665544332211",
"Contact.Phone.Value": "987654321",
"Contact.Phone.Description.vi": "Phone",
"Contact.Phone.Type": 1,
"Category": "112233445566778899"
}
]
I've tried many approaches, but none of them produces the expected result.
Thanks for any help.

As your JSON is nested, you need to unpack it and flatten it into multiple simple rows using ChoETL and LINQ, as below.
ChoETLSettings.KeySeparator = '-';
using (var r = ChoJSONReader.LoadText(json)
.WithField("BrandId")
.WithField("Category", jsonPath: "Categories[0]", isArray: false)
.WithField("Phone", jsonPath: "Contact.Phone[*]")
)
{
var dt = r.SelectMany(rec => ((IList)rec.Phone).OfType<dynamic>().Select(rec1 =>
{
dynamic ret = new ChoDynamicObject();
ret["BrandId"] = rec.BrandId;
ret["Contact.Phone.Value"] = rec1.Value;
ret["Contact.Phone.Description.vi"] = rec1.Description.vi;
ret["Contact.Phone.Type"] = rec1.Type;
ret["Category"] = rec.Category;
return ret;
})).AsDataTable();
dt.DumpAsJson().Print();
}
Sample fiddle: https://dotnetfiddle.net/PHK8LO
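For comparison, the same "one row per phone entry" reshaping can be sketched without ChoETL, using only System.Text.Json. This is not the library's approach, just a dependency-free illustration; PhoneFlattener and Flatten are hypothetical names, and the sketch assumes the exact shape of data.json shown in the question (a single category and a Contact.Phone array).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: emits one flat row per element of Contact.Phone,
// repeating the parent values and dropping the array index from the keys.
public static class PhoneFlattener
{
    public static List<Dictionary<string, object>> Flatten(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;

        // Parent-level values copied into every row.
        var brandId = root.GetProperty("BrandId").GetString();
        var category = root.GetProperty("Categories")[0].GetString();

        // ToList() materializes the rows before the JsonDocument is disposed.
        return root.GetProperty("Contact").GetProperty("Phone")
            .EnumerateArray()
            .Select(phone => new Dictionary<string, object>
            {
                ["BrandId"] = brandId,
                ["Contact.Phone.Value"] = phone.GetProperty("Value").GetString(),
                ["Contact.Phone.Description.vi"] = phone.GetProperty("Description").GetProperty("vi").GetString(),
                ["Contact.Phone.Type"] = phone.GetProperty("Type").GetInt32(),
                ["Category"] = category
            })
            .ToList();
    }
}
```

Serializing the returned list with JsonSerializer.Serialize gives an array with one object per phone entry, in the shape of the expected output in the question.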

Querying a subfield in documentdb

For example I have a document below for collection = delivery:
{
"doc": [
{
"docid": "15",
"deliverynum": "123",
"text": "txxxxxx",
"date": "2019-07-18T12:37:58Z"
},
{
"docid": "16",
"deliverynum": "456",
"text": "txxxxxx",
"date": "2019-07-18T12:37:58Z"
},
{
"docid": "17",
"deliverynum": "999",
"text": "txxxxxx",
"date": "2019-07-18T12:37:58Z"
}
],
"id": "123",
"cancelled": false
}
Is it possible to search for "deliverynum" = 999 and get output like the one below?
{
"doc": [
{
"docid": "17",
"deliverynum": "999",
"text": "txxxxxx",
"date": "2019-07-18T12:37:58Z"
}
],
"id": "123",
"cancelled": false
}
Or should I create a separate collection just for the doc part?
I am having trouble writing a query in C# for this kind of scenario.

In the Mongo shell you can use the $ (projection) operator:
db.collection.find({ "doc.deliverynum": "999" }, { "doc.$": 1 })
The corresponding C# code can look like this:
var q = Builders<Model>.Filter.ElemMatch(x => x.doc, d => d.deliverynum == "999");
var p = Builders<Model>.Projection.ElemMatch(x => x.doc, d => d.deliverynum == "999");
var data = Col.Find(q).Project(p).ToList();
You can also use q = Builders<Model>.Filter.Empty if you want to get all documents, even those that don't contain deliverynum = "999".
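Keep in mind that both the positional $ projection and the $elemMatch projection return only the first matching array element. If several entries in doc can share a deliverynum and all of them are needed, one option is to filter the array on the client after loading the documents. The following is a minimal in-memory sketch of that idea using plain LINQ; the records Doc and Delivery are hypothetical stand-ins for the Model class, which the answer does not show, and DeliveryFilter is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the document shape shown in the question.
public record Doc(string docid, string deliverynum, string text, string date);
public record Delivery(string id, bool cancelled, List<Doc> doc);

public static class DeliveryFilter
{
    // Keeps only deliveries that contain the delivery number, and trims their
    // doc list down to every matching entry (not just the first one).
    public static List<Delivery> WithDeliveryNum(IEnumerable<Delivery> source, string deliveryNum) =>
        source
            .Where(d => d.doc.Any(x => x.deliverynum == deliveryNum))
            .Select(d => d with { doc = d.doc.Where(x => x.deliverynum == deliveryNum).ToList() })
            .ToList();
}
```

This trades extra data transfer for simplicity, so it suits small documents; for large arrays a server-side aggregation would usually be the better fit.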

Group by inner list data

I am having trouble writing a query that groups by data in an inner list in order to restructure the outer list.
I have a collection structured like this:
"products"
{
"id": "97",
"name": "YI1",
"projects": [
{
"id": "92",
"name": "MUM",
"branches": [
{
"id": "62",
"name": "ON Service",
"geographyid": "84",
"geographyname": "North America",
"countryid": "52",
"countryname": "Canada"
}
],
"customers": [
{
"id": "80",
"name": "HEALTH SCIENCES"
}
]
}
],
},
"products"
{
"id": "96",
"name": "YI2",
"projects": [
{
"id": "94",
"name": "HHS",
"branches": [
{
"id": "64",
"name": "Hamilton ON Service",
"geographyid": "44",
"geographyname": "Asia",
"countryid": "58",
"countryname": "China"
}
],
"customers": [
{
"id": "40",
"name": "SCIENCES"
}
]
}
],
]
}
I am trying to build a new collection that returns output like the following:
"Geography"{
"geographyid": "44",
"geographyname": "Asia",
"Country"
{
"countryid": "58",
"countryname": "China",
"branches"
{
"id": "94",
"name": "HHS
"customers"
{
"id": "40",
"name": "SCIENCES"
"projects"
{
"id": "94",
"name": "HHS",
"products"
{
"id": "96",
"name": "YI2",
}
}
},
}
}
},
"Geography"{
"geographyid": "84",
"geographyname": "North America"
"Country"
{
"countryid": "52",
"countryname": "Canada"
"branches"
{
"id": "62",
"name": "ON Service",
"customers"
{
"id": "80",
"name": "HEALTH SCIENCES"
"projects"
{
"id": "92",
"name": "MUM",
"products"
{
"id": "97",
"name": "YI1",
}
}
},
}
}
}
I tried multiple options and also wrote the query below, but I am still not getting the required result.
var treeGroup = siteList.SelectMany(a => a.projects.Select(b => new { A = a, B = b }).ToList()).ToList()
.GroupBy(ol => new { ol.B.geographyid, ol.B.geographyname })
.Select(gGroup => new TreeNodes
{
id = gGroup.Key.geographyid,
name = gGroup.Key.geographyname,
type = Costants.geographyTreeNode,
parentid = string.Empty,
children = gGroup
.GroupBy(ol => new { ol.B.countryid, ol.B.countryname })
.Select(cGroup => new TreeNodes
{
id = cGroup.Key.countryid,
name = cGroup.Key.countryname,
type = Costants.countryTreeNode,
parentid = gGroup.Key.geographyid,
children = cGroup
.GroupBy(ol => new { ol.B.id, ol.B.name })
.Select(sGroup => new TreeNodes
{
id = sGroup.Key.id,
name = sGroup.Key.name,
type = Costants.branchTreeNode,
parentid = cGroup.Key.countryid,
children = sGroup
.Select(ol => new TreeNodes { id = ol.A.id, name = ol.A.name, type = Costants.siteTreeNode, parentid = sGroup.Key.id, children = new List<TreeNodes>() })
.ToList()
})
.ToList()
})
.ToList()
})
.ToList();
I can use loops to get the result, but I want to avoid that and use LINQ or lambda expressions instead.

I was able to resolve the issue. I added an additional parameter for customers and then used it together with SelectMany on branches.
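The answer does not include its code, so the following is only a sketch of the general idea under stated assumptions: flatten products, projects and branches into one row per branch with SelectMany, then group the rows by geography, country and branch. The geography and country fields live on the branch, which is why grouping directly on the project (as in the question's query) cannot work. All type names here (Branch, Customer, Project, Product, TreeNode, GeographyTree) and the node type strings are hypothetical, and the customer and project levels of the expected output are omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model mirroring the JSON structure in the question.
public record Branch(string Id, string Name, string GeographyId, string GeographyName, string CountryId, string CountryName);
public record Customer(string Id, string Name);
public record Project(string Id, string Name, List<Branch> Branches, List<Customer> Customers);
public record Product(string Id, string Name, List<Project> Projects);
public record TreeNode(string Id, string Name, string Type, string ParentId, List<TreeNode> Children);

public static class GeographyTree
{
    public static List<TreeNode> Build(IEnumerable<Product> products)
    {
        // One flat row per (product, project, branch) combination.
        var rows = products
            .SelectMany(p => p.Projects, (p, pr) => new { Product = p, Project = pr })
            .SelectMany(x => x.Project.Branches, (x, b) => new { x.Product, x.Project, Branch = b })
            .ToList();

        // Anonymous types compare by value, so they work as composite group keys.
        return rows
            .GroupBy(r => new { r.Branch.GeographyId, r.Branch.GeographyName })
            .Select(g => new TreeNode(g.Key.GeographyId, g.Key.GeographyName, "geography", string.Empty,
                g.GroupBy(r => new { r.Branch.CountryId, r.Branch.CountryName })
                 .Select(c => new TreeNode(c.Key.CountryId, c.Key.CountryName, "country", g.Key.GeographyId,
                    c.GroupBy(r => new { r.Branch.Id, r.Branch.Name })
                     .Select(b => new TreeNode(b.Key.Id, b.Key.Name, "branch", c.Key.CountryId,
                        b.Select(r => new TreeNode(r.Product.Id, r.Product.Name, "product", b.Key.Id, new List<TreeNode>()))
                         .ToList()))
                     .ToList()))
                 .ToList()))
            .ToList();
    }
}
```

Adding the customer level would follow the same pattern: one more SelectMany over x.Project.Customers when building the rows, and one more GroupBy between the branch and product levels.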

SuggestCompletion Nest usage

I'm trying to run a SuggestCompletion query for a location (countries and cities), and I'd like the query to cover both of those fields.
My mapping so far is the following:
var response = _client.CreateIndex(PlatformConfiguration.LocationIndexName,
descriptor => descriptor.AddMapping<LocationInfo>(
m => m.Properties(
p => p.Completion(s => s
.Name(n=>n.CountryName)
.IndexAnalyzer("simple")
.SearchAnalyzer("simple")
.MaxInputLength(50)
.Payloads()
.PreserveSeparators()
.PreservePositionIncrements()).
Completion(s=>s.Name(n => n.City)
.IndexAnalyzer("simple")
.SearchAnalyzer("simple")
.MaxInputLength(50)
.Payloads()
.PreserveSeparators()
.PreservePositionIncrements())
)));
Edit:
How I'm indexing the elements:
public bool IndexLocations(IList<LocationInfo> locations)
{
var bulkParams = locations.Select(p => new BulkParameters<LocationInfo>(p){
Id = p.Id,
Timestamp = DateTime.Now.ToTimeStamp()
});
var response = _client.IndexMany(bulkParams, PlatformConfiguration.LocationIndexName);
return response.IsValid;
}
Edit:
After viewing the mappings, I changed my query to the following:
var response = _client.Search<LocationInfo>(location =>
location.Index(PlatformConfiguration.LocationIndexName).
SuggestCompletion("locationinfo", f => f.OnField("countryName").Text(text).Size(1)));
I also tried:
var response = _client.Search<LocationInfo>(location =>
location.Index(PlatformConfiguration.LocationIndexName).
SuggestCompletion("countryName", f => f.OnField("countryName").Text(text).Size(1)));
But I still get an empty result.
The mapping:
{
"locationindex": {
"mappings": {
"locationinfo": {
"properties": {
"countryName": {
"type": "completion",
"analyzer": "simple",
"payloads": true,
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
}
}
},
"bulkparameters`1": {
"properties": {
"document": {
"properties": {
"city": {
"type": "string"
},
"countryName": {
"type": "string"
},
"countryTwoDigitCode": {
"type": "string"
},
"id": {
"type": "string"
},
"latitude": {
"type": "string"
},
"longitude": {
"type": "string"
}
}
},
"id": {
"type": "string"
},
"timestamp": {
"type": "long"
},
"versionType": {
"type": "long"
}
}
}
}
}
}

Support for IndexMany() with wrapped BulkParameters was removed in NEST 1.0.0 beta 1.
If you want to run a bulk request with more advanced parameters, you now have to use the Bulk() command.
Sadly, the beta still shipped with the BulkParameters class in the assembly.
It has since been removed in the develop branch.
So what happens now is that you are actually indexing documents of type "bulkparameters`1" rather than "locationinfo", so the mapping specified for "locationinfo" never comes into play.
See here for an example of how to use Bulk() to index many objects at once while configuring advanced parameters for individual items.
