It should work, since the ScriptedMetric is a metric aggregation and it does return a single numeric value, but I can't get it to work.
I'm using NEST (5.5.0 via NuGet, targeting Elasticsearch 6.0.0) in C#. I also tried building the same query in Kibana to rule out an issue with NEST, and in Kibana I get exactly the same error.
The error:
buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.elasticsearch.search.aggregations.metrics.scripted.InternalScriptedMetric
My code:
ISearchResponse<LogItem> aggregationResponse = await client.SearchAsync<LogItem>(s => s
    .Size(0)
    .Type("errordoc")
    .Query(...)
    .Aggregations(a => a
        .Terms("Hash", st => st
            .Field(o => o.messageHash.Suffix("keyword"))
            .OrderDescending("Avg-Score")
            .Aggregations(aa => aa
                .Terms("Friendly", ff => ff
                    .Field(oo => oo.friendly.Suffix("keyword"))
                )
                .Max("Max-DateTime", ff => ff
                    .Field(oo => oo.dateTimeStamp)
                )
                .Average("Avg-Score", sc => sc
                    .Script("_score")
                )
                .ScriptedMetric("Urgency-Level", sm => sm
                    .InitScript(i => i.Inline("params._agg.data = []").Lang("painless"))
                    .MapScript(i => i.Inline("params._agg.data.add(doc.urgency.value)").Lang("painless"))
                    .CombineScript(i => i.Inline("int urgency = 0; for (u in params._agg.data) { urgency += u } return urgency").Lang("painless"))
                    .ReduceScript(i => i.Inline("int urgency = 0; for (a in params._aggs) { urgency += a } return urgency").Lang("painless"))
                )
                .BucketScript("finalScore", scb => scb
                    .BucketsPath(bp => bp
                        .Add("maxDateTime", "Max-DateTime")
                        .Add("avgScore", "Avg-Score")
                        .Add("urgencyLevel", "Urgency-Level")
                    )
                    .Script(i => i.Inline("params.avgScore").Lang("painless"))
                )
            )
        )
    )
);
The ScriptedMetric aggregation by itself is returning 11 with my data set.
Am I doing something wrong, or is this not possible? If it's not possible, what would be an alternative?
Also, I know this ScriptedMetric currently does pretty much what a Sum would do, but that's going to change of course...
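For reference, the equivalent plain Sum aggregation (a single-value metric, so it can be referenced directly from buckets_path) would be something like:
.Sum("Urgency-Level", su => su
    .Field(o => o.urgency)
)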
Index mapping:
PUT /live
{
  "mappings": {
    "errordoc": {
      "properties": {
        "urgency": {
          "type": "integer"
        },
        "dateTimeStamp": {
          "type": "date",
          "format": "MM/dd/yyyy hh:mm:ss a"
        }
      }
    }
  }
}
Test data:
POST /live/errordoc
{
  "messageID": "M111111",
  "messageHash": "1463454562\/-1210136287\/-1885530817\/-275007043\/-57589585",
  "friendly": "0",
  "urgency": "1",
  "organisation": "Organisation Name",
  "Environment": "ENV02",
  "Task": "TASK01",
  "Action": "A12",
  "dateTimeStamp": "11\/29\/2017 10:24:21 AM",
  "machineName": "DESKTOP-SMOM9R9",
  "parameters": "{ "
}
Copy this document a couple of times, maybe changing the urgency/dateTimeStamp; as long as the hash stays the same it should reproduce my environment.
Somebody on the Elasticsearch forums replied after I created two GitHub issues. I thought I had tried everything, including this solution, but I must have done something else wrong during that attempt...
The solution
Change:
.Add("urgencyLevel", "Urgency-Level")
to:
.Add("urgencyLevel", "Urgency-Level.value")
Related
When I search with Turkish characters in Elasticsearch, it does not match. For example, when I type "yazilim" I get results, but when I type "Yazılım" I get none. The correct spelling is "Yazılım".
My index code:
var createIndexDescriptor = new CreateIndexDescriptor(INDEX_NAME)
    .Mappings(ms => ms.Map<T>(m => m.AutoMap()
        .Properties(pprops => pprops
            .Text(ps => ps
                .Name("Title")
                .Fielddata(true)
                .Fields(f => f
                    .Keyword(k => k
                        .Name("keyword")))))
    ))
    .Settings(st => st
        .Analysis(an => an
            .Analyzers(anz => anz
                .Custom("tab_delim_analyzer", td => td
                    .Filters("lowercase", "asciifolding")
                    .Tokenizer("standard")
                )
            )
        )
    );
My search query code:
var searchResponse = eClient.Search<GlobalCompany>(s => s
    .Index(INDEX_NAME)
    .From(0)
    .Size(10)
    .Query(q => q
        .MultiMatch(m => m
            .Fields(f => f
                .Field(u => u.Title)
                .Field(u => u.RegisterNumber))
            .Type(TextQueryType.PhrasePrefix)
            .Query(value))));
You are using an asciifolding filter, which converts non-ASCII characters such as "ı" to their ASCII equivalents (see the docs).
You need to configure your Title field as a text field instead of a keyword field and set the analyzer for this field to tab_delim_analyzer.
I don't know how to translate this into the .NET world, but here is what I mean as a plain Kibana Dev Console script:
DELETE deneme
PUT deneme
{
  "settings": {
    "analysis": {
      "analyzer": {
        "tab_delim_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "Title": {
        "type": "text",
        "analyzer": "tab_delim_analyzer"
      }
    }
  }
}
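In NEST, a minimal sketch of the same idea could look like the snippet below. It reuses the questioner's own descriptor and simply applies the custom analyzer to the Title text field; the exact fluent calls may differ slightly depending on your NEST version, so treat it as an assumption rather than a verified translation:
var createIndexDescriptor = new CreateIndexDescriptor(INDEX_NAME)
    .Settings(st => st
        .Analysis(an => an
            .Analyzers(anz => anz
                .Custom("tab_delim_analyzer", td => td
                    .Tokenizer("standard")
                    .Filters("lowercase", "asciifolding")))))
    .Mappings(ms => ms.Map<T>(m => m.AutoMap()
        .Properties(p => p
            .Text(ps => ps
                .Name("Title")
                // Analyze Title with lowercase + asciifolding so "Yazılım" and "yazilim" match.
                .Analyzer("tab_delim_analyzer")
                .Fields(f => f
                    .Keyword(k => k.Name("keyword")))))));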
I started working with the NEST API for Elasticsearch recently and got stuck on the following query. The data.e values would be populated dynamically from the client's input in the HTTP GET;
for example, if the user sends eventA, eventB, eventC then we would add them in the should part:
GET events/_search
{
  "_source": false,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "timestamp": {
              "gte": 1604684158527,
              "lte": 1604684958731
            }
          }
        },
        {
          "nested": {
            "path": "data",
            "query": {
              "bool": {
                "should": [
                  { "match": { "data.e": "eventA" } },
                  { "match": { "data.e": "eventB" } },
                  { "match": { "data.e": "eventC" } }
                ]
              }
            },
            "inner_hits": {}
          }
        }
      ]
    }
  }
}
The following is what I have come up with so far:
var graphDataSearch = _esClient.Search<Events>(s => s
    .Source(src => src
        .Includes(i => i
            .Field("timestamp")
        )
    )
    .Query(q => q
        .Bool(b => b
            .Must(m => m
                .Range(r => r
                    .Field("timestamp")
                    .GreaterThanOrEquals(startTime)
                    .LessThanOrEquals(stopTime)
                ),
                m => m
                .Nested(n => n
                    .Path("data")
                    .Query(q => q
                        .Bool(bo => bo
                            .Should(
                                // what to add here?
                            )
                        )
                    )
                )
            )
        )
    ));
Can someone please help me build the should part dynamically based on the input the user sends?
Thanks.
You can replace the nested query in the above snippet as shown below
// You may modify the parameters of this method as per your needs to reflect user input
// Field can be hardcoded as shown here or can be fetched from Event type as below
// m.Field(f => f.Data.e)
public static QueryContainer Blah(params string[] param)
{
    return new QueryContainerDescriptor<Events>().Bool(
        b => b.Should(
            s => s.Match(m => m.Field("field1").Query(param[0])),
            s => s.Match(m => m.Field("field2").Query(param[1])),
            s => s.Match(m => m.Field("field3").Query(param[2]))));
}
Essentially, we are returning a QueryContainer object that is then passed to the nested query:
.Query(q => Blah(<your parameters>))
The same can be done inline, without a separate method, as in the sketch below; you may choose whichever route you prefer. In general, though, having a method of its own increases readability and keeps things cleaner.
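For illustration, an inline version (a sketch only, assuming the same param array of user input and the same placeholder field names as above) would look roughly like this inside the nested query:
.Nested(n => n
    .Path("data")
    .Query(nq => nq
        .Bool(bo => bo
            .Should(
                // Same match clauses as in Blah, written directly in the descriptor.
                sh => sh.Match(m => m.Field("field1").Query(param[0])),
                sh => sh.Match(m => m.Field("field2").Query(param[1])),
                sh => sh.Match(m => m.Field("field3").Query(param[2]))))))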
You can read more about Match usage here
Edit:
Since you want to add the match queries dynamically, below is a way you can do it:
private static QueryContainer[] InnerBlah(string field, string[] param)
{
    QueryContainer orQuery = null;
    List<QueryContainer> queryContainerList = new List<QueryContainer>();
    foreach (var item in param)
    {
        orQuery = new MatchQuery() { Field = field, Query = item };
        queryContainerList.Add(orQuery);
    }
    return queryContainerList.ToArray();
}
Now, call this method from inside the method above, as shown below:
public static QueryContainer Blah(params string[] param)
{
    return new QueryContainerDescriptor<Events>().Bool(
        b => b.Should(
            InnerBlah("field", param)));
}
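Putting it together, a minimal sketch of the original search with the dynamic should clauses plugged in might look like this (assuming the user's events arrive as a string array and that the nested field is data.e; untested, adapt to your own model):
// Hypothetical user input, e.g. parsed from the HTTP GET query string.
string[] userEvents = { "eventA", "eventB", "eventC" };

var graphDataSearch = _esClient.Search<Events>(s => s
    .Source(src => src
        .Includes(i => i.Field("timestamp")))
    .Query(q => q
        .Bool(b => b
            .Must(
                m => m.Range(r => r
                    .Field("timestamp")
                    .GreaterThanOrEquals(startTime)
                    .LessThanOrEquals(stopTime)),
                m => m.Nested(n => n
                    .Path("data")
                    .Query(nq => nq
                        .Bool(bo => bo
                            // Should clauses built dynamically from the user's events.
                            .Should(InnerBlah("data.e", userEvents))))
                    .InnerHits(ih => ih))))));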
I have about a hundred test documents in my index, built using NBuilder:
[
  {
    "title" : "Title1",
    "text" : "Text1"
  },
  {
    "title" : "Title2",
    "text" : "Text2"
  },
  {
    "title" : "Title3",
    "text" : "Text3"
  }
]
I want to query them with a wildcard to find all items whose "text" starts with "Text". But when I use two different wildcard approaches in NEST I get two different results.
var response = await client.SearchAsync<FakeFile>(s => s
    .Query(q => q
        .QueryString(d => d.Query("text:Text*")))
    .From((page - 1) * pageSize)
    .Size(pageSize));
This returns 100 results, but I'm trying to use the fluent API rather than a query string.
var response = await client.SearchAsync<FakeFile>(s => s
    .Query(q => q
        .Wildcard(c => c
            .Field(f => f.Text)
            .Value("Text*"))));
This returns 0 results. I'm new to Elasticsearch and have tried to make the example as simple as possible so I understand it piece by piece. I don't know why the second query returns nothing. Please help.
Assuming your text field is of type text, during indexing Elasticsearch will store Text1 as text1 in the inverted index. Exactly the same analysis happens to the query string query, but not to the wildcard query.
So .QueryString(d => d.Query("text:Text*")) looks for text*, while .Wildcard(c => c.Field(f => f.Text).Value("Text*")) looks for Text*, but Elasticsearch only stores the first form internally.
Hope that helps.
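You can see this analysis yourself with the _analyze API. A minimal sketch in NEST, assuming the text field uses the default standard analyzer (the exact client call may vary by NEST version):
// Ask Elasticsearch how "Text1" is analyzed; the standard analyzer lowercases it to "text1".
var analyzeResponse = client.Analyze(a => a
    .Analyzer("standard")
    .Text("Text1"));

foreach (var token in analyzeResponse.Tokens)
    Console.WriteLine(token.Token); // prints: text1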
Supposing your mapping looks like this:
{
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "text"
        },
        "text": {
          "type": "text"
        }
      }
    }
  }
}
Try this (Value should be in lowercase):
var response = await client.SearchAsync<FakeFile>(s => s
    .Query(q => q
        .Wildcard(c => c
            .Field(f => f.Text)
            .Value("text*"))));
Or this (I don't know if your document type actually has a Text property):
var response = await client.SearchAsync<FakeFile>(s => s
    .Query(q => q
        .Wildcard(c => c
            .Field("text")
            .Value("text*"))));
Kibana syntax:
GET index/_search
{
  "query": {
    "wildcard": {
      "text": {
        "value": "text*"
      }
    }
  }
}
I have a problem doing a grouping operation in Elasticsearch.
I have 3 fields in my document, as follows:
Id, Type, Year
Now I want to group on Type and Year and count the results in "ResultCount".
I tried this, but it is not working:
.Aggregations(a => a
    .ValueCount("ResultCount", c => c
        .Field(p => p.Id)
        .Field(p => p.Year)
    ))
.Aggregations(a => a
    .Terms("Type", st => st
        .Field(o => o.Type)
        .Size(10))).Size(5)
.Aggregations(aa => aa
    .Max("Year", m => m
        .Field(o => o.Year)
    ))
);
Please suggest a solution for this problem. Thank you.
Here I will try to help. There is no such thing as a group in Elasticsearch, only terms and sub-aggregations. If you want the count of error types by year you can do it like so:
var response = db.Search<Error>(s => s
    .Size(0)
    .Aggregations(aggs1 => aggs1
        .Terms("level1", l1 => l1
            .Field(f => f.Type)
            .Aggregations(aggs2 => aggs2
                .Terms("level2", t => t.Field(f => f.Year))))));

foreach (var l1 in response.Aggs.Terms("level1").Buckets)
{
    foreach (var l2 in l1.Terms("level2").Buckets)
    {
        Console.WriteLine("Type:{0}, Year:{1}, Count:{2}", l1.Key, l2.Key, l2.DocCount);
    }
}
But keep in mind that this will only work as you expect for non-analyzed or keyword fields; if you want terms for dates you can use something like a date_histogram aggregation (see the sketch at the end of this answer).
If your index mapping looks like this:
{
  "properties": {
    "type": {
      "type": "keyword"
    },
    "year": {
      "type": "long"
    }
  }
}
you will get the number of items per error type per year.
You can do other types of aggregations in the nested aggregation as well.
NOTE: The terms aggregation is approximate, so don't be surprised if you don't get exact values.
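As mentioned above, if year were a real date field (say a hypothetical Timestamp property), a date_histogram could replace the inner terms aggregation. A minimal, untested sketch (depending on your NEST version this may be CalendarInterval instead of Interval):
var response = db.Search<Error>(s => s
    .Size(0)
    .Aggregations(aggs => aggs
        .Terms("by_type", t => t
            .Field(f => f.Type)
            .Aggregations(sub => sub
                // One bucket per calendar year of the (hypothetical) Timestamp date field.
                .DateHistogram("by_year", dh => dh
                    .Field(f => f.Timestamp)
                    .Interval(DateInterval.Year))))));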
Using Elasticsearch, I have a field with a suffix: a string field with a .english sub-field that has the english analyzer on it, as shown in the following mapping:
...
"valueString": {
  "type": "string",
  "fields": {
    "english": {
      "type": "string",
      "analyzer": "english"
    }
  }
}
...
The following query snippet won't compile because ValueString has no English property.
...
sh => sh
    .Nested(n => n
        .Path(p => p.ScreenData)
        .Query(nq => nq
            .MultiMatch(mm => mm
                .Query(searchPhrase)
                .OnFields(
                    f => f.ScreenData.First().ValueString,
                    f => f.ScreenData.First().ValueString.english)
                .Type(TextQueryType.BestFields)
            )
        )
    )...
Is there a way to strongly type the suffix at query time in NEST or do I have to use magic strings?
Did you try to use the Suffix extension method?
This is how you can modify your query:
...
.OnFields(
    f => f.ScreenData.First().ValueString,
    f => f.ScreenData.First().ValueString.Suffix("english"))
.Type(TextQueryType.BestFields)
...
Hope it helps.