When I use Kibana to execute the following search request against Elasticsearch
GET _search
{
"query": {
"query_string": {
"query": "PDB_W2237.docx",
"default_operator": "AND"
}
}
}
it returns:
{
"took": 14,
"timed_out": false,
"_shards": {
"total": 15,
"successful": 15,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 6.3527603,
"hits": [
{
"_index": "proconact",
"_type": "proconact",
"_id": "68cecf2c-7e5a-11e5-80fa-000c29bd9450",
"_score": 6.3527603,
"_source": {
"Id": "68cecf2c-7e5a-11e5-80fa-000c29bd9450",
"ActivityId": "1bad9115-7e5a-11e5-80fa-000c29bd9450",
"ProjectId": "08938a1d-2429-11e5-80f9-000c29bd9450",
"Filename": "PDB_W2237.docx"
}
}
]
}
}
When I use the NEST ElasticClient like
var client = new ElasticClient();
var searchResponse = client.Search<Hit>(new SearchRequest {
Query = new QueryStringQuery {
Query = "DB_W2237.docx",
DefaultOperator = Operator.And
}
});
it returns 0 hits.
Here is the index mapping for the 4 fields in the hit:
{
"proconact": {
"mappings": {
"proconact": {
"properties": {
"ActivityId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Filename": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ProjectId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
Are the two search requests not the same?
The problem is that your query term doesn't match any token that is actually present in your index.
In your kibana query:
GET _search
{
"query": {
"query_string": {
"query": "PDB_W2237.docx",
"default_operator": "AND"
}
}
}
You're querying PDB_W2237.docx, but in your NEST code you're querying DB_W2237.docx.
If you want to query DB_W2237.docx and expect results, then you may have to change the analyzer from the standard analyzer (which is applied by default) to something else; a possible candidate depends on your use case.
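Alternatively, since the mapping above already defines a keyword sub-field for Filename, an exact term query against it sidesteps analysis entirely. A sketch (field names taken from the mapping shown in the question):

```
GET _search
{
  "query": {
    "term": {
      "Filename.keyword": "PDB_W2237.docx"
    }
  }
}
```

A term query does not analyze its input, so it matches the stored keyword value byte-for-byte.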
My C# Web API has successfully received a request from my Actions on Google Dialogflow agent, but I am having issues understanding what the response format should be.
{
"responseId": "96ee6c11-8f73-409f-8dac-8b6141d08483",
"queryResult": {
"queryText": "History",
"action": "tell.fact",
"parameters": {
"category": "history"
},
"allRequiredParamsPresent": true,
"fulfillmentMessages": [
{
"text": {
"text": [
""
]
}
}
],
"outputContexts": [
{
"name": "projects/project--6162817918903295576/agent/sessions/1530877719318/contexts/google_assistant_input_type_touch",
"parameters": {
"category.original": "History",
"category": "history"
}
},
{
"name": "projects/project--6162817918903295576/agent/sessions/1530877719318/contexts/actions_capability_screen_output",
"parameters": {
"category.original": "History",
"category": "history"
}
},
{
"name": "projects/project--6162817918903295576/agent/sessions/1530877719318/contexts/choose_fact-followup",
"lifespanCount": 2,
"parameters": {
"category.original": "History",
"category": "history"
}
},
{
"name": "projects/project--6162817918903295576/agent/sessions/1530877719318/contexts/actions_capability_audio_output",
"parameters": {
"category.original": "History",
"category": "history"
}
},
{
"name": "projects/project--6162817918903295576/agent/sessions/1530877719318/contexts/actions_capability_media_response_audio",
"parameters": {
"category.original": "History",
"category": "history"
}
},
{
"name": "projects/project--6162817918903295576/agent/sessions/1530877719318/contexts/actions_capability_web_browser",
"parameters": {
"category.original": "History",
"category": "history"
}
}
],
"intent": {
"name": "projects/project--6162817918903295576/agent/intents/4a35cf33-e446-4b2b-a284-c70bc4dfce17",
"displayName": "choose_fact"
},
"intentDetectionConfidence": 1,
"languageCode": "en-us"
},
"originalDetectIntentRequest": {
"source": "google",
"version": "2",
"payload": {
"isInSandbox": true,
"surface": {
"capabilities": [
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.MEDIA_RESPONSE_AUDIO"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
},
"requestType": "SIMULATOR",
"inputs": [
{
"rawInputs": [
{
"query": "History",
"inputType": "TOUCH"
}
],
"arguments": [
{
"rawText": "History",
"textValue": "History",
"name": "text"
}
],
"intent": "actions.intent.TEXT"
}
],
"user": {
"lastSeen": "2018-07-06T11:44:24Z",
"locale": "en-US",
"userId": "AETml1TDDPgKmK2GqQ9ugHJc5hQM"
},
"conversation": {
"conversationId": "1530877719318",
"type": "ACTIVE",
"conversationToken": "[]"
},
"availableSurfaces": [
{
"capabilities": [
{
"name": "actions.capability.AUDIO_OUTPUT"
},
{
"name": "actions.capability.SCREEN_OUTPUT"
},
{
"name": "actions.capability.WEB_BROWSER"
}
]
}
]
}
},
"session": "projects/project--6162817918903295576/agent/sessions/1530877719318"
}
Attempt 1
The webhook documentation states that my response should look like this:
{
  "fulfillmentText": "Hello from C# v2",
  "fulfillmentMessages": [
    {
      "card": {
        "title": "card title",
        "subtitle": "sub title",
        "imageUri": "https://assistant.google.com/static/images/molecule/Molecule-Formation-stop.png",
        "buttons": [
          {
            "text": "button text",
            "postback": "https://assistant.google.com/"
          }
        ]
      }
    }
  ],
  "source": "example.com",
  "payload": {
    "google": {
      "expectUserResponse": true,
      "richResponse": {
        "items": [
          {
            "simpleResponse": {
              "textToSpeech": "This is a simple response"
            }
          }
        ]
      }
    },
    "facebook": {
      "text": "Hello facebook"
    },
    "slack": {
      "text": "Hello facebook"
    }
  },
  "outputContexts": [
    {
      "name": "projects/project--6162817918903295576/agent/sessions/2a210c67-4355-d565-de81-4d3ee7439e67",
      "lifespanCount": 5,
      "parameters": {
        "param": "parm value"
      }
    }
  ],
  "followupEventInput": {
    "name": "event name",
    "languageCode": "en-Us",
    "parameters": {
      "param": "parm value"
    }
  }
}
This results in the following error
MalformedResponse 'final_response' must be set.
Failed to parse Dialogflow response into AppResponse because of empty speech response
Attempt 2
So I went on to try a simple response:
{
"payload": {
"google": {
"expectUserResponse": true,
"richResponse": {
"items": [
{
"simpleResponse": {
"textToSpeech": "this is a simple response"
}
}
]
}
}
}
}
Which also results in:
MalformedResponse 'final_response' must be set.
Failed to parse Dialogflow response into AppResponse because of empty speech response
I can verify that my requests are being sent as application/json
Does anyone know the proper response format?
The root problem is that ASP.NET Core by default uses transfer-encoding: chunked for an ActionResult, and for some reason Dialogflow does not support parsing a chunked response.
The following response is accepted by Actions:
[HttpPost]
public ContentResult Post([FromBody] FulfillmentRequest data)
{
    var response = new FulfillmentResponse
    {
        fulfillmentText = "Hello from C# v2",
    };
    // Content() writes the body with a Content-Length header rather than
    // transfer-encoding: chunked, which Dialogflow can parse.
    return Content(JsonConvert.SerializeObject(response), "application/json");
}
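The FulfillmentRequest and FulfillmentResponse types are not shown in the answer; a minimal hypothetical sketch (property names assumed from the JSON payloads above, everything else omitted) could be:

```
// Hypothetical minimal DTOs -- only the fields used in this answer.
// Real Dialogflow webhook payloads contain many more properties.
public class FulfillmentRequest
{
    public string responseId { get; set; }
    public string session { get; set; }
}

public class FulfillmentResponse
{
    public string fulfillmentText { get; set; }
}
```

JsonConvert serializes the lowercase property names as-is, which matches the casing Dialogflow expects.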
Here is the JSON response format: https://github.com/dialogflow/fulfillment-webhook-json/blob/master/responses/v2/response.json
I'm trying to use NEST to create an index with raw JSON, and it produces different results than when I use that same JSON string interactively against Elasticsearch. Is this a bug or am I using it incorrectly?
Posting directly to Elasticsearch with the following command, I get exactly the mappings I want for my index (results shown below):
POST /entities
{
"mappings": {
"sis_item" :
{
"properties":
{
"FullPath":
{
"type": "string",
"index":"not_analyzed"
},
"Ident":
{
"type": "nested",
"properties":
{
"ObjectGuid":
{
"type": "string",
"index":"not_analyzed"
}
}
}
}
}
}
}
Result when I check the index with GET /entities (which is correct):
{
"entities": {
"aliases": {},
"mappings": {
"sis_item": {
"properties": {
"FullPath": {
"type": "string",
"index": "not_analyzed"
},
"Ident": {
"type": "nested",
"properties": {
"ObjectGuid": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
},
"settings": {
"index": {
"creation_date": "1453828294488",
"number_of_shards": "5",
"number_of_replicas": "1",
"version": {
"created": "1070499"
},
"uuid": "6_j4vRcqRwiTQw0E6bQokQ"
}
},
"warmers": {}
}
}
However, I have to do this from code, and using the following code the mappings I specify end up inside the settings instead of the mappings, as shown below.
var createIndexJson = @"
{
""mappings"": {
""sis_item"" :
{
""properties"":
{
""FullPath"":
{
""type"": ""string"",
""index"":""not_analyzed""
},
""Ident"":
{
""type"": ""nested"",
""properties"":
{
""ObjectGuid"":
{
""type"": ""string"",
""index"":""not_analyzed""
}
}
}
}
}
}
}";
var response = _client.Raw.IndicesCreatePost("entities_from_code", createIndexJson);
if (!response.Success || response.HttpStatusCode != 200)
{
throw new ElasticsearchServerException(response.ServerError);
}
Result (not correct, notice the mappings are nested inside the settings):
{
"entities_from_code": {
"aliases": {},
"mappings": {},
"settings": {
"index": {
"mappings": {
"sis_item": {
"properties": {
"FullPath": {
"type": "string",
"index": "not_analyzed"
},
"Ident": {
"type": "nested",
"properties": {
"ObjectGuid": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
},
"creation_date": "1453828283162",
"number_of_shards": "5",
"number_of_replicas": "1",
"version": {
"created": "1070499"
},
"uuid": "fdmPqahGToCJw_mIbq0kNw"
}
},
"warmers": {}
}
}
There was a newline at the very top of the JSON string which caused the odd result; removing it gave me the expected behaviour.
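A defensive variant (same identifiers as the question's code, assuming the NEST 1.x raw client shown above) is to strip the leading whitespace before sending the body:

```
// The verbatim string literal begins with a newline, which made the raw
// index-creation call misplace the mappings under settings; trim it first.
var response = _client.Raw.IndicesCreatePost("entities_from_code", createIndexJson.TrimStart());
```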
My mapping model:
// TypeLog: Error, Info, Warn
{
"onef-sora": {
"mappings": {
"Log": {
"properties": {
"application": {
"type": "string",
"index": "not_analyzed"
},
"typeLog": {
"type": "string"
}
}
}
}
}
}
My query:
{
"size": 0,
"aggs": {
"application": {
"terms": {
"field": "application",
"order" : { "_count" : "desc"},
"size": 5
},
"aggs": {
"typelogs": {
"terms": {
"field": "typeLog",
"order" : { "_term" : "asc"}
}
}
}
}
}
}
I want to get the top 5 applications with the most errors, but the terms aggregation order only supports three keys: _count, _term, _key. How do I order by the typeLog doc_count in my query? Thanks!
Result I want:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 10000,
"max_score": 0,
"hits": []
},
"aggregations": {
"application": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 5000,
"buckets": [
{
"key": "OneF0",
"doc_count": 1000,
"typelogs": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "error",
"doc_count": 334
},
{
"key": "info",
"doc_count": 333
},
{
"key": "warn",
"doc_count": 333
}
]
}
},
{
"key": "OneF1",
"doc_count": 1000,
"typelogs": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "error",
"doc_count": 333
},
{
"key": "info",
"doc_count": 334
},
{
"key": "warn",
"doc_count": 333
}
]
}
},
{
"key": "OneF2",
"doc_count": 1000,
"typelogs": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "error",
"doc_count": 332
},
{
"key": "info",
"doc_count": 333
},
{
"key": "warn",
"doc_count": 334
}
]
}
}
]
}
}
}
As you want to get the top 5 applications with the most errors, you can filter to keep only error logs in the query (you could also use a filter). Then you only need to order your sub-terms aggregation by descending count:
{
"size": 0,
"query": {
"term": {
"typeLog": "Error"
}
},
"aggs": {
"application": {
"terms": {
"field": "application",
"order": {
"_count": "desc"
},
"size": 5
},
"aggs": {
"typelogs": {
"terms": {
"field": "typeLog",
"order": {
"_count": "desc"
}
}
}
}
}
}
}
To keep all typeLogs, you may need to perform your query the other way around:
{
"size": 0,
"aggs": {
"typelogs": {
"terms": {
"field": "typeLog",
"order": {
"_count": "asc"
}
},
"aggs": {
"application": {
"terms": {
"field": "application",
"order": {
"_count": "desc"
},
"size": 5
}
}
}
}
}
}
You will have 3 first-level buckets, then the top 5 applications for each type of log.
Is there something wrong in my NEST query? My query gets all the data, but I need to filter it by multiple terms.
The Elasticsearch result I want:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1000,
"max_score": 0,
"hits": []
},
"aggregations": {
"log_query": {
"doc_count": 2,
"histogram_Log": {
"buckets": [
{
"key_as_string": "06/02/2015 12:00:00",
"key": 1423180800000,
"doc_count": 1
},
{
"key_as_string": "21/02/2015 12:00:00",
"key": 1424476800000,
"doc_count": 1
}
]
}
}
}
}
My Elasticsearch query:
{
"size": 0,
"aggs": {
"log_query": {
"filter": {
"bool": {
"must": [
{
"term": {
"cluster": "giauht1"
}
},
{
"term": {
"server": "hadoop0"
}
},
{
"term": {
"type": "Warn"
}
},
{
"range": {
"actionTime": {
"gte": "2015-02-01",
"lte": "2015-02-24"
}
}
}
]
}
},
"aggs": {
"histogram_Log": {
"date_histogram": {
"field": "actionTime",
"interval": "1d",
"format": "dd/MM/YYYY hh:mm:ss"
}
}
}
}
}
}
My NEST query:
Func<SearchDescriptor<LogInfoIndexView>, SearchDescriptor<LogInfoIndexView>> query =
que => que.Aggregations(aggs => aggs.Filter("log_query", fil =>
{
fil.Filter(fb => fb.Bool(fm => fm.Must(
ftm =>
{
ftm.Term(t => t.Cluster, cluster);
ftm.Term(t => t.Server, server);
ftm.Term(t => t.Type, logLevel);
ftm.Range(r => r.OnField("actionTime").GreaterOrEquals(from.Value).LowerOrEquals(to.Value));
return ftm;
}))).Aggregations(faggs => faggs.DateHistogram("histogram_Log", dr =>
{
dr.Field("actionTime");
dr.Interval("1d");
dr.Format("dd/MM/YYYY hh:mm:ss");
return dr;
}));
return fil;
})).Size(0).Type(new LogInfoIndexView().TypeName);
var result = client.Search(query);
My NEST result:
My model mapping:
{
"onef-sora": {
"mappings": {
"FPT.OneF.Api.Log": {
"properties": {
"actionTime": {
"type": "date",
"format": "dateOptionalTime"
},
"application": {
"type": "string",
"index": "not_analyzed"
},
"cluster": {
"type": "string",
"index": "not_analyzed"
},
"detail": {
"type": "string",
"index": "not_analyzed"
},
"iD": {
"type": "string"
},
"message": {
"type": "string",
"index": "not_analyzed"
},
"server": {
"type": "string",
"index": "not_analyzed"
},
"source": {
"type": "string",
"index": "not_analyzed"
},
"tags": {
"type": "string",
"index": "not_analyzed"
},
"type": {
"type": "string",
"index": "not_analyzed"
},
"typeLog": {
"type": "string"
},
"typeName": {
"type": "string"
},
"url": {
"type": "string",
"index": "not_analyzed"
},
"user": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
The Must() condition passed to the Bool() filter takes a params Func&lt;FilterDescriptor&lt;T&gt;, FilterContainer&gt;[], but in your filter the Term() and Range() filters are chained onto the same filter instance. Unfortunately, this doesn't work as you might expect: the end result is actually an empty JSON object passed to the must clause in the query DSL for the filter, i.e. you end up with
{
"size": 0,
"aggs": {
"log_query": {
"filter": {
"bool": {
"must": [
{} /* where are the filters?! */
]
}
},
"aggs": {
"histogram_Log": {
"date_histogram": {
"field": "actionTime",
"interval": "1d",
"format": "dd/MM/YYYY hh:mm:ss"
}
}
}
}
}
}
The solution is to pass an array of Func&lt;FilterDescriptor&lt;T&gt;, FilterContainer&gt;. The following matches your query DSL:
// LINQPad-style snippet; assumes a NEST 1.x client (using Nest; using System.Text; etc.)
void Main()
{
var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
var connection = new InMemoryConnection(settings);
var client = new ElasticClient(connection: connection);
DateTime? from = new DateTime(2015, 2,1);
DateTime? to = new DateTime(2015, 2, 24);
var docs = client.Search<LogInfoIndexView>(s => s
.Size(0)
.Type("type")
.Aggregations(a => a
.Filter("log_query", f => f
.Filter(ff => ff
.Bool(b => b
.Must(m => m
.Term(t => t.Cluster, "giauht1"),
m => m
.Term(t => t.Server, "hadoop0"),
m => m
.Term(t => t.Type, "Warn"),
m => m
.Range(r => r.OnField("actionTime").GreaterOrEquals(from.Value).LowerOrEquals(to.Value))
)
)
)
.Aggregations(aa => aa
.DateHistogram("histogram_Log", da => da
.Field("actionTime")
.Interval("1d")
.Format("dd/MM/YYYY hh:mm:ss")
)
)
)
)
);
Console.WriteLine(Encoding.UTF8.GetString(docs.RequestInformation.Request));
}
public class LogInfoIndexView
{
public string Cluster { get; set; }
public string Server { get; set; }
public string Type { get; set; }
public DateTime ActionTime { get; set; }
}
returning
{
"size": 0,
"aggs": {
"log_query": {
"filter": {
"bool": {
"must": [
{
"term": {
"cluster": "giauht1"
}
},
{
"term": {
"server": "hadoop0"
}
},
{
"term": {
"type": "Warn"
}
},
{
"range": {
"actionTime": {
"lte": "2015-02-24T00:00:00.000",
"gte": "2015-02-01T00:00:00.000"
}
}
}
]
}
},
"aggs": {
"histogram_Log": {
"date_histogram": {
"field": "actionTime",
"interval": "1d",
"format": "dd/MM/YYYY hh:mm:ss"
}
}
}
}
}
}
EDIT:
In answer to your comment: the difference between a filtered query filter and a filter aggregation is that the former applies the filtering to all documents at the start of the query phase, and filters are generally cached, improving performance on subsequent queries with those filters. The latter applies only in the scope of the aggregation, filtering the documents in the current context into a single bucket. If your query exists only to perform the aggregation and you're likely to run the aggregation with the same filters, I think the filtered query should offer better performance.
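For comparison, the same conditions expressed as a filtered query (a sketch in Elasticsearch 1.x syntax, matching the versions used above) would look roughly like this:

```
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "cluster": "giauht1" } },
            { "term": { "server": "hadoop0" } },
            { "term": { "type": "Warn" } },
            { "range": { "actionTime": { "gte": "2015-02-01", "lte": "2015-02-24" } } }
          ]
        }
      }
    }
  },
  "aggs": {
    "histogram_Log": {
      "date_histogram": {
        "field": "actionTime",
        "interval": "1d",
        "format": "dd/MM/YYYY hh:mm:ss"
      }
    }
  }
}
```

Here the filter runs once in the query phase and can be cached, and the date histogram then aggregates only the matching documents.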
I have an Elasticsearch database with some documents in it. Each document has its own timestamp field.
I currently have a WebApi which requires two timestamps, startTime and endTime. The WebApi simply performs a query on ES to grab the documents whose timestamps are in the given range.
This is my current query:
var readRecords = ElasticClient.Search<SegmentRecord>(s => s
    .Index(ElasticIndexName)
    .Filter(f => f
        .Range(i => i
            .OnField(a => a.DateTime)
            .GreaterOrEquals(startTime)
            .LowerOrEquals(endTime)))
    .Size(MaximumNumberOfReturnedDocs)
    .SortAscending(p => p.DateTime))
    .Documents;
Very simple, it's basically a range query based on the startTime and endTime parameters. And it works. :-)
Now the problem is: I also need to retrieve the latest document with a timestamp lower than startTime.
So basically the final query should be:
all the document in the range [startTime, endTime]
AND
the latest document in time which has a timestamp < startTime
the first part can obviously return any number of records: zero, one, or many
the second part should return just one document (or zero, if no document exists prior to startTime)
Something like this is what I meant in my comment above:
{
"query": {
"filtered": {
"filter": {
"range": {
"time": {
"gte": "2015-06-04",
"lte": "2015-06-05"
}
}
}
}
},
"aggs": {
"global_all_docs_agg": {
"global": {},
"aggs": {
"filter_for_min": {
"filter": {
"range": {
"time": {
"lte": "2015-06-04"
}
}
},
"aggs": {
"min_date": {
"top_hits": {
"size": 1,
"sort": [
{
"time": "asc"
}
]
}
}
}
}
}
}
}
}
The result looks like this:
"hits": [
{
"_index": "sss",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"time": "2015-06-05"
}
},
{
"_index": "sss",
"_type": "test",
"_id": "2",
"_score": 1,
"_source": {
"time": "2015-06-04"
}
},
{
"_index": "sss",
"_type": "test",
"_id": "4",
"_score": 1,
"_source": {
"time": "2015-06-05"
}
}
]
},
"aggregations": {
"global_all_docs_agg": {
"doc_count": 6,
"filter_for_min": {
"doc_count": 4,
"min_date": {
"hits": {
"total": 4,
"max_score": null,
"hits": [
{
"_index": "sss",
"_type": "test",
"_id": "5",
"_score": null,
"_source": {
"time": "2015-06-01"
},
"sort": [
1433116800000
]
}
]
}
}
}
}
}
The list between startTime and endTime is under hits. The minimum lower than startTime is under aggregations.