Configure standard stop words for ElasticSearch using NEST

I'm having trouble configuring stop words in ElasticSearch using the NEST client. Here's what my index definition looks like:
var createIndexResponse = _client.CreateIndex(IndexName, c => c
    .Settings(s => s
        .Analysis(a => a
            .Analyzers(aa => aa.Stop("pfstop", st => st.StopWords("_english_"))
            )
        )
    )
    .Mappings(m => m
        .Map<SearchTopic>(mm => mm
            .Properties(p => p
                .Text(t => t
                    .Name(n => n.Posts)
                    .Name(n => n.FirstPost)
                    .Name(n => n.Title)
                    .SearchAnalyzer("pfstop")
                )
            )
        )
    )
);
And here's my query (and yes, I'm only wanting to return the ID):
var searchResponse = _client.Search<SearchTopic>(s => s
    .Source(sf => sf.Includes(i => i.Fields(f => f.Id)))
    .Query(q => q.MultiMatch(m => m.Query(searchTerm)
        .Fields(f => f
            .Field(x => x.Title, boost: 20)
            .Field(x => x.FirstPost, boost: 2)
            .Field(x => x.Posts))))
    .Take(pageSize)
    .Skip(startRow));
If my searchTerm is "Simon and Diana," I get results from any row that has "and" in it, which should be filtered out by way of the stop words.

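Fluent syntax strikes again. After some experimentation, starting with just one field in the mapping, I learned that you have to split out the fields, giving each one its own .Text() mapping and analyzer pairing (note that the working version also uses .Analyzer rather than .SearchAnalyzer). The correct syntax was this: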
.Mappings(m => m
    .Map<SearchTopic>(mm => mm
        .Properties(p => p
            .Text(t => t
                .Name(n => n.Posts)
                .Analyzer("pfstop")
            )
            .Text(t => t
                .Name(n => n.FirstPost)
                .Analyzer("pfstop")
            )
            .Text(t => t
                .Name(n => n.Title)
                .Analyzer("pfstop")
            )
        )
    )
)
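
To make the expected behavior concrete, here is a rough, self-contained sketch of what a stop analyzer does to the search term from the question. It does not use NEST or Elasticsearch; the class name StopWordSketch and the Analyze method are made up for illustration, and the word list is what I understand the default English ("_english_") set to be, so check it against your Elasticsearch version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StopWordSketch
{
    // Assumed contents of the "_english_" stop word set (verify for your version).
    private static readonly HashSet<string> EnglishStopWords = new HashSet<string>
    {
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
        "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
        "that", "the", "their", "then", "there", "these", "they", "this",
        "to", "was", "will", "with"
    };

    // Simplified stand-in for the stop analyzer:
    // split on non-letters, lowercase each token, then drop stop words.
    public static List<string> Analyze(string text)
    {
        return Regex.Matches(text, @"\p{L}+")
            .Cast<Match>()
            .Select(m => m.Value.ToLowerInvariant())
            .Where(token => !EnglishStopWords.Contains(token))
            .ToList();
    }

    public static void Main()
    {
        // "and" is removed, so it can no longer match documents on its own.
        Console.WriteLine(string.Join(", ", Analyze("Simon and Diana")));
        // prints: simon, diana
    }
}
```

If a field is not mapped with such an analyzer, the query text for that field still contains "and", which would explain the unwanted matches described in the question.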

Related

SQL IN clause - NEST C# - ElasticSearch Terms not working with anothers filters

I'm using NEST 6.0.1 with the same Elasticsearch version.
I'm trying to do a select using .Terms. On its own it works fine, but combined with other filters in .Must, it seems that .Terms is ignoring the other .Must filters.
// Parameters coming into the request method:
// ElasticClient client, int maximumRows, string jobId, string merchantId, string category, ICollection<int> priorityFilterCollection
var searchResponse = client.Search<LogEntity>(s => s
    .From(0)
    .Size(maximumRows)
    .Query(q => q
        .Bool(b => b
            .Must(
                sd => sd.MatchPhrase(m => m
                    .Field(f => f.JobId)
                    .Query(jobId)
                )
            )
            .Must(
                sd => sd.MatchPhrase(m => m
                    .Field(f => f.MerchantId)
                    .Query(merchantId)
                )
            )
            .Must(
                sd => sd.MatchPhrase(m => m
                    .Field(f => f.Category)
                    .Query(category)
                )
            )
            .Must(
                sd => sd.Terms(m => m
                    .Field(f => f.Priority)
                    .Terms<int>(priorityFilterCollection)
                )
            )
        )
    )
);

A call to Must on the bool query descriptor is assignative, so with multiple calls only the last one is assigned to the must clause. The bool query needs to be rewritten to pass all the must clauses in a single call:
var searchResponse = client.Search<LogEntity>(s => s
    .From(0)
    .Size(maximumRows)
    .Query(q => q
        .Bool(b => b
            .Must(
                sd => sd.MatchPhrase(m => m
                    .Field(f => f.JobId)
                    .Query(jobId)
                ),
                sd => sd.MatchPhrase(m => m
                    .Field(f => f.MerchantId)
                    .Query(merchantId)
                ),
                sd => sd.MatchPhrase(m => m
                    .Field(f => f.Category)
                    .Query(category)
                ),
                sd => sd.Terms(m => m
                    .Field(f => f.Priority)
                    .Terms<int>(priorityFilterCollection)
                )
            )
        )
    )
);
Check out the Writing bool queries documentation.
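
The "assignative" behavior can be illustrated without NEST. The sketch below uses a hypothetical ToyBoolDescriptor class (not NEST's real type, and with plain strings standing in for query clauses) whose Must method replaces the stored clauses on every call, which is the behavior described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a fluent bool query descriptor; not NEST's real type.
public class ToyBoolDescriptor
{
    public List<string> MustClauses { get; private set; } = new List<string>();

    // Assigns (replaces) the must clauses instead of appending to them.
    public ToyBoolDescriptor Must(params string[] clauses)
    {
        MustClauses = new List<string>(clauses);
        return this;
    }
}

public static class MustSketch
{
    public static void Main()
    {
        // Four separate calls: each one overwrites the previous assignment.
        var chained = new ToyBoolDescriptor()
            .Must("jobId")
            .Must("merchantId")
            .Must("category")
            .Must("priority");
        Console.WriteLine(string.Join(", ", chained.MustClauses));
        // prints: priority

        // One call with all clauses: everything is kept.
        var single = new ToyBoolDescriptor()
            .Must("jobId", "merchantId", "category", "priority");
        Console.WriteLine(string.Join(", ", single.MustClauses));
        // prints: jobId, merchantId, category, priority
    }
}
```

In the question's query, this is why only the final .Must(...) with the Terms clause appeared to take effect.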

How do you get mixed results when searching multiple types with ElasticSearch 2.x using NEST?

I'm pretty new to Elasticsearch and stumbled upon this issue.
When searching multiple document types in the same index, the results are appended type by type in the result set instead of being sorted by the boosted field.
My document types share the same fields.
This is my search query:
var response = await client.SearchAsync<ProductListResponse>(s => s
    .Type("product,productbundle")
    .Index(index)
    .From(from)
    .Size(size)
    .Query(fsq => fsq
        .FunctionScore(c => c.Query(q => q
                .MultiMatch(m => m.Query(request.Query)
                    .Fields(f => f
                        .Field(n => n.Name, 100.0)
                        .Field(n => n.NameWithoutSpecialChars, 100.0)
                        .Field(n => n.ProductName)
                        .Field(n => n.TeaserText)
                        .Field(n => n.Description)
                        .Field(n => n.Features)
                        .Field(n => n.Modules)
                    )
                    .Type(TextQueryType.PhrasePrefix)
                )
            ).Functions(f => f
                .FieldValueFactor(b => b
                    .Field(p => p.IsBoosted)
                    .Modifier(FieldValueFactorModifier.Log1P))
            )
        )
    )
    .Sort(ss => ss
        .Descending(SortSpecialField.Score))
    .PostFilter(filter => filter.Bool(b => b.Must(must => allFilters)))
    .Source(sr => sr
        .Include(fi => fi
            .Field(f => f.Name)
            .Field(n => n.ProductName)
            .Field(n => n.TeaserText)
            .Field(f => f.Image)
            .Field(f => f.Thumbnail)
            .Field(f => f.Url)
            .Field(f => f.Features)
        )
    )
);
Any help is appreciated.
I would prefer not to adapt the product type with the additions made to the productbundle type.

I can confirm that .Type() does not mess with the order.
My issue was a boosted property not getting a value while indexing the bundles.

How to reduce data at reply from elastic by GET

I query data from Elasticsearch and get back a large amount of information.
I would like to get only two properties with their values (key-value pairs), timestamp and value, but I get all the other information too.
How can I request only the properties I want? I tried what I read at elastic.co, but I still always get the full set of data.
Here are my attempts:
var result = ElasticClient.Search<_doc>(document =>
    document
        .Source(sf => sf
            .Includes(i => i
                .Fields(
                    f => f.Timestamp,
                    f => f.Value
                )
            )
            .Excludes(e => e
                .Fields(
                    f => f.ContextName
                )
            )
        )
        .Query(q => q
            .Match(m => m
                .Field(f => f.DataRecordId)
                .Query(search)
            )
        )
);
Or:
var result = ElasticClient.Search<_doc>(document =>
    document
        .StoredFields(sf => sf
            .Fields(
                f => f.Timestamp,
                f => f.Value
            )
        )
        .Query(q => q
            .Match(m => m
                .Field(f => f.DataRecordId)
                .Query(search)
            )
        )
);
Both return a big bag of data, much more than only Timestamp and Value.

So I still get all the properties, but the excluded ones are null. I hope this reduces traffic and improves performance.
I'm open to other solutions.
var result = ElasticClient.Search<_doc>(document =>
    document
        .Source(src => src
            .Includes(i => i
                .Fields(
                    p => p.Timestamp,
                    p => p.Value
                )
            )
            .Excludes(e => e
                .Fields(
                    p => p.ComponentId,
                    p => p.ContextName,
                    p => p.DataRecordId,
                    p => p.ResourceId
                )
            )
        )
        .Query(q => q
            .Match(m => m
                .Field(f => f.DataRecordId)
                .Query(search)
            )
        )
);

Remove duplicate UserIDs (field) from QueryContainer

I am having some trouble with a certain query in Elasticsearch with C#.
I have a QueryContainer with an inner QueryDescriptor and a lot of inner QueryContainers and QueryDescriptors,
but one main QueryContainer, this._QueryContainer, contains all the data.
The problem is that the UserID field is not unique in this._QueryContainer. When I return the first 20 unique users everything is fine, but for the next 20 users (the next page) I don't know where this.From should start,
because this._QueryContainer contains duplicates while the aggregation returns unique values, so the two conflict.
Is there a way to make the query distinct from the start?
results = Client.Search<Users>(s => s
    .From(this.From)
    .Query(this._QueryContainer)
    .Aggregations(a => a
        .Terms("unique", te => te
            .Field(p => p.UserID)
        )
    )
    .Size(20)
);

The .From() and .Size() within your query do not affect the Terms aggregation that you have, they apply only to the .Query() part and the hits returned from this.
If you need to return lots of values from a Terms aggregation, which is what I think you'd like to do, you can
1. Use partitioning to filter values, returning a large number of terms in multiple requests, e.g.
var response = client.Search<Users>(s => s
    .Aggregations(a => a
        .Terms("unique", st => st
            .Field(p => p.UserID)
            .Include(partition: 0, numberOfPartitions: 10)
            .Size(10000)
        )
    )
);
// get the next partition
response = client.Search<Users>(s => s
    .Aggregations(a => a
        .Terms("unique", st => st
            .Field(p => p.UserID)
            .Include(partition: 1, numberOfPartitions: 10)
            .Size(10000)
        )
    )
);
2. Use a Composite Aggregation with a Terms value source
var response = client.Search<Users>(s => s
    .Aggregations(a => a
        .Composite("composite", c => c
            .Sources(so => so
                .Terms("unique", st => st
                    .Field(p => p.UserID)
                )
            )
        )
    )
);
// the following would be in a loop, to get all terms
var lastBucket = response.Aggregations.Composite("composite").Buckets.LastOrDefault();
if (lastBucket != null)
{
    // get next set of terms
    response = client.Search<Users>(s => s
        .Aggregations(a => a
            .Composite("composite", c => c
                .Sources(so => so
                    .Terms("unique", st => st
                        .Field(p => p.UserID)
                    )
                )
                .After(lastBucket.Key)
            )
        )
    );
}

NEST ElasticSearch for Did you mean feature

I am trying to implement a "Did you mean" feature, like Google's, in my Windows desktop application.
I have created a POC that inserts "Name" and "Description" into my index, say "MyIndex".
I am able to do full-text search, but I can't get something like "Did you mean" to work.
Here is a code snippet that I found in the NEST documentation and am unable to understand:
s => s
    .Suggest(ss => ss
        .Term("my-term-suggest", t => t
            .MaxEdits(1)
            .MaxInspections(2)
            .MaxTermFrequency(3)
            .MinDocFrequency(4)
            .MinWordLength(5)
            .PrefixLength(6)
            .SuggestMode(SuggestMode.Always)
            .Analyzer("standard")
            .Field(p => p.Name)
            .ShardSize(7)
            .Size(8)
            .Text("hello world")
        )
        .Completion("my-completion-suggest", c => c
            .Contexts(ctxs => ctxs
                .Context("color",
                    ctx => ctx.Context(Project.First.Suggest.Contexts.Values.SelectMany(v => v).First())
                )
            )
            .Fuzzy(f => f
                .Fuzziness(Fuzziness.Auto)
                .MinLength(1)
                .PrefixLength(2)
                .Transpositions()
                .UnicodeAware(false)
            )
            .Analyzer("simple")
            .Field(p => p.Suggest)
            .Size(8)
            .Prefix(Project.Instance.Name)
        )
        .Phrase("my-phrase-suggest", ph => ph
            .Collate(c => c
                .Query(q => q
                    .Inline("{ \"match\": { \"{{field_name}}\": \"{{suggestion}}\" }}")
                    .Params(p => p.Add("field_name", "title"))
                )
                .Prune()
            )
            .Confidence(10.1)
            .DirectGenerator(d => d
                .Field(p => p.Description)
            )
            .GramSize(1)
            .Field(p => p.Name)
            .Text("hello world")
            .RealWordErrorLikelihood(0.5)
        )
    )
What is "color" doing here?
Also, what is "ctx => ctx.Context(Project.First.Suggest.Contexts.Values.SelectMany(v => v).First())"
and ".Prefix(Project.Instance.Name)"?
Am I on the right path?
Please help.

The "Did you mean" feature most likely corresponds to the Term suggester (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-term.html)
Completion is autocompletion: when you type "so" in a search box it gives you "sony", "soly", etc. So you won't need it in this case.
(https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html)
Phrase is a more advanced version of the Term suggester; it selects entire corrected phrases rather than individual tokens and, as the documentation says, needs a suitable mapping up front.
(https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-phrase.html)
You need this snippet in NEST:
s => s
    .Suggest(ss => ss
        .Term("my-term-suggest", t => t
            .MaxEdits(1)
            .MaxInspections(2)
            .MaxTermFrequency(3)
            .MinDocFrequency(4)
            .MinWordLength(5)
            .PrefixLength(6)
            .SuggestMode(SuggestMode.Always)
            .Analyzer("standard")
            .Field(p => p.Name)
            .ShardSize(7)
            .Size(8)
            .Text("hello world")
        ))
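
For intuition about what the Term suggester does with a setting like MaxEdits, here is a small sketch that proposes indexed terms within a given edit distance of the typed word. It is only a toy: the class TermSuggestSketch, its methods and the sample terms are made up, it uses plain Levenshtein distance, and the real suggester also weighs term frequencies and the other options shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermSuggestSketch
{
    // Classic dynamic-programming Levenshtein distance between two strings.
    public static int EditDistance(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // Returns indexed terms that differ from the input by 1..maxEdits edits,
    // closest first, ties broken alphabetically.
    public static List<string> Suggest(string input, IEnumerable<string> indexedTerms, int maxEdits)
    {
        return indexedTerms
            .Select(term => new { Term = term, Distance = EditDistance(input, term) })
            .Where(x => x.Distance > 0 && x.Distance <= maxEdits)
            .OrderBy(x => x.Distance)
            .ThenBy(x => x.Term, StringComparer.Ordinal)
            .Select(x => x.Term)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical terms standing in for what is indexed in the Name field.
        var terms = new[] { "hello", "help", "world", "word" };
        Console.WriteLine(string.Join(", ", Suggest("helo", terms, 1)));
        // prints: hello, help
    }
}
```

In the actual feature, Elasticsearch does this lookup against the terms of the field passed to .Field(...), and the suggestions come back in the search response under the name given to .Term(...).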
