I recently started using NEST, the Elasticsearch client for C#, and I'm having trouble writing queries that match partial similarities such as "book" and "books". When a document contains "books", searching for "book" doesn't find it.
Here is my code:
var articles = client.Search<ProductResult>(s => s
.From(0)
.Size(1000)
.MatchAll()
.Query(q => q.QueryString(d => d
.Query(query)
)));
Try analyzing your fields with a stemming analyzer like snowball, which will try its best to reduce words to their root form. For example, books and booking => book; jumps and jumping => jump. The algorithm behind it isn't perfect and will trip up on irregular words and plural forms, but for the most part it works very well (for most European languages).
You can apply a different analyzer when you initially create your index, or on an existing index using the update mapping API. Either way, you'll have to reindex your documents for the new analysis to take effect.
Create index example using NEST:
client.CreateIndex("yourindex", c => c
...
.AddMapping<YourType>(m => m
.MapFromAttributes()
.Properties(ps => ps
.String(s => s.Name("fieldname").Analyzer("snowball"))
...
)
)
);
Update mapping example:
client.Map<YourType>(m => m
.MapFromAttributes()
.Index("yourindex")
.Properties(ps => ps
.String(s => s.Name("fieldname").Analyzer("snowball"))
...
)
);
Here's some really great info on algorithmic stemmers in The Definitive Guide.
You can also use a Fuzzy query:
var articles = client.Search<ProductResult>(s => s
.From(0)
.Size(1000)
.Query(q => q
.Fuzzy(fz => fz.OnField("field").Value("book").MaxExpansions(2))
));
I've been having some trouble creating a query that searches through several different fields.
I got the results I wanted by running several queries, but for the sake of performance I want to do this in just one query, if possible.
I've tried setting the query up with several .Should clauses, but it seems to search for documents that match every field, which I assume is the intended behaviour.
It looks like this:
.From(0)
.Sort(sort => sort
.Field("priority", SortOrder.Descending))
.Sort(sort => sort
.Ascending(a => a.ItemNumber.Suffix("keyword")))
.Sort(sort => sort
.Descending(SortSpecialField.Score))
.TrackScores(true)
.Size(25)
.Query(qe => qe
.Bool(b => b
.Should(m => m
.Match(ma => ma
.Boost(1.1)
.Field("itemnumber")
.Query(ItemNumber)
))
.Should(m => m
.Match(ma => ma
.Boost(1.1)
.Field("itemnumber2")
.Query(ItemNumber)))
.Should(m => m
.Match(ma => ma
.Boost(1.1)
.Field("ean")
.Query(ItemNumber)))
.Should(m => m
.Match(ma => ma
.Boost(1.1)
.Field("itemalias")
.Query(ItemNumber)))
)));
What I want it to do is search through itemnumber and see if a document matches; if not, search through itemnumber2, and so on.
Is there an efficient way to do this in just one query?
I believe a should clause with multiple parts has to be given as an array of queries passed to a single .Should call. The way you wrote it, you just add multiple separate should clauses. What you want should probably look like this:
.Bool(b => b
.Should(
m => m
.Match(ma => ma
.Boost(1.1)
.Field("itemnumber")
.Query(ItemNumber)),
m => m
.Match(ma => ma
.Boost(1.1)
.Field("itemnumber2")
.Query(ItemNumber)),
m => m
.Match(ma => ma
.Boost(1.1)
.Field("ean")
.Query(ItemNumber)),
m => m
.Match(ma => ma
.Boost(1.1)
.Field("itemalias")
.Query(ItemNumber)))
)
More here
Have you tried using a MultiMatch query instead? https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html
This would allow you to search your documents as you would with a Match query, but you should be able to specify multiple fields to search in.
This would allow you to ditch the bool query.
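To make the suggestion concrete, here is a minimal sketch of the request body a multi_match query over the four fields from the question would send. It builds the JSON with System.Text.Json instead of NEST so that it runs on its own; the sample search term and the ^N boost values are assumptions for illustration, not something from the original post. The boosts only approximate "try itemnumber first" by making a hit on an earlier field outrank a hit on a later one.

```csharp
using System;
using System.Text.Json;

// Hypothetical search term; in the question this is the ItemNumber variable.
string itemNumber = "ABC-123";

// Request body for a multi_match query over the four fields from the question.
// The ^N boosts are illustrative assumptions: a hit on "itemnumber" should
// outrank a hit on the fields listed after it.
var body = new
{
    query = new
    {
        multi_match = new
        {
            query = itemNumber,
            type = "best_fields", // score each document by its best-matching field
            fields = new[] { "itemnumber^4", "itemnumber2^3", "ean^2", "itemalias" }
        }
    }
};

string json = JsonSerializer.Serialize(body, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
```

The printed JSON can be pasted into Kibana to try the query before translating it into the NEST fluent syntax.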
Which one is more efficient considering speed?
This one:
var studentWithBatchName =
db.Student.AsNoTracking()
.Include(c => c.Department)
.Include(c => c.Department.Section)
.Include(c => c.Department.Section.Batch)
.Select(c => new { c.Name, BatchName = c.Department.Section.Batch.Name });
or this one:
var studentWithBatchName =
db.Student.AsNoTracking()
.Include(c => c.Department.Section.Batch)
.Select(c => new { c.Name, BatchName = c.Department.Section.Batch.Name });
The Include statement just fetches the data from the related data sources/tables. If you write Include(c => c.Department.Section.Batch), you get the values along the whole .Department.Section.Batch path.
If you write .Include(c => c.Department).Include(c => c.Department.Section).Include(c => c.Department.Section.Batch), it would theoretically add three joins to the query.
I don't know whether .NET detects and collapses the redundant includes, but I would use only Include(c => c.Department.Section.Batch) when that is the only value you need.
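As far as I know, when a query ends in a Select projection, Entity Framework builds the joins it needs from the navigation path inside the Select, so the Include calls make no difference to the result in either version. Below is a minimal sketch of that projection shape. It runs against an in-memory array standing in for db.Student (an assumption so the example is self-contained); the sample names and batch values are made up.

```csharp
using System;
using System.Linq;

// In-memory stand-in for db.Student (illustration only); with Entity
// Framework the same Select would be applied to the DbSet instead.
var students = new[]
{
    new { Name = "Ana",  Department = new { Section = new { Batch = new { Name = "2021-A" } } } },
    new { Name = "Bram", Department = new { Section = new { Batch = new { Name = "2022-B" } } } },
};

// Project only the two values that are needed. The anonymous type needs
// distinct member names, hence BatchName.
var studentWithBatchName = students
    .Select(c => new { c.Name, BatchName = c.Department.Section.Batch.Name })
    .ToList();

foreach (var row in studentWithBatchName)
    Console.WriteLine($"{row.Name}: {row.BatchName}");
```

Projecting this way also keeps the result to the two selected columns instead of loading whole entities.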
I found .ProjectTo to be faster. Include acts like an outer join, which makes the query large and puts load on the server; .ProjectTo<> can be used instead.
.ProjectTo<> is available through AutoMapper:
using AutoMapper.QueryableExtensions;
E.g. var response = await db.TableName.ProjectTo<TDestination>().ToListAsync(); (where TDestination stands for the type you are mapping to)
I am trying to implement a "Did you mean" feature, like Google's, in my Windows desktop application.
I have created a proof of concept that inserts "Name" and "Description" into my index, say "MyIndex".
I am able to do full-text search, but I can't get something like "Did you mean" to work.
Here is a code snippet that I found in the NEST documentation and that I don't understand:
s => s
.Suggest(ss => ss
.Term("my-term-suggest", t => t
.MaxEdits(1)
.MaxInspections(2)
.MaxTermFrequency(3)
.MinDocFrequency(4)
.MinWordLength(5)
.PrefixLength(6)
.SuggestMode(SuggestMode.Always)
.Analyzer("standard")
.Field(p => p.Name)
.ShardSize(7)
.Size(8)
.Text("hello world")
)
.Completion("my-completion-suggest", c => c
.Contexts(ctxs => ctxs
.Context("color",
ctx => ctx.Context(Project.First.Suggest.Contexts.Values.SelectMany(v => v).First())
)
)
.Fuzzy(f => f
.Fuzziness(Fuzziness.Auto)
.MinLength(1)
.PrefixLength(2)
.Transpositions()
.UnicodeAware(false)
)
.Analyzer("simple")
.Field(p => p.Suggest)
.Size(8)
.Prefix(Project.Instance.Name)
)
.Phrase("my-phrase-suggest", ph => ph
.Collate(c => c
.Query(q => q
.Inline("{ \"match\": { \"{{field_name}}\": \"{{suggestion}}\" }}")
.Params(p => p.Add("field_name", "title"))
)
.Prune()
)
.Confidence(10.1)
.DirectGenerator(d => d
.Field(p => p.Description)
)
.GramSize(1)
.Field(p => p.Name)
.Text("hello world")
.RealWordErrorLikelihood(0.5)
)
)
What is "color" doing here?
Also, what is "ctx => ctx.Context(Project.First.Suggest.Contexts.Values.SelectMany(v => v).First())"
and what is ".Prefix(Project.Instance.Name)"?
Am I on the right path?
Please help.
The "Did you mean" feature is most likely the Term suggester (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-term.html).
Completion is autocompletion: when you type "so" in the search box, it gives you "sony", "soly", etc. So you won't need it for this feature.
(https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html)
Phrase is a more advanced version of the Term suggester: as the documentation says, it builds on the term suggester to pick whole corrected phrases instead of individual tokens.
(https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-phrase.html)
This is the snippet you need in NEST:
s => s
.Suggest(ss => ss
.Term("my-term-suggest", t => t
.MaxEdits(1)
.MaxInspections(2)
.MaxTermFrequency(3)
.MinDocFrequency(4)
.MinWordLength(5)
.PrefixLength(6)
.SuggestMode(SuggestMode.Always)
.Analyzer("standard")
.Field(p => p.Name)
.ShardSize(7)
.Size(8)
.Text("hello world")
))
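To give a feel for what MaxEdits limits in the snippet above, here is a small self-contained sketch of edit distance, the measure a term suggester uses to decide which indexed terms are close enough to the typed word. It is a plain Levenshtein implementation for illustration only: Elasticsearch's default distance is an optimized variant that also takes transpositions into account, and the sample words and the maxEdits value are assumptions.

```csharp
using System;

// Classic Levenshtein distance: the minimum number of single-character
// insertions, deletions and substitutions needed to turn a into b.
static int EditDistance(string a, string b)
{
    var prev = new int[b.Length + 1];
    var curr = new int[b.Length + 1];
    for (int j = 0; j <= b.Length; j++) prev[j] = j; // distance from the empty prefix of a

    for (int i = 1; i <= a.Length; i++)
    {
        curr[0] = i; // distance to the empty prefix of b
        for (int j = 1; j <= b.Length; j++)
        {
            int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
        }
        (prev, curr) = (curr, prev); // reuse the two rows
    }
    return prev[b.Length];
}

// Hypothetical input: a misspelled word and a few terms assumed to be in the index.
string typed = "helo";
string[] indexedTerms = { "hello", "help", "world" };
int maxEdits = 1; // plays the role of .MaxEdits(1)

foreach (var term in indexedTerms)
{
    if (EditDistance(typed, term) <= maxEdits)
        Console.WriteLine($"Did you mean: {term}?");
}
```

A larger MaxEdits lets more distant terms through, which gives more suggestions at the cost of more noise.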
I wrote a query against an Elasticsearch server to remove all documents with an old "BatchVersion". After thinking about it, to be safe, I want to delete every record whose "BatchVersion" does not equal the current one. Here is my current code:
_client.DeleteByQuery<Data.ElasticSearch.Employee>(s => s
.Index(indexName)
.Size(1000)
.Query(q => q.
Bool(b => b.
MustNot(mn => mn.
Match(m => m.Field("BatchVersion").
Query([newVersionId]))))));
When the code is run, no records are deleted. Any ideas?
I had to use DefaultField for it to work. I used Kibana to figure it out.
_client.DeleteByQuery<Employee>(s => s
.Index(indexName)
.Size(1000)
.Query(q => q.
Bool(b => b.
MustNot(mn => mn.
QueryString(qs => qs.DefaultField("batchVersion").Query(newVersionId.ToString()))))));
I want to run a search that matches multiple values (an array of values), like this:
var result1 = _client.Search<type1>(s => s
.Fields(f => f.trip_id)
.Query(q => q
.Terms(t => t.arg1, value1)).Take(_allData))
.Documents.Select(d => d.arg2).ToArray();
var result2 = _client.Search<type2>(s => s
.Query(q => q
.Terms(t => t.arg3, result1))
.Take(_allData)
).Documents.Select(s => s.ar3).ToList();
How can I do this? I was thinking about facets, but I don't see how they would help.
The only approach that works for now is a foreach loop, which is not very efficient.
Thanks for your help.
You can express multiple queries like so:
.Query(q=>q.Terms(t=>t.arg3, result1) && q.Terms(t=>t.arg1, value1))
Be sure to read the documentation on writing queries to discover all the good stuff NEST has to offer.
Orelus,
I'd like to use your solution with
.And( af=>af.Term(...), af=>af.Term(...) )
I don't understand where this fits. Here's an example of my non-working filter:
var results = client.Search<music>(s => s
.Query(q => q
.Filtered(f => f.
Filter(b => b.Bool(m => m.Must(
t => t
.Term(p => p.artist, artist)
&& t.Term(p2 => p2.year, year)
)
)
)
)
)
);