I'm a complete beginner to elasticsearch and I have been trying to use elasticsearch's completion suggester using Nest for auto-complete on a property.
Here is my mapping (as mentioned here):
var createResult = client.CreateIndex(indexName, index => index
.AddMapping<Contact>(tmd => tmd
.Properties(props => props
.Completion(s =>
s.Name(p => p.CompanyName.Suffix("completion"))
.IndexAnalyzer("standard")
.SearchAnalyzer("standard")
.MaxInputLength(20)
.Payloads()
.PreservePositionIncrements()
.PreserveSeparators())
)
)
);
var resultPerson = client.IndexMany(documents.OfType<Person>(), new SimpleBulkParameters { Refresh = true });
var resultCompany = client.IndexMany(documents.OfType<Company>(), new SimpleBulkParameters { Refresh = true });
While indexing I'm just using the IndexMany method, as shown above, and passing an IEnumerable<Contact> (Contact has a CompanyName property; Contact is an abstract class, and both Person and Company are concrete implementations of it). The search throws an exception saying ElasticSearchException[Field [companyName] is not a completion suggest field]. The query looks like this:
SearchDescriptor<Contact> descriptor = new SearchDescriptor<Contact>();
descriptor = descriptor.SuggestCompletion("suggest", c => c.OnField(f => f.CompanyName).Text(q));
var result = getElasticClientInstance("contacts").Search<Contact>(body => descriptor);
string qe = result.ConnectionStatus.ToString();
What am I doing wrong here? I looked into NEST's tests on SuggestCompletion, but they weren't much help: the tests only show how to get suggestions, not how to set up the index mappings for SuggestCompletion.
I also tried setting up an edgeNGram tokenizer as mentioned in this post, but couldn't get any further there either.
Any direction or an example on how to proceed would greatly help.
UPDATE
You are trying to create a property with the name "companyName.completion", but at that position it's not valid, so it will use the last token, "completion". It is therefore actually mapping a field called completion. Try changing the call to: .Name(p => p.CompanyName)
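For example, the CreateIndex call from the question would then become (a sketch reusing the same settings from the question; only the Name call changes):
var createResult = client.CreateIndex(indexName, index => index
    .AddMapping<Contact>(tmd => tmd
        .Properties(props => props
            .Completion(s => s
                .Name(p => p.CompanyName)          // map CompanyName itself, not a nested "completion" name
                .IndexAnalyzer("standard")
                .SearchAnalyzer("standard")
                .MaxInputLength(20)
                .Payloads()
                .PreservePositionIncrements()
                .PreserveSeparators()))));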
Other observations
You specify a mapping for the Contact but while indexing you use the Person and Company types.
In elasticsearch terms you mapped:
/index/contact/
but your documents are going into:
/index/person/ and /index/company
NEST won't automatically map all implementations of a specific class, and elasticsearch has no way of knowing the three are related.
I would refactor the mapping to a method and call it for all the types involved.
var createResult = client.CreateIndex(indexName, index => index
.AddMapping<Contact>(tmd => MapContactCompletionFields(tmd))
.AddMapping<Person>(tmd => MapContactCompletionFields(tmd))
.AddMapping<Company>(tmd => MapContactCompletionFields(tmd))
);
private RootObjectMappingDescriptor<TContact> MapContactCompletionFields<TContact>(
RootObjectMappingDescriptor<TContact> tmd)
where TContact : Contact
{
return tmd.Properties(props => props
.Completion(s => s
.Name(p => p.CompanyName)
.IndexAnalyzer("standard")
.SearchAnalyzer("standard")
.MaxInputLength(20)
.Payloads()
.PreservePositionIncrements()
.PreserveSeparators()
)
);
}
That method returns the descriptor so you can further chain on it.
Then when you do a search for contacts:
var result = getElasticClientInstance("contacts").Search<Contact>(
body => descriptor
.Types(typeof(Person), typeof(Company))
);
That types hint will cause the search to look in /index/person and /index/company, and NEST will know how to give you back a covariant list of documents.
So you can do result.Documents.OfType<Person>() after the previous call.
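Putting it together, a search that targets both types and uses the completion suggester on the corrected field could look like this (a sketch reusing the getElasticClientInstance helper and the q text variable from the question):
var result = getElasticClientInstance("contacts").Search<Contact>(s => s
    .Types(typeof(Person), typeof(Company))
    .SuggestCompletion("suggest", c => c
        .OnField(f => f.CompanyName)
        .Text(q)));

// the covariant result can be filtered back into concrete types
var people = result.Documents.OfType<Person>();
var companies = result.Documents.OfType<Company>();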
Related
I have been reading up on the point in time API and wanted to implement it using NEST in my .NET application. However, when reading that article (the .net application hyperlink), I saw the Fluent DSL example shown below. Is there a way to find that ID without having to go to the Kibana console and run a query to get the id and then place it inside "a-point-in-time-id"? Or does "a-point-in-time-id" do that for you, i.e. is it mapped to the ID?
s => s
.PointInTime("a-point-in-time-id", p => p
.KeepAlive("1m"))
I know that in the Kibana console, if you run:
POST /app-logs*/_pit?keep_alive=5m
it will give you a PIT (point in time) id. How do you go about retrieving that in NEST?
And when reading up on search_after and attempting to implement it using the search_after usage for the .NET client with the Fluent DSL example, I noticed that they use the word "Project", but the example does not say what "Project" is. What would that be exactly?
s => s
.Sort(srt => srt
.Descending(p => p.NumberOfCommits)
.Descending(p => p.Name)
)
.SearchAfter(
Project.First.NumberOfCommits,
Project.First.Name
)
Here I attempted to implement .PointInTime(), .Sort() and .SearchAfter(), but got stuck.
var response = await _elasticClient.SearchAsync<Source>(s => s
.Size(3000) // must see about this
.Source(src => src.Includes(i => i
.Fields(f => f.timestamp,
fields => fields.messageTemplate,
fields => fields.message)))
.Index("app-logs*"
.Query(q => +q
.DateRange(dr => dr
.Field("#timestamp")
.GreaterThanOrEquals("2021-06-12T16:39:32.727-05:00")
.LessThanOrEquals(DateTime.Now)))
.PointInTime("", p => p
.KeepAlive("5m"))
.Sort(srt => srt
.Ascending(p => p.timestamp))
.SearchAfter()
I know that when you are using a PIT id you do not need the index in the search, but the linked example does not show how you would go about implementing that, so I'm just a bit lost on how to do so. Any pointers/guidance/tutorials would be great!
I'm just trying to see how I can do this in NEST, but if you are saying it's part of X-Pack, I would understand.
I was able to combine PIT and SearchAfter with the code sample below:
var pit = string.IsNullOrEmpty(pitId)
    ? client.OpenPointInTime(new OpenPointInTimeDescriptor()
        .KeepAlive(ServerDetail.TTL)
        .Index(Indices.Index(Name)))
    : null;
var request = new SearchRequest(Name)
{
SearchAfter = string.IsNullOrEmpty(pitId) ? null : new string[] { last },
Size = ServerDetail.PageSize,
PointInTime = new PointInTime(pit == null ? pitId : pit.Id)
};
List<FieldSort> sorts = new List<FieldSort>();
foreach (var item in sortBy)
{
sorts.Add(new FieldSort() { Field = item.Field, Order = item.Direction == SortDirection.Asc ? SortOrder.Ascending : SortOrder.Descending });
}
request.Sort = sorts.ToArray();
Your SearchAfter value should be the sort-field values of the last object of your previous search result.
For the first search pitId is null, so a new one is created; for subsequent requests, pitId is passed.
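For example, a paging loop built on that pattern could look roughly like this (a sketch; BuildRequest is a hypothetical helper wrapping the SearchRequest construction above, and Process stands in for whatever you do with each page):
string pitId = null;
string last = null;

while (true)
{
    // first iteration (pitId == null) opens a new PIT inside BuildRequest, later ones reuse it
    var response = client.Search<Source>(BuildRequest(pitId, last));
    if (!response.Documents.Any())
        break;

    Process(response.Documents);                          // hypothetical consumer of the page

    pitId = response.PointInTimeId;                       // keep searching the same point in time
    last = response.Hits.Last().Sorts.Last().ToString();  // sort value of the last hit feeds SearchAfter
}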
Hope that works for you?
I'm using Microsoft SQL Server Management Studio and Elasticsearch 2.3.4 with elasticsearch-jdbc-2.3.4.1, and I linked ES with my MSSQL server. Everything works fine, but when I make a query using NEST in my MVC program the result is empty. When I put an empty string inside my search attribute I get the elements, but when I try to fill it with some filter I get an empty result. Can someone help me out please? Thanks in advance.
C#:
const string ESServer = "http://localhost:9200";
ConnectionSettings settings = new ConnectionSettings(new Uri(ESServer));
settings.DefaultIndex("tiky");
settings.MapDefaultTypeNames(map => map.Add(typeof(DAL.Faq), "faq"));
ElasticClient client = new ElasticClient(settings);
var response = client.Search<DAL.Faq>(s => s.Query(q => q.Term(x => x.Question, search)));
var result = response.Documents.ToList();
DAL:
Postman:
PS: I followed this guide to create it.
EDIT:
Index mapping:
There's a couple of things that I can see that may help here:
1. By default, NEST camel cases POCO property names when serializing them as part of the query JSON in the request, so x => x.Question will serialize to "question". Looking at your mapping however, field names in Elasticsearch are Pascal cased, so what the client is doing will not match what's in Elasticsearch.
You can change how NEST serializes POCO property names by using .DefaultFieldNameInferrer(Func<string, string>) on ConnectionSettings
const string ESServer = "http://localhost:9200";
ConnectionSettings settings = new ConnectionSettings(new Uri(ESServer))
.DefaultIndex("tiky");
.MapDefaultTypeNames(map => map.Add(typeof(DAL.Faq), "faq"))
// pass POCO property names through verbatim
.DefaultFieldNameInferrer(s => s);
ElasticClient client = new ElasticClient(settings);
2. As Rob mentioned in the comments, a term query does not analyze the query input. So when executing a term query against a field that is analyzed at index time, the query text you pass to the term query needs to take that index-time analysis into account in order to get matches. For example,
- Question is analyzed with the Standard Analyzer
- A Question value of "What's the Question?" will be analyzed and indexed as tokens "what's", "the" and "question"
- A term query would need to have a query input of "what's", "the" or "question" to be a match
A match query, unlike a term query, does analyze the query input, so the output of the search-time analysis will be used to find matches. In conjunction with the Pascal casing fix highlighted in 1., you should now get documents returned.
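For example, swapping the term query from the question for a match query might look like this (a sketch using the same client and search variables):
var response = client.Search<DAL.Faq>(s => s
    .Query(q => q
        .Match(m => m
            .Field(f => f.Question)   // the query text is analyzed with the field's search analyzer
            .Query(search))));

var result = response.Documents.ToList();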
You can also have the best of both worlds in Elasticsearch, i.e. analyze input at index time for full-text search functionality and also index the input without analysis for exact matches. This is done with multi-fields; here is an example of creating a mapping that indexes the Question property both analyzed and not analyzed:
public class Faq
{
public string Question { get; set; }
}
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var defaultIndex = "default-index";
var connectionSettings = new ConnectionSettings(pool)
.DefaultIndex(defaultIndex)
.DefaultFieldNameInferrer(s => s);
var client = new ElasticClient(connectionSettings);
if (client.IndexExists(defaultIndex).Exists)
client.DeleteIndex(defaultIndex);
client.CreateIndex(defaultIndex, c => c
.Mappings(m => m
.Map<Faq>(mm => mm
// let NEST infer mapping from the POCO
.AutoMap()
// override any inferred mappings explicitly
.Properties(p => p
.String(s => s
.Name(n => n.Question)
.Fields(f => f
.String(ss => ss
.Name("raw")
.NotAnalyzed()
)
)
)
)
)
)
);
The mapping for this looks like
{
"mappings": {
"faq": {
"properties": {
"Question": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
The "raw" sub field under the "Question" field will index the value of the Question property without any analysis i.e. verbatim. This sub field can now be used in a term query to find exact matches
client.Search<Faq>(s => s
.Query(q => q
.Term(f => f.Question.Suffix("raw"), "What's the Question?")
)
);
which finds matches for the previous example.
Consider the following piece of code:
var results = collection.Aggregate()
...
.Lookup( ... )
.Project( ??? );
I need to call Project() on the above query. I haven't been able to figure out how to build a projection definition of type ProjectionDefinition<T1, T2>, which is what Project() requires.
The Builders class doesn't seem to work in this case:
var projection = Builders<Event>.Projection.Include(x => x).Include("agg_res.SomeField");
as it instantiates a definition of type ProjectionDefinition<T>.
I found the answer. An aggregation can perform a lookup and projection at the same time by using a different overload of Lookup():
var results = collection.Aggregate()
.Match(filter)
.Lookup<Event, User, AggregatedEvent>(users, e => e.OwnerId, u => u.Id, r => r.OwnerUsers)
.ToList();
This allows one to use lambdas to indicate which fields should be matched and where to place the join results (OwnerUsers in the above example). Note that AggregatedEvent extends Event and includes an array field called OwnerUsers. This is where the matches are placed.
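For reference, the classes assumed above would have roughly this shape (a sketch; the ObjectId-typed keys and the User class body are assumptions):
using System.Collections.Generic;
using MongoDB.Bson;

public class Event
{
    public ObjectId Id { get; set; }
    public ObjectId OwnerId { get; set; }
}

public class User
{
    public ObjectId Id { get; set; }
}

public class AggregatedEvent : Event
{
    // Lookup() writes the joined User documents into this field
    public IEnumerable<User> OwnerUsers { get; set; }
}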
Given the following linq-query:
var query1 = dbContext.MainTable.Where(m => m.MainId == _mainId).SelectMany(sub => sub.SubTable1)
.Select(sub1 => new
{
sub1.CategoryName,
VisibleDivisions = sub1.SubTable2
.Where(sub2 => sub2.Status == "Visible")
.Select(sub2 => new
{
/* select only what needed */
})
});
Starting from my main table, I want to select all sub1's together with all the sub2's related to each sub1.
The query works as expected, generating a single query which will hit the database.
My question is about the inner Where-part, as this filter will be used in several other parts of the application. So I would like to have this "visible" rule defined in a single place (DRY principle).
Since the Where is expecting a Func<SubTable2, bool>, I have written the following property
public static Expression<Func<SubTable2, bool>> VisibleOnlyExpression => sub2 => sub2.Status == "Visible";
and changed my query to
var query1 = dbContext.MainTable.Where(m => m.MainId == _mainId).SelectMany(sub => sub.SubTable1)
.Select(sub1 => new
{
sub1.CategoryName,
VisibleDivisions = sub1.SubTable2
.Where(VisibleOnlyExpression.Compile())
.Select(sub2 => new
{
/* select only what needed */
})
});
This throws an exception stating Internal .NET Framework Data Provider error 1025.
I tried it both with and without the .Compile() call, with the same error.
I know that this is because Entity Framework is trying to translate this into SQL, which it cannot do.
My question is: how can I have my "filter rules" defined in a single place (DRY) in code, but still have them usable in Where-, Select-, ... clauses, both on IQueryable and on ICollection for inner (sub-)queries?
I would love to be able to write something like:
var query = dbContext.MainTable
.Where(IsAwesome)
.SelectMany(s => s.SubTable1.Where(IsAlsoAwesome))
.Select(sub => new
{
Sub1sub2s = sub.SubTable2.Where(IsVisible),
Sub2Mains = sub.MainTable.Where(IsAwesome)
});
where the IsAwesome rule is applied first on IQueryable<MainTable> to get only awesome main entries, and later on ICollection<MainTable> in the sub-select to fetch only the awesome main entries related to a specific SubTable2 entry. But the rule defining a MainTable entry as awesome is the same, no matter where I filter with it.
I guess the solution will involve expression trees and how they can be manipulated so that they remain translatable to plain SQL, but I can't find the right idea or starting point.
You can get something close to what you are asking for using the LinqKit AsExpandable and Invoke extension methods, like this:
var isAwesome = IsAwesome;
var isAlsoAwesome = IsAlsoAwesome;
var isVisible = IsVisible;
var query = dbContext.MainTable
.AsExpandable()
.Where(mt => isAwesome.Invoke(mt))
.SelectMany(s => s.SubTable1.Where(st1 => isAlsoAwesome.Invoke(st1)))
.Select(sub => new
{
Sub1sub2s = sub.SubTable2.Where(st2 => isVisible.Invoke(st2)),
Sub2Mains = sub.MainTable.Where(mt => isAwesome.Invoke(mt))
});
I'm saying close because, first, you need to pull all the needed expressions into local variables, otherwise you'll get the famous EF "Method not supported" exception. And second, the invocation is not as concise as you'd wish. But at least it allows you to reuse the logic.
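For completeness, the rules referenced above are assumed to be Expression<Func<...>> properties in the same style as VisibleOnlyExpression from the question (the member names inside the first two predicates are hypothetical):
public static Expression<Func<MainTable, bool>> IsAwesome =>
    m => m.Rating >= 5;                      // hypothetical "awesome" rule

public static Expression<Func<SubTable1, bool>> IsAlsoAwesome =>
    s => s.Rating >= 5;                      // hypothetical rule

public static Expression<Func<SubTable2, bool>> IsVisible =>
    s => s.Status == "Visible";              // same rule as VisibleOnlyExpression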
AFAIK what you are trying to do should be perfectly possible:
// You forgot to access ".Status" in your code.
// Also you don't have to use "=>" to initialize "IsVisible". Use the regular "=".
public static Expression<Func<SubTable2, bool>> IsVisible = sub2 =>
sub2.Status == "Visible";
...
VisibleDivisions = sub1
.SubTable2
// Don't call "Compile()" on your predicate expression. EF will do that.
.Where(IsVisible)
.Select(sub2 => new
{
/* select only what needed */
})
I would prepare an extension method like the one below:
public static IQueryable<SubTable2> VisibleOnly(this IQueryable<SubTable2> source)
{
return source.Where(s => s.Status == "Visible");
}
And then you can use it this way:
var query = dbContext.Table.VisibleOnly().Select(...)
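If you also need the same rule on already-loaded data (e.g. an ICollection<SubTable2> navigation property), a companion overload can live next to it. Note that this is just a sketch for in-memory filtering; EF will not translate a call to it inside a projection:
public static IEnumerable<SubTable2> VisibleOnly(this IEnumerable<SubTable2> source)
{
    // same rule as above, applied in memory after the entities have been materialized
    return source.Where(s => s.Status == "Visible");
}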
I'm rewriting some of my old NHibernate code to be more database agnostic and use NHibernate queries rather than hard-coded SELECT statements or database views. I'm stuck on one that's incredibly slow after being rewritten. The SQL query is as follows:
SELECT
r.recipeingredientid AS id,
r.ingredientid,
r.recipeid,
r.qty,
r.unit,
i.conversiontype,
i.unitweight,
f.unittype,
f.formamount,
f.formunit
FROM recipeingredients r
INNER JOIN shoppingingredients i USING (ingredientid)
LEFT JOIN ingredientforms f USING (ingredientformid)
So, it's a pretty basic query with a couple JOINs that selects a few columns from each table. This query happens to return about 400,000 rows and has roughly a 5 second execution time. My first attempt to express it as an NHibernate query was as such:
var timer = new System.Diagnostics.Stopwatch();
timer.Start();
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.Fetch(prop => prop.Ingredient).Eager()
.Fetch(prop => prop.IngredientForm).Eager()
.List();
timer.Stop();
This code works and generates the desired SQL, however it takes 120,264ms to run. After that, I loop through recIngs and populate a List<T> collection, which takes under a second. So, something NHibernate is doing is extremely slow! I have a feeling this is simply the overhead of constructing instances of my model classes for each row. However, in my case, I'm only using a couple properties from each table, so maybe I can optimize this.
The first thing I tried was this:
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(r => joinForm.FormDisplayName)
.List<String>();
Here, I just grab a single value from one of my JOIN'ed tables. The SQL code is once again correct and this time it only grabs the FormDisplayName column in the select clause. This call takes 2498ms to run. I think we're on to something!!
However, I of course need to return several different columns, not just one. Here's where things get tricky. My first attempt is an anonymous type:
.Select(r => new { DisplayName = joinForm.FormDisplayName, IngName = joinIng.DisplayName })
Ideally, this should return a collection of anonymous types with both a DisplayName and an IngName property. However, this causes an exception in NHibernate:
Object reference not set to an instance of an object.
Plus, .List() is trying to return a list of RecipeIngredients, not anonymous types. I also tried .List<Object>() to no avail. Hmm. Well, perhaps I can create a new type and return a collection of those:
.Select(r => new TestType(r))
The TestType construction would take a RecipeIngredients object and do whatever. However, when I do this, NHibernate throws the following exception:
An unhandled exception of type 'NHibernate.MappingException' occurred
in NHibernate.dll
Additional information: No persister for: KitchenPC.Modeler.TestType
I guess NHibernate wants to generate a model matching the schema of RecipeIngredients.
How can I do what I'm trying to do? It seems that .Select() can only be used for selecting a list of a single column. Is there a way to use it to select multiple columns?
Perhaps one way would be to create a model with my exact schema, however I think that would end up being just as slow as the original attempt.
Is there any way to return this much data from the server without the massive overhead, without hard coding a SQL string into the program or depending on a VIEW in the database? I'd like to keep my code completely database agnostic. Thanks!
The QueryOver syntax for converting selected columns into an artificial object (a DTO) is a bit different. See 16.6. Projections for more details and a nice example.
A draft of it could look like this. First, the DTO:
public class TestTypeDTO // the DTO
{
public string PropertyStr1 { get; set; }
...
public int PropertyNum1 { get; set; }
...
}
And this is an example of the usage
// DTO marker
TestTypeDTO dto = null;
// the query you need
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
// place for projections
.SelectList(list => list
// this set is an example of string and int
.Select(x => joinForm.FormDisplayName)
.WithAlias(() => dto.PropertyStr1) // this WithAlias is essential
.Select(x => joinIng.Weight) // it will help the below transformer
.WithAlias(() => dto.PropertyNum1)) // with conversion
...
.TransformUsing(Transformers.AliasToBean<TestTypeDTO>())
.List<TestTypeDTO>();
So, I came up with my own solution that's a bit of a mix between Radim's solution (using the AliasToBean transformer with a DTO) and Jake's solution (selecting raw properties and converting each row into an object[] tuple).
My code is as follows:
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(
p => joinIng.IngredientId,
p => p.Recipe.RecipeId,
p => p.Qty,
p => p.Unit,
p => joinIng.ConversionType,
p => joinIng.UnitWeight,
p => joinForm.UnitType,
p => joinForm.FormAmount,
p => joinForm.FormUnit)
.TransformUsing(IngredientGraphTransformer.Create())
.List<IngredientBinding>();
I then implemented a new class called IngredientGraphTransformer which can convert that object[] array into a list of IngredientBinding objects, which is what I was ultimately doing with this list anyway. This is exactly how AliasToBeanTransformer is implemented, only it initializes a DTO based on a list of aliases.
public class IngredientGraphTransformer : IResultTransformer
{
public static IngredientGraphTransformer Create()
{
return new IngredientGraphTransformer();
}
IngredientGraphTransformer()
{
}
public IList TransformList(IList collection)
{
return collection;
}
public object TransformTuple(object[] tuple, string[] aliases)
{
Guid ingId = (Guid)tuple[0];
Guid recipeId = (Guid)tuple[1];
Single? qty = (Single?)tuple[2];
Units usageUnit = (Units)tuple[3];
UnitType convType = (UnitType)tuple[4];
Int32 unitWeight = (int)tuple[5];
Units rawUnit = Unit.GetDefaultUnitType(convType);
// Do a bunch of logic based on the data above
return new IngredientBinding
{
RecipeId = recipeId,
IngredientId = ingId,
Qty = qty,
Unit = rawUnit
};
}
}
Note, this is not as fast as doing a raw SQL query and looping through the results with an IDataReader, however it's much faster than joining in all the various models and building the full set of data.
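For reference, the raw-SQL/IDataReader baseline mentioned above would look roughly like this (a sketch reusing the session's ADO.NET connection and the SELECT from the top of the question; transaction enlistment and error handling omitted):
using (var cmd = session.Connection.CreateCommand())
{
    cmd.CommandText = @"
        SELECT r.recipeingredientid, r.ingredientid, r.recipeid, r.qty, r.unit,
               i.conversiontype, i.unitweight, f.unittype, f.formamount, f.formunit
        FROM recipeingredients r
        INNER JOIN shoppingingredients i USING (ingredientid)
        LEFT JOIN ingredientforms f USING (ingredientformid)";

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            var ingId = reader.GetGuid(1);     // ingredientid
            var recipeId = reader.GetGuid(2);  // recipeid
            // ... build IngredientBinding instances directly from the reader values
        }
    }
}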
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(r => r.column1, r => r.column2)
.List<object[]>();
Would this work?