Aggregation, suggestion in ElasticSearch (Nest, Elasticsearch.net) get complete object - c#

I'm quite new to Elasticsearch and am using NEST to query it. The following is my code snippet.
var searchResults =
elasticClient.Client.Search<T>(
s => s
.Size(20)
.Fields(core)
.QueryString(String.Format("*{0}*", query)).MinScore(1).QueryString(String.Format("*{0}*", query.Replace(" ", "")))
.Highlight(h => h
.PreTags("<b>")
.PostTags("</b>")
.OnFields(f => f
.PreTags("<em>")
.PostTags("</em>")
)
)
);
var suggestResults = elasticClient.Client.Suggest<T>(s => s
.Term("suggest", m => m
.SuggestMode(SuggestMode.Always)
.Text(query)
.Size(10)
.OnField(core)
));
var aggregation = elasticClient.Client.Search<T>(s => s
.Aggregations(a => a
.Terms("term_items", gh=>gh
.Field(p=>p.Town)
.Aggregations(gha=>gha
.SignificantTerms("bucket_agg", m => m
.Field(p => p.Town)
.Size(2)
.Aggregations(ma => ma.Terms("Town", t => t.Field(p => p.Town)))
)
)
)
)
);
The search returns a list of documents (a list of my specified domain object), but the suggest and aggregation calls don't return domain objects.
I apologize in advance, and I hope you can point me in the right direction.
I am looking for a way to implement this in NEST.

To get to your aggregations you need to use the Aggs property of the result. According to the documentation:
The result of the aggregations are accessed from the Aggs property of the response using the key that was specified on the request...
In your example this would be "term_items". You are also doing a sub-aggregation, so these need to be extracted for each top-level aggregation, again using the key specified for the sub-aggregation - "bucket_agg". Your code should look something like
var agg = aggregation.Aggs.Terms("term_items");
if (agg != null && agg.Items.Count > 0)
{
    foreach (var termItemAgg in agg.Items)
    {
        // do something with the aggregations
        // the term is accessed using the Key property
        var term = termItemAgg.Key;
        // the count is accessed through the DocCount property
        var count = termItemAgg.DocCount;
        // now access the result of the sub-aggregation
        var bucketAgg = termItemAgg.Aggs.SignificantTerms("bucket_agg");
        // do something with your sub-aggregation results
    }
}
There is more detail in the documentation.
To get your suggestions, access the Suggestions property of the result object, using the key you specified when calling the ElasticClient.Suggest method. Something like:
var suggestions = suggestResults.Suggestions["suggest"]
    .FirstOrDefault()
    .Options
    .Select(suggest => suggest.Payload)
    .ToList();
Hope this helps.

Related

MongoDB GroupBy Aggregate and Count Documents C#

I need to get the count of documents for each customer status, so I need to use the aggregate function and group by status. I used the following code, but the result contains the list of documents for each group, when all I want is the status and the number of documents under it. Can anybody help me adjust the query to achieve this?
var result = collection.Aggregate()
.Group(
x => x.status,
g => new
{
Result = g.Select(x => new CustomerDetailsList
{
ActiveType = x.status,
Count = g.Count()
}
)
}
);
Thanks in advance
The reason you're getting a list of documents for every key is that you're running this nested Select. All you need is:
collection.Aggregate()
.Group(
x => x.status,
g => new CustomerDetailsList
{
ActiveType = g.Key,
Count = g.Count()
}).ToList();
I am satisfied with @mickl's answer, and it worked well when I tested it against my requirement, but here is the approach I chose in my app because I am more comfortable with it. The method is to use the collection as a queryable:
var result = collection.AsQueryable()
.GroupBy(x => x.status)
.Select(x => new CustomerDetailsList
{
ActiveType = x.Key, Count = x.Count()
}).ToList();
I have used LINQ more in this style, so I chose it because it is easier for me to understand.
You can use either method: this one or the one demonstrated by @mickl.
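The grouping logic in both answers can be tried without a database. The following is a minimal in-memory sketch using LINQ to Objects, not the MongoDB driver itself; the `Customer` record and the sample data are hypothetical stand-ins for the real documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the customer document; only the status matters here.
public record Customer(string Name, string Status);

public static class Program
{
    public static void Main()
    {
        var customers = new List<Customer>
        {
            new("Ann", "Active"),
            new("Bob", "Inactive"),
            new("Cid", "Active"),
        };

        // Group by status and project exactly one (status, count) pair per group.
        var counts = customers
            .GroupBy(c => c.Status)
            .Select(g => new { ActiveType = g.Key, Count = g.Count() })
            .ToList();

        foreach (var row in counts)
        {
            Console.WriteLine($"{row.ActiveType}: {row.Count}");
        }
        // prints "Active: 2" then "Inactive: 1"
    }
}
```

The key point is that the projection is built from `g.Key` and `g.Count()` directly, rather than from a nested `Select` over the group's documents.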

Problem with LINQ query: Select first task from each goal

I'm looking for suggestions on how to write a query. For each Goal, I want to select the first Task (sorted by Task.Sequence), in addition to any tasks with ShowAlways == true. (My actual query is more complex, but this query demonstrates the limitations I'm running into.)
I tried something like this:
var tasks = (from a in DbContext.Areas
from g in a.Goals
from t in g.Tasks
let nextTaskId = g.Tasks.OrderBy(tt => tt.Sequence).Select(tt => tt.Id).DefaultIfEmpty(-1).FirstOrDefault()
where t.ShowAlways || t.Id == nextTaskId
select new CalendarTask
{
// Member assignment
}).ToList();
But this query appears to be too complex.
System.InvalidOperationException: 'Processing of the LINQ expression 'OrderBy<Task, int>(
source: MaterializeCollectionNavigation(Navigation: Goal.Tasks(< Tasks > k__BackingField, DbSet<Task>) Collection ToDependent Task Inverse: Goal, Where<Task>(
source: NavigationExpansionExpression
Source: Where<Task>(
source: DbSet<Task>,
predicate: (t0) => Property<Nullable<int>>((Unhandled parameter: ti0).Outer.Inner, "Id") == Property<Nullable<int>>(t0, "GoalId"))
PendingSelector: (t0) => NavigationTreeExpression
Value: EntityReferenceTask
Expression: t0
,
predicate: (i) => Property<Nullable<int>>(NavigationTreeExpression
Value: EntityReferenceGoal
Expression: (Unhandled parameter: ti0).Outer.Inner, "Id") == Property<Nullable<int>>(i, "GoalId"))),
keySelector: (tt) => tt.Sequence)' by 'NavigationExpandingExpressionVisitor' failed. This may indicate either a bug or a limitation in EF Core. See https://go.microsoft.com/fwlink/?linkid=2101433 for more detailed information.'
The problem is the line let nextTaskId =.... If I comment out that, there is no error. (But I don't get what I'm after.)
I'll readily admit that I don't understand the details of the error message. About the only other way I can think of to approach this is return all the Tasks and then sort and filter them on the client. But my preference is not to retrieve data I don't need.
Can anyone see any other ways to approach this query?
Note: I'm using the very latest version of Visual Studio and .NET.
UPDATE:
I tried a different, but less efficient approach to this query.
var tasks = (DbContext.Areas
.Where(a => a.UserId == UserManager.GetUserId(User) && !a.OnHold)
.SelectMany(a => a.Goals)
.Where(g => !g.OnHold)
.Select(g => g.Tasks.Where(tt => !tt.OnHold && !tt.Completed).OrderBy(tt => tt.Sequence).FirstOrDefault()))
.Union(DbContext.Areas
.Where(a => a.UserId == UserManager.GetUserId(User) && !a.OnHold)
.SelectMany(a => a.Goals)
.Where(g => !g.OnHold)
.Select(g => g.Tasks.Where(tt => !tt.OnHold && !tt.Completed && (tt.DueDate.HasValue || tt.AlwaysShow)).OrderBy(tt => tt.Sequence).FirstOrDefault()))
.Distinct()
.Select(t => new CalendarTask
{
Id = t.Id,
Title = t.Title,
Goal = t.Goal.Title,
CssClass = t.Goal.Area.CssClass,
DueDate = t.DueDate,
Completed = t.Completed
});
But this also produced an error:
System.InvalidOperationException: 'Processing of the LINQ expression 'Where<Task>(
source: MaterializeCollectionNavigation(Navigation: Goal.Tasks (<Tasks>k__BackingField, DbSet<Task>) Collection ToDependent Task Inverse: Goal, Where<Task>(
source: NavigationExpansionExpression
Source: Where<Task>(
source: DbSet<Task>,
predicate: (t) => Property<Nullable<int>>((Unhandled parameter: ti).Inner, "Id") == Property<Nullable<int>>(t, "GoalId"))
PendingSelector: (t) => NavigationTreeExpression
Value: EntityReferenceTask
Expression: t
,
predicate: (i) => Property<Nullable<int>>(NavigationTreeExpression
Value: EntityReferenceGoal
Expression: (Unhandled parameter: ti).Inner, "Id") == Property<Nullable<int>>(i, "GoalId"))),
predicate: (tt) => !(tt.OnHold) && !(tt.Completed))' by 'NavigationExpandingExpressionVisitor' failed. This may indicate either a bug or a limitation in EF Core. See https://go.microsoft.com/fwlink/?linkid=2101433 for more detailed information.'
This is a good example of why a full reproducible example is needed. When trying to reproduce the issue with similar entity models, I was either getting a different error about DefaultIfEmpty(-1) (apparently not supported; don't forget to remove it, as the SQL query will work correctly without it) or no error at all once I removed it.
Then I noticed a small deeply hidden difference in your error messages compared to mine, which led me to the cause of the problem:
MaterializeCollectionNavigation(Navigation: Goal.Tasks (<Tasks>k__BackingField, DbSet<Task>)
specifically the DbSet<Task> at the end (in my case it was ICollection<Task>). I realized that you used DbSet<T> type for collection navigation property rather than the usual ICollection<T>, IEnumerable<T>, List<T> etc., e.g.
public class Goal
{
// ...
public DbSet<Task> Tasks { get; set; }
}
Simply don't do that. DbSet<T> is a special EF Core class, meant to be used only from DbContext to represent a database table, view, or raw SQL query result set. More importantly, DbSets are the only real EF Core query roots, so it's not surprising that such usage confuses the EF Core query translator.
So change it to some of the supported interfaces/classes (for instance, ICollection<Task>) and the original problem will be solved.
Then removing the DefaultIfEmpty(-1) will allow successfully translating the first query in question.
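To illustrate the shape of the fix, here is a minimal in-memory sketch of the query with the navigation property typed as `ICollection<T>`. It runs on LINQ to Objects, so it shows the logic only and says nothing about how EF Core translates it. The `TaskItem` name and the sample data are hypothetical (the class is renamed to avoid clashing with `System.Threading.Tasks.Task`), and a nullable projection stands in for `DefaultIfEmpty(-1)`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities standing in for the question's Task and Goal classes.
public class TaskItem
{
    public int Id { get; set; }
    public int Sequence { get; set; }
    public bool ShowAlways { get; set; }
}

public class Goal
{
    public string Title { get; set; } = "";
    // Collection navigation typed as ICollection<T>, not DbSet<T>.
    public ICollection<TaskItem> Tasks { get; set; } = new List<TaskItem>();
}

public static class Program
{
    public static void Main()
    {
        var goals = new List<Goal>
        {
            new Goal
            {
                Title = "Fitness",
                Tasks = new List<TaskItem>
                {
                    new TaskItem { Id = 1, Sequence = 2 },
                    new TaskItem { Id = 2, Sequence = 1 },
                    new TaskItem { Id = 3, Sequence = 3, ShowAlways = true },
                }
            },
            new Goal { Title = "Empty goal" },
        };

        var selected =
            (from g in goals
             from t in g.Tasks
             // Id of the task with the lowest Sequence; null if the goal has no tasks.
             let nextTaskId = g.Tasks.OrderBy(tt => tt.Sequence)
                                     .Select(tt => (int?)tt.Id)
                                     .FirstOrDefault()
             where t.ShowAlways || t.Id == nextTaskId
             select new { Goal = g.Title, t.Id })
            .ToList();

        foreach (var row in selected)
        {
            Console.WriteLine($"{row.Goal}: task {row.Id}");
        }
        // prints "Fitness: task 2" then "Fitness: task 3"
    }
}
```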
I don't have EF Core up and running, but are you able to split it up like this?
var allTasks = DbContext.Areas
.SelectMany(a => a.Goals)
.SelectMany(a => a.Tasks);
var always = allTasks.Where(t => t.ShowAlways);
var next = allTasks
.OrderBy(tt => tt.Sequence)
.Take(1);
var result = always
.Concat(next)
.Select(t => new
{
// Member assignment
})
.ToList();
Edit: Sorry, I'm not great with query syntax, maybe this does what you need?
var allGoals = DbContext.Areas
.SelectMany(a => a.Goals);
var allTasks = DbContext.Areas
.SelectMany(a => a.Goals)
.SelectMany(a => a.Tasks);
var always = allGoals
.SelectMany(a => a.Tasks)
.Where(t => t.ShowAlways);
var nextTasks = allGoals
.SelectMany(g => g.Tasks.OrderBy(tt => tt.Sequence).Take(1));
var result = always
.Concat(nextTasks)
.Select(t => new
{
// Member assignment
})
.ToList();
I would recommend you start by breaking this query up into individual parts. Try iterating through the Goals in a foreach with your Task logic inside, adding each new CalendarTask to a List that you defined ahead of time.
Overall, breaking this logic up and experimenting a bit will probably give you some insight into the limitations of Entity Framework Core.
I think we could split the query into two steps. First, query each goal for its task with the lowest Sequence and store the result (perhaps in an anonymous type like {NextTaskId, Goal}). Then query that intermediate data to get the final result. For example:
Areas.SelectMany(x => x.Goals)
    .Select(g => new {
        // the ?. operator is not allowed in expression trees, so project to a nullable Id instead
        NextTaskId = g.Tasks.OrderBy(t => t.Sequence).Select(t => (int?)t.Id).FirstOrDefault(),
        Tasks = g.Tasks.Where(t => t.ShowAlways)
    })
    .SelectMany(a => a.Tasks, (a, task) => new {
        NextTaskId = a.NextTaskId,
        Task = task
    });
I tried to write the LINQ query, but I'm not sure about the result:
var tasks = ( from a in DbContext.Areas
from g in a.Goals
from t in g.Tasks
join oneTask in (from t in DbContext.Tasks
group t by t.Id into gt
select new {
Id = gt.Key,
Sequence = gt.Min(t => t.Sequence)
}) on new { t.Id, t.Sequence } equals new { oneTask.Id,oneTask.Sequence }
select new {Area = a, Goal = g, Task = t})
.Union(
from a in DbContext.Areas
from g in a.Goals
from t in g.Tasks
where t.ShowAlways
select new {Area = a, Goal = g, Task = t});
I don't currently have EF Core available, but do you really need to compare this much?
Wouldn't querying the tasks directly be sufficient?
If a navigation property or foreign key is defined, I could imagine using something like this:
Tasks.Where(task => task.Sequence == Tasks.Where(t => t.GoalIdentity == task.GoalIdentity).Min(t => t.Sequence) || task.ShowAlways);

Filter dataset in repository design pattern C#

I have the query below, which works perfectly:
SELECT
MAX(WarehouseId) AS WareHouseId,
MAX(CompanyId) AS CompanyId,
MAX(ProductId) AS ProductId,
SUM(AvailableQuantity) AS AvailableQuantity,
PurchaseItemPrice
FROM
PurchaseItem
WHERE
CompanyId = 1
GROUP BY
PurchaseItemPrice
ORDER BY
MAX(ProductId) ASC
I need to convert this into the format shown below instead of LINQ. I don't know what this format is called, so please tell me its name as well. If this is not a good approach for getting the data, please suggest something better; I'm new to the repository design pattern.
unitOfWork.PurchaseItemRepository.DataSet
.Where( x => x.CompanyId == id )
.ToList()
.GroupBy( x => x.PurchaseItemPrice )
.Select( x =>
x.Max( y => new
{
y.WarehouseId,
y.CompanyId,
y.ProductId,
y.AvailableQuantity
} )
);
public IRepository<PurchaseItem> PurchaseItemRepository
{
get
{
if (_PurchaseItemRepository == null)
{
dbContext.Configuration.ProxyCreationEnabled = false;
_PurchaseItemRepository = new Repository<PurchaseItem>(dbContext);
}
return _PurchaseItemRepository;
}
}
And ProductItem is an Entity
Moreover, when I execute the above code, it throws the error below.
An exception of type 'System.ArgumentException' occurred in mscorlib.dll but was not handled in user code. Additional information: At least one object must implement IComparable.
You need a way to determine the max. Max is being applied to a whole anonymous object, so how is the runtime supposed to know which property to compare?
You probably want to take the Max of each of those properties individually, i.e.:
unitOfWork.PurchaseItemRepository.DataSet
    .Where(x => x.CompanyId == id).ToList()
    .GroupBy(x => x.PurchaseItemPrice)
    .Select(x => new {
        WarehouseId = x.Max(v => v.WarehouseId),
        CompanyId = x.Max(v => v.CompanyId),
        ProductId = x.Max(v => v.ProductId),
        // do the rest here
    }).OrderBy(output => output.ProductId)
    .ToList();
As a side note, you probably shouldn't be calling ToList() before the GroupBy, because it materializes every matching row in memory. Work with IQueryable instead so the grouping and aggregation run in the database.
(BTW, you do not need to use the Repository (anti-)pattern with Linq and Entity Framework because the DbContext is the repository. Note that creating classes to hold predefined Linq queries is not the same thing as a Repository. I'm also not convinced that the Unit-of-Work Pattern is of much use with Entity Framework either as a short-lived DbContext in a using block with a Transaction also fulfils that role)
var list = dbContext.PurchaseItems
.Where( x => x.CompanyId == 1 )
.GroupBy( x => x.PurchaseItemPrice )
.Select( grp => new {
PurchaseItemPrice = grp.Key,
WarehouseId = grp.Max( x => x.WarehouseId ),
CompanyId = grp.Max( x => x.CompanyId ),
ProductId = grp.Max( x => x.ProductId ),
AvailableQuantity = grp.Sum( x => x.AvailableQuantity ),
} )
.OrderBy( output => output.ProductId )
.ToList();
list will need to use inferred type (var) because of the use of an anonymous-type in the .Select call. If you change it to a custom record-type then you can use List<MyOutputRecordType> instead.
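To check the grouping logic independently of Entity Framework, here is a minimal in-memory sketch using LINQ to Objects. The `PurchaseItem` record and the sample rows are hypothetical; only the property names come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the PurchaseItem entity.
public record PurchaseItem(int WarehouseId, int CompanyId, int ProductId,
                           int AvailableQuantity, decimal PurchaseItemPrice);

public static class Program
{
    public static void Main()
    {
        var items = new List<PurchaseItem>
        {
            new(1, 1, 10, 5, 9.99m),
            new(2, 1, 11, 7, 9.99m),
            new(1, 1, 12, 3, 4.50m),
            new(3, 2, 13, 8, 4.50m), // different company, filtered out below
        };

        // Mirrors the SQL: WHERE, GROUP BY price, MAX/SUM per group, ORDER BY MAX(ProductId).
        var rows = items
            .Where(x => x.CompanyId == 1)
            .GroupBy(x => x.PurchaseItemPrice)
            .Select(grp => new
            {
                PurchaseItemPrice = grp.Key,
                WarehouseId = grp.Max(x => x.WarehouseId),
                ProductId = grp.Max(x => x.ProductId),
                AvailableQuantity = grp.Sum(x => x.AvailableQuantity),
            })
            .OrderBy(r => r.ProductId)
            .ToList();

        foreach (var r in rows)
        {
            Console.WriteLine(
                $"{r.PurchaseItemPrice}: warehouse {r.WarehouseId}, product {r.ProductId}, qty {r.AvailableQuantity}");
        }
    }
}
```

With this sample data, the 9.99 group aggregates to warehouse 2, product 11 and quantity 12, and it sorts ahead of the 4.50 group (warehouse 1, product 12, quantity 3).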

How to remove duplicates from SQLite DB - using Entity and LINQ

I have a DB that contains a few fields. I want to remove duplicates based on one field ("full"): if there is more than one row with the same value, I should keep any one of them (for example the first) and discard the rest.
So far I can't; everything throws an error of some kind.
This is one of my tries. Unfortunately, the last Select in distinctList throws an error.
using (var context = new JITBModel())
{
var allList = context.BackupEvents.Select(i => i.Id).ToList();
var distinctList = context.BackupEvents
.GroupBy(x => x.Full)
.Select(i => i.ToList())
.Where(c => c.Count > 1)
.Select(t => t[0].Id).ToList();
var dups = allList.Except(distinctList);
context.BackupEvents.RemoveRange(from e in context.BackupEvents
where dups.Contains(e.Id)
select e);
context.SaveChanges();
}
Also, can't seem to choose .First() within a select query.
UPDATE: for now I implemented a simple ExecuteSqlCommand based on the answer here.
string com = @"DELETE FROM BackupEvents
WHERE rowid NOT IN (
SELECT MIN(rowid)
FROM BackupEvents
GROUP BY full)";
context.Database.ExecuteSqlCommand(com);
If anyone knows how to do it with entity/linq - let me know :-)
Instead of t => t[0].Id, try t.FirstOrDefault().Id.
Maybe the code below would work? I didn't run it, but I'm not getting any compile-time error using something similar.
using (var context = new JITBModel())
{
var duplicates= context.BackupEvents
.GroupBy(x => x.Full)
.Where(grp => grp.Count() > 1)
.Select(grp=>grp.FirstOrDefault());
context.BackupEvents.RemoveRange(duplicates);
context.SaveChanges();
}
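Note that selecting `grp.FirstOrDefault()` picks one row from each duplicated group. To keep one row per value and delete the rest, the rows to remove are all but the first in each group. The following is a minimal in-memory sketch of that selection using LINQ to Objects; the `BackupEvent` record and the sample data are hypothetical, and whether the same expression translates to SQL depends on the Entity Framework version and provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the BackupEvents entity.
public record BackupEvent(int Id, string Full);

public static class Program
{
    public static void Main()
    {
        var events = new List<BackupEvent>
        {
            new(1, "a"), new(2, "b"), new(3, "a"), new(4, "a"), new(5, "c"),
        };

        // For each value of Full, keep the row with the lowest Id and
        // collect every other row in that group for deletion.
        var toRemove = events
            .GroupBy(e => e.Full)
            .SelectMany(g => g.OrderBy(e => e.Id).Skip(1))
            .ToList();

        Console.WriteLine(string.Join(", ", toRemove.Select(e => e.Id)));
        // prints "3, 4"
    }
}
```

This matches the behaviour of the raw SQL in the question, which keeps the row with the minimum rowid in each group.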

NHibernate - selecting data from referenced objects (traversing many-to-one relationship) in a query

I want to build a query that selects some columns from a joined table (a many-to-one relationship in my data model).
var q = ses.QueryOver<Task>().Select(x => x.Id, x => x.CreatedDate, x => x.AssigneeGroup.Name, x => x.AssigneeGroup.IsProcessGroup);
Here I'm retrieving properties from AssigneeGroup, which is a reference to another table specified in my mapping. But when I try to run this query I get:
Exception: could not resolve property: AssigneeGroup.Name of: Task
So it looks like NHibernate is not able to follow relations defined in my mapping and doesn't know that in order to resolve AssigneeGroup.Name we should do a join from 'Task' to 'Groups' table and retrieve Group.Name column.
So, my question is, how to build such queries? I have this expression: x => x.AssigneeGroup.Name, how to convert it to proper Criteria, Projections and Aliases? Or is there a way to do this automatically? It should be possible because NHibernate has all the information...
Your query needs to join the association, and should look like this:
// firstly we need to get an alias for "AssigneeGroup", to be used later
AssigneeGroup assigneeGroup = null;
var q = ses
.QueryOver<Task>()
// now we will join the alias
.JoinAlias(x => x.AssigneeGroup, () => assigneeGroup)
.Select(x => x.Id
, x => x.CreatedDate
// instead of these
// , x => x.AssigneeGroup.Name
// , x => x.AssigneeGroup.IsProcessGroup
// use alias for SELECT/projection (i.e. ignore "x", use assigneeGroup)
, x => assigneeGroup.Name
, x => assigneeGroup.IsProcessGroup
);
Further reading:
NHibernate - CreateCriteria vs CreateAlias, to get more understanding when to use JoinAlias (CreateAlias) and when JoinQueryOver (CreateCriteria)
Criteria API for: 15.4. Associations
QueryOver API for 16.4. Associations
You have to join the two tables if you wish to select columns from something other than the root table/entity (Task in our case).
Here is an example:
IQueryOver<Cat,Kitten> catQuery =
session.QueryOver<Cat>()
.JoinQueryOver<Kitten>(c => c.Kittens)
.Where(k => k.Name == "Tiddles");
or
Cat catAlias = null;
Kitten kittenAlias = null;
IQueryOver<Cat,Cat> catQuery =
session.QueryOver<Cat>(() => catAlias)
.JoinAlias(() => catAlias.Kittens, () => kittenAlias)
.Where(() => catAlias.Age > 5)
.And(() => kittenAlias.Name == "Tiddles");
Alternatively you could use the nhibernate linq provider (nh > 3.0):
var q = ses.Query<Task>()
.Select(x => new
{
Id = x.Id,
CreatedDate = x.CreatedDate,
Name = x.AssigneeGroup.Name,
IsProcessGroup = x.AssigneeGroup.IsProcessGroup
});
