Perform includes on key object, using LINQ, in a GroupBy situation

I have a relatively simple, yet somehow weirdly complicated case whereby I need to perform includes on a lengthy object graph, when I'm doing a group-by.
Here is roughly what my LINQ looks like:
var result = DbContext.ParentTable
    .Where(p => [...some criteria...])
    .GroupBy(p => p.Child)
    // n is the group: n.Key is the child, the elements are the parent rows
    .Select(n => new
    {
        ChildObject = n.Key,
        AllTheThings = n.Sum(p => p.SomeNumericColumn),
        LatestAndGreatest = n.Max(p => p.SomeDateColumn)
    })
    // order by the aggregate computed above
    .OrderByDescending(o => o.AllTheThings)
    .Take(100)
    .ToHashSet();
That gives me a set of anonymous objects, just the way I want it, with each child object neatly associated with some aggregate stats about it. Fine. But I also need a fair share of the object graph associated with the child object.
This gets messier than it otherwise might because I want to reuse existing code that performs the includes. That is, I have a static method that takes an IQueryable of my child object and, based on parameters, gives me back another IQueryable with all the includes I need (there are rather a lot of them).
I can't figure out the correct way to take my child object as a queryable and pass it to my include method, so that I get it back expanded at the point where I assign it to the new anonymous object (where I'm saying ChildObject = n.Key).
Sorry if this is something of a duplicate. I did search around and found solutions that were close to what I want, but not quite.
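One possible way around this, offered as a sketch rather than a confirmed answer: group by the child's key instead of the child entity, then run the existing include helper over a second query restricted to the resulting keys, and stitch the two together in memory. Everything named here is an assumption for illustration: the ChildId foreign key on the parent, the ChildStats result type, and WithIncludes, which stands in for the static include method described above. The sketch runs against in-memory lists via AsQueryable() so it needs no database; behaviour against a real DbContext may differ by EF version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Child
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Parent
{
    public int ChildId { get; set; }              // assumed foreign key to Child
    public decimal SomeNumericColumn { get; set; }
    public DateTime SomeDateColumn { get; set; }
}

public class ChildStats                            // hypothetical result type
{
    public Child ChildObject { get; set; }
    public decimal AllTheThings { get; set; }
    public DateTime LatestAndGreatest { get; set; }
}

public static class GroupThenInclude
{
    // Hypothetical stand-in for the existing static include helper.
    // A real version would chain .Include(...) calls on the query.
    public static IQueryable<Child> WithIncludes(IQueryable<Child> source) => source;

    public static List<ChildStats> TopChildren(
        IQueryable<Parent> parents, IQueryable<Child> children, int take)
    {
        // Step 1: aggregate by the child's key only, so no entity is needed yet.
        var stats = parents
            .GroupBy(p => p.ChildId)
            .Select(n => new
            {
                ChildId = n.Key,
                AllTheThings = n.Sum(p => p.SomeNumericColumn),
                LatestAndGreatest = n.Max(p => p.SomeDateColumn)
            })
            .OrderByDescending(s => s.AllTheThings)
            .Take(take)
            .ToList();

        var ids = stats.Select(s => s.ChildId).ToList();

        // Step 2: load just those children, passing the queryable to the helper.
        var loaded = WithIncludes(children.Where(c => ids.Contains(c.Id)))
            .ToDictionary(c => c.Id);

        // Step 3: combine aggregates and loaded children in memory, keeping the order.
        return stats
            .Select(s => new ChildStats
            {
                ChildObject = loaded[s.ChildId],
                AllTheThings = s.AllTheThings,
                LatestAndGreatest = s.LatestAndGreatest
            })
            .ToList();
    }
}
```

The trade-off is two round trips instead of one, in exchange for reusing the include helper unchanged.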

Related

How can I use a lambda function in the FindAsync() method?

I want to find objects that match a certain requirement. In this case, I want the className attribute of the Course class to be equal to "course". The statement I wanted to use was:
Course matchingCourse = collection.FindAsync(x => x.className == "course");
The className attribute is not the primary key, so I cannot simply put the key in.
From this I get the error "Cannot convert lambda expression to type 'object' because it is not a delegate type". I have looked around and I cannot figure out how to solve the issue in this context.
How can I use a lambda to do what I'm trying to do in FindAsync(), or is there another way?

Find only works on primary keys. Instead, you can use a Where followed by a FirstOrDefault or SingleOrDefault, or just pass the predicate to those methods, unless you need the cache that Find gives you. Unlike Find, this will call the database each time.
For example:
Course matchingCourse = collection.FirstOrDefault(x => x.className == "course");
For Async:
Course matchingCourse = await collection.FirstOrDefaultAsync(x => x.className == "course");

The main difference between Find and a regular query (Where, FirstOrDefault and so on) is that Find first checks whether the object you want has already been fetched since you created the DbContext. A query will always fetch the data from the database.
In fact, Find is meant for the case where you fetched an object, lost the reference to it, and want that same fetched instance again rather than re-reading it from the database.
Usually it is not a good idea to keep the DbContext alive for a long time, so chances are very small that you lose the reference to the fetched object.
Therefore my advice would be to use a query to fetch the item (leaving out the async, which is not part of your problem):
var result = collection
    .Where(collectionElement => collectionElement.className == "course")
    .FirstOrDefault();
If you really need to get the earlier fetched item, you can use DbContext.ChangeTracker.Entries<...>() and do a Where(...) on the already fetched objects

Manually assign existing object to Entity Framework navigation property

Is it possible to manually assign an existing object to an Entity Framework (db first) object's navigation property?
The context to the question is that I have a problem in trying to bring back a (heavily filtered) list of objects with all the children and descendants attached so that the full graph is available in memory after the context is disposed.
I tried doing this via .Include() statements using something like this:
using (var ctx = new MyEntities())
{
    myParents = ctx.Parents
        .Where(p => MyFilter(p))
        .Include(p => p.Children)
        .Include(p => p.Children.Select(c => c.Grandchildren))
        .Include(p => p.Children.Select(c => c.Grandchildren.Select(g => g.GreatGrandChildren)));
}
but the generated query runs too slowly because of the known performance problems with using nested include statements (as explained in lots of places including this blog).
I can pull back the parents, children and Grandchildren without performance issues - I only hit the troubles when I include the very last .Include() statement for the greatgrandchildren.
I can easily get the GreatGrandChildren objects back from the database with a second separate query by building a list of GrandChildrenIds from the GrandChildren already retrieved and doing something like:
greatGrandKids = ctx.GreatGrandChildren.Where(g=>ids.Contains(g.GrandChildId)).ToList();
but now, once I dispose of the context, I cannot do something like grandChildA.GreatGrandChildren without hitting the object context disposed exception.
I could have as many as a few thousand GrandChildren objects so I really want to avoid a round trip to the database to fetch the GreatGrandChildren for each one which rules out simply using .Load() on each GrandChild object, right?
I could feasibly work around this by either just looking up the required greatgrandchildren from greatGrandKids each time I needed them in my subsequent code or even by adding a new (non-mapped) Property such as .GreatGrandChildrenLocal to the GrandChild class and assigning them all up front but these both feel very kludgy & ugly. I'd MUCH prefer to find a way to just be able to access the existing .GreatGrandChildren navigation property on each GrandChild object.
Trying the obvious of assigning to the navigation property with something like this:
grandChild.GreatGrandChildren = greatGrandKids
    .Where(g => g.GrandChildId == grandChild.Id)
    .ToList();
fails too when I then try to access grandChild.GreatGrandChildren (still giving the object disposed exception).
So my question is:
Is there a way I can assign the existing GreatGrandChildren objects I have already retrieved from the database to the .GreatGrandChildren navigation property on the GrandChild object in such a way as to make them available (only needed for read operations) after the context is disposed?
(Or indeed is there a different solution to the problem?)

If you disable proxy creation with:
ctx.Configuration.ProxyCreationEnabled = false;
then reading and writing from/to the navigation property works exactly as expected without trying to lazily load the entities and throwing the object disposed exception.
So we have something like:
using (var ctx = new MyEntities())
{
    // Disable proxies so navigation properties are plain collections (no lazy loading)
    ctx.Configuration.ProxyCreationEnabled = false;

    // Skip the final GreatGrandChildren include statement
    myParents = ctx.Parents
        .Where(p => MyFilter(p))
        .Include(p => p.Children)
        .Include(p => p.Children.Select(c => c.Grandchildren))
        .ToList(); // materialize while the context is still alive

    // Get the associated grandchildren and their ids
    var grandKids = myParents.SelectMany(p => p.Children)
        .SelectMany(c => c.Grandchildren)
        .ToList();
    var ids = grandKids.Select(g => g.Id).ToList();

    // Get the great-grandkids in one separate query
    var greatGrandKids = ctx.GreatGrandChildren
        .Where(g => ids.Contains(g.GrandChildId))
        .ToList();

    // Assign the great-grandchildren to the grandchildren
    foreach (var grandChild in grandKids)
    {
        grandChild.GreatGrandChildren = greatGrandKids
            .Where(g => g.GrandChildId == grandChild.Id)
            .ToList();
    }
}
and now we can access the .GreatGrandChildren property outside the context without hitting the context disposed exception. Whilst this still feels a little messy, it works out MUCH cheaper than either using the original Include() statement or calling .Load() on each GrandChild.
N.B. As these objects are only used in read operations and I don't need Lazy Loading then there are no negative implications to turning off proxy creation in my circumstances. If write operations and/or lazy loading were also necessary then we would also need to consider the implications of turning this off for the given EF context.
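With a few thousand grandchildren, the Where(...) inside the loop rescans the whole great-grandchild list for every grandchild. A possible refinement, sketched here with plain classes rather than real EF entities, is to group the great-grandchildren once with ToLookup and then assign each group. The class shapes and the NavigationFixup name are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the entity classes (assumed shapes).
public class GreatGrandChild
{
    public int Id { get; set; }
    public int GrandChildId { get; set; }
}

public class GrandChild
{
    public int Id { get; set; }
    public List<GreatGrandChild> GreatGrandChildren { get; set; } = new List<GreatGrandChild>();
}

public static class NavigationFixup
{
    // Groups the great-grandchildren by parent id once, then assigns each group.
    public static void Attach(
        IEnumerable<GrandChild> grandKids,
        IEnumerable<GreatGrandChild> greatGrandKids)
    {
        var byParent = greatGrandKids.ToLookup(g => g.GrandChildId);

        foreach (var grandChild in grandKids)
        {
            // A lookup returns an empty sequence for a missing key,
            // so grandchildren without descendants get an empty list.
            grandChild.GreatGrandChildren = byParent[grandChild.Id].ToList();
        }
    }
}
```

Calling NavigationFixup.Attach(grandKids, greatGrandKids) would replace the foreach loop; the result is the same, but each great-grandchild is visited only once.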

Linq to DocumentDb, where clause on child

In a project I'm currently working on, we have come to realise that we should not use DocumentDb collections as if they were the equivalent of a table in, for example, SQL Server. As a result, we are now persisting all of the entities belonging to a single tenant in a single collection.
We already have lots of LINQ queries in our codebase which assume that each document type (aggregate root) is persisted in a dedicated collection. In an attempt to make the transition painless, I set out to refactor our data access object so that its API continues to reason about aggregate roots and deals with the single collection vs. dedicated collections in its implementation.
My approach is to wrap an aggregate root in a Resource<T> object, which derives from Resource and exposes a Model property as well as a Type property. I thought I would then be able to expose an IQueryable<T> to consuming code based on the following code:
return _client.CreateDocumentQuery<Resource<TModel>>(_collection.DocumentsLink)
    .Where(x => x.Type == typeof(TModel).Name)
    .Select(x => x.Model);
Initial testing showed that this worked as planned and I confidently committed my changes. When doing functional testing, however, we found that some queried models had all of their properties set to their default values (i.e. null, 0, false, etc.).
I can reproduce the problem with the following code:
var wrong = _client.CreateDocumentQuery<Resource<TModel>>(_collection.DocumentsLink)
    .Where(x => x.Type == typeof(TModel).Name)
    .Select(x => x.Model)
    .Where(x => !x.IsDeleted)
    .ToArray();
var correct = _client.CreateDocumentQuery<Resource<TModel>>(_collection.DocumentsLink)
    .Where(x => x.Type == typeof(TModel).Name)
    .Where(x => !x.Model.IsDeleted)
    .Select(x => x.Model)
    .ToArray();
The results of the above queries are not the same!
Both queries return the same number of TModel instances.
Only the instances returned by the second example have their properties populated.
In order for my refactoring to be successful, I need wrong to be ... right :) Falling back to SQL is not an option, as we value the type safety of LINQ. Changing our approach to expose the Resource<T> objects would touch lots of code, as it requires all *.Property references to be substituted by *.Model.Property references.
It seems to be an issue with the LINQ provider that is part of the DocumentDb client.
We use Microsoft.Azure.DocumentDb version 1.4.1.
Edit 2015-09-24
The generated SQL queries are:
correct: {"query":"SELECT VALUE root.Model FROM root WHERE ((root.Type = \"DocumentType\") AND (NOT root.Model.IsDeleted)) "}
wrong: {"query":"SELECT * FROM root WHERE ((root.Type = \"DocumentType\") AND (NOT root.Model.IsDeleted)) "}
Also, this issue has been reported (and picked up) on GitHub here: https://github.com/Azure/azure-documentdb-net/issues/58
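The generated SQL above hints at why the models come back empty: the wrong query selects the whole document (SELECT *) instead of root.Model, so the wrapper document is presumably what gets deserialized into TModel. The sketch below reproduces that effect outside DocumentDb. It uses System.Text.Json purely for illustration (an assumption; the DocumentDb SDK uses its own serializer), and the Customer class, the sample document and the ProjectionDemo name are all hypothetical.

```csharp
using System;
using System.Text.Json;

public class Customer                 // hypothetical model type
{
    public string Name { get; set; }
    public bool IsDeleted { get; set; }
}

public class Resource<T>              // wrapper shaped like the one in the question
{
    public string Type { get; set; }
    public T Model { get; set; }
}

public static class ProjectionDemo
{
    // What a stored wrapper document might look like.
    public const string Document =
        "{\"Type\":\"Customer\",\"Model\":{\"Name\":\"abc\",\"IsDeleted\":false}}";

    // Mimics SELECT *: the whole wrapper is read as if it were the model.
    // No property names match, so every property keeps its default value.
    public static Customer ReadWholeDocumentAsModel() =>
        JsonSerializer.Deserialize<Customer>(Document);

    // Mimics SELECT VALUE root.Model: only the nested model is returned.
    public static Customer ReadProjectedModel() =>
        JsonSerializer.Deserialize<Resource<Customer>>(Document).Model;
}
```

In this sketch ReadWholeDocumentAsModel() yields a Customer whose Name is null, while ReadProjectedModel() yields one with Name set, which matches the symptom described: the same number of instances, but with default values.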

This has been confirmed as a problem with the SDK. A fix has been checked in and will ship with the next SDK drop.
In the interim you can use SQL, or change where you place the Where clauses.

EntityFramework 5 filter an included navigation property

I would like to find a way using Linq to filter a navigation property to a subset of related entities. I know all answers around this subject suggest doing an anonymous selector such as:
query.Where(x => x.Users.Any(y => y.ID == actingUser.ID))
    .Select(x => new
    {
        Event = x,
        Discussions = x.Discussions.Where(actingUser.GenerateSecurityFilterFor<Domain.Discussion>())
    })
    .OrderBy(x => x.Discussions.Count())
    .ThenBy(x => x.Event.Name);
However, this is far from ideal due to the general nature of our query generation, and it also yields horrific SQL queries if you fire up a profiler.
I would like to be able to accomplish something like:
query.Include(x => x.Discussions.Where(actingUser.GenerateSecurityFilterFor<Domain.Discussion>()))
    .OrderBy(x => x.Discussions.Count())
    .ThenBy(x => x.Name);
I realize that this is not supported in EF5 (or any version for that matter) but there has to be a way to accomplish constraining the result set through Linq without delving into anonymous type select statements.
I have attempted doing something to the tune of:
query.GroupJoin(discQuery,
    x => x.ID,
    x => x.Event.ID,
    (evt, disc) => evt.Discussions = disc.Where(actingUser.GenerateSecurityFilterFor<Domain.Discussion>())).ToList();
However, you cannot have an assignment inside a lambda expression, and selecting an anonymous type here causes the same dilemma as it does with the select.
I guess I cannot comprehend why EF does not provide a way (that I can find) to generate:
SELECT
    --Properties
FROM Event e
LEFT OUTER JOIN Discussions d
    ON e.ID = d.EventID AND --Additional constraints
WHERE
    --Where conditions
ORDER BY
    --Order Conditions
It is so simple to constrain the join in SQL that there HAS to be a way to do it through Linq as well.
PS: I have searched stack, MSDN, experts-exchange, etc. Please realize this is not a duplicate. Anything even touching on this subject either has a cop-out "It can't be done" answer or no answer at all. Nothing is impossible... including this.

"Anything even touching on this subject either has a cop-out 'It can't be done' answer or no answer at all. Nothing is impossible... including this."
Sure. It is possible. You can download the EF source code and add this feature yourself. It would be a great contribution to an open source project and the community. I believe the EF team will gladly help you with your effort.
With the current version, "it can't be done" is the answer. You can either use projection to an anonymous or special unmapped type, as you described at the beginning of your question, or use a separate explicit query to load the related entities, either for a single parent or for all parents.
Load relations for single parent:
// "event" is a C# keyword, so the variable needs the @ prefix
context.Entry(@event)
    .Collection(e => e.Discussions)
    .Query()
    .Where(d => ...)
    .Load();
Load relations for all parents (requires lazy loading to be turned off):
// load all parents
var events = query.Where(e => ...).ToList();
// load children filtered by the same condition as the parents plus a new condition for the children
childQuery.Where(d => d.Event ... && d.Something ...).Load();
The second solution requires the child to have a navigation property back to the parent (for constructing the same query condition used initially to load the parents). If you have everything correctly configured and the entities are attached, EF should automatically fix up your relations (collections) in the parent entities. It will not mark the collection in a dynamic proxy as loaded, however, which is why you cannot use this together with lazy loading.

Fast queryable collection of objects

I am looking for a library that would accept a collection of objects and return an indexed data structure that would be optimised for fast querying.
This is probably better illustrated by an example:
public class MyClass
{
    public string Name { get; set; }
    public double Number { get; set; }
    // ... (many more fields)
}
var dataStore = Indexer.Parse(myClassCollection)
    .Index(x => x.Name)
    .Index(x => x.Number)
    .Index(x => x.SomeOtherProperty);
var queryResult = dataStore
    .Where(x => x.Name == "ABC")
    .Where(x => x.Number == 23)
    .Where(x => x.SomeOtherProperty == dateTimeValue);
The idea is that the query on the dataStore will be very fast, of the order of O(log n).
Using dictionaries of dictionaries starts getting complicated when you have more than 2 or 3 fields you want to index.
Is there a library that already exists that does something like this?

What about an object-oriented database?
Sterling is a recommended option. It supports LINQ to Objects, so you don't need to worry about queries, and we have used it for a couple of medium-sized projects with good results (it's pretty fast).

You should take a look at RaptorDB as well. Several versions, including a fully embedded version, can be found on CodeProject.

You could use Lucene.NET which can also run fully in memory (though I'm not sure that's what you'd want). It supports lightning fast retrieval of documents based on field criteria.
So that actually gives you a document database. If you take that one step further, you end up with something like RavenDB (commercial).

I am wondering whether we could achieve this by creating a SortedDictionary for each of the indexed properties.
SortedDictionary<property, List<MyClass>>
We would then parse the Linq expression tree to find out which properties are being queried. We can retrieve the valid keys of the sorted dictionaries, loop through these keys to get a List from each sorted dictionary, and then use set operations such as Union() and Intersect() depending on whether the expression tree has OR or AND directives.
Then return a List matching the search criteria.
If the query includes a property that is not indexed, execute the query with indexed properties first and then use normal Linq to finish it off.
The interesting bit then becomes parsing the expression tree.
Any thoughts on this approach?
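To make the idea concrete, here is a minimal sketch of the index part only, assuming equality lookups and an AND between two indexed properties. The expression-tree parsing is deliberately left out, and PropertyIndex and IndexedQuery are made-up names for illustration, not an existing library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string Name { get; set; }
    public double Number { get; set; }
}

// One sorted index per property: key -> all items having that key.
public class PropertyIndex<T, TKey>
{
    private readonly SortedDictionary<TKey, List<T>> _buckets =
        new SortedDictionary<TKey, List<T>>();

    public PropertyIndex(IEnumerable<T> items, Func<T, TKey> keySelector)
    {
        foreach (var item in items)
        {
            var key = keySelector(item);
            if (!_buckets.TryGetValue(key, out var bucket))
            {
                bucket = new List<T>();
                _buckets[key] = bucket;
            }
            bucket.Add(item);
        }
    }

    // O(log n) lookup of the bucket for a key; empty if the key is absent.
    public IReadOnlyList<T> Lookup(TKey key) =>
        _buckets.TryGetValue(key, out var bucket)
            ? bucket
            : (IReadOnlyList<T>)Array.Empty<T>();
}

public static class IndexedQuery
{
    // AND of two equality conditions: intersect the two buckets.
    // An OR would use Union() instead of Intersect().
    public static List<MyClass> WhereNameAndNumber(
        PropertyIndex<MyClass, string> byName,
        PropertyIndex<MyClass, double> byNumber,
        string name,
        double number) =>
        byName.Lookup(name).Intersect(byNumber.Lookup(number)).ToList();
}
```

Intersect() compares object references here, which is what is wanted since both indexes hold the same instances. Range queries could build on the sorted keys, but that is beyond this sketch.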
