Embedded RavenDb Query on Index - c#

I am playing around with RavenDb. I have a very simple class that contains a collection, and I am trying to return all the objects whose contained collection has more than one record, but I cannot seem to make it work.
Note: I am using an in-memory embedded document store in LinqPad, reading some data from an RDBMS and inserting it into the in-memory store (this works: if I just run Query<Agency>().Take(100) I see my records).
Any ideas?
below image just to show that the db does contain my data...

ok, I have figured it out, can't say I fully understand it...but...
PopulateRavenInMemory();
DatabaseCommands.PutIndex("MultipleAddresses",
    new IndexDefinitionBuilder<Agency>
    {
        Map = agencies => from a in agencies
                          where a.Addresses.Count() > 1
                          select new {}
    });
Query<Agency>("MultipleAddresses").Customize(x => x.WaitForNonStaleResultsAsOfNow()).Dump();
I understand the WaitForNonStaleResults call; that makes sense. But I don't really understand why my Map function cannot select the class itself: it seems to demand a projection. I can move on, but I hate not knowing why this is so.

Related

Filtering/Sorting computed fields in Hot Chocolate

In my application I have the following DTO object which retrieves data, via EF Core, from SQL and computes a certain field:
public class MyDTO
{
    public string MyDTOProperty { get; set; }
    public string MyDTOComputedField() {
        ...
    }
}
My query method looks like:
public class MyQueries
{
    ...
    [UseDbContext(typeof(ApiDbContext))]
    [UseFiltering(typeof(MyFilter))]
    [UseSorting]
    public IQueryable<MyDTO> GetObject([ScopedService] ApiDbContext context) {
        var query = context.MyDB;
        return query.Select(fea => new MyDTO() {
            MyDTOProperty = fea.property
        });
    }
}
Filtering and sorting only seem to work on properties with get and set accessors. My question is: how can I enable filtering and sorting on my computed fields so that the following GraphQL query is possible?
{
  Object(where: {MyDTOComputedField: {contains: "someSubString"}}, order: {MyDTOComputedField: ASC}) {
    MyDTOProperty
    MyDTOComputedField
  }
}
I already tried defining my own filtering/sorting middleware, without any luck so far.
TL;DR: You are using a custom resolver (an HC feature), not a computed column (a T-SQL feature), and Entity Framework cannot translate a custom resolver to SQL.
First things first: this is not a Hot Chocolate problem but an Entity Framework problem.
[UseFiltering]
UseFiltering is neither magic nor a silver bullet. It is only middleware: it generates a where argument for your endpoint and then, at runtime, takes that argument (in your case {MyDTOComputedField: {contains: "someSubString"}}), builds a LINQ expression from it, and returns input.Where(expression).
And that's pretty much it.
(Of course, if you ever wrote a string -> LINQ expression piece of code, you know it's not THAT simple, but the good folks from HC did exactly that for us :) )
Something like
System.Linq.Expressions.Expression<Func<MyDTO, bool>> where =
    myDto => myDto.MyDTOComputedField.Contains("someSubString");
return input.Where(where);
(Remember, every middleware in HC is just a piece of pipe: it has an input, some processing, and an output. That's why the order of middlewares matters. By the way, it's the same with "order by", except it returns input.OrderBy(expression).)
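To make the "string -> LINQ expression" step concrete, here is a minimal sketch of how such a filter could be assembled by hand. It is not Hot Chocolate's actual implementation: FilterSketch, BuildContains and ApplyFilter are hypothetical names, and the list-backed IQueryable only stands in for a DbSet. Note that the filtered member here is an ordinary property, which is what makes the expression something a LINQ provider can inspect.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the DTO; the filtered member is a plain property.
public class MyDTO
{
    public string MyDTOProperty { get; set; } = "";
}

public static class FilterSketch
{
    // Builds the expression tree for: dto => dto.<propertyName>.Contains(substring)
    public static Expression<Func<T, bool>> BuildContains<T>(string propertyName, string substring)
    {
        ParameterExpression dto = Expression.Parameter(typeof(T), "dto");
        MemberExpression member = Expression.Property(dto, propertyName);
        var contains = typeof(string).GetMethod(nameof(string.Contains), new[] { typeof(string) })
            ?? throw new InvalidOperationException("string.Contains(string) not found");
        MethodCallExpression call = Expression.Call(member, contains, Expression.Constant(substring));
        return Expression.Lambda<Func<T, bool>>(call, dto);
    }

    // What the filtering step boils down to: input.Where(expression).
    public static IQueryable<T> ApplyFilter<T>(IQueryable<T> input, string propertyName, string substring)
        => input.Where(BuildContains<T>(propertyName, substring));
}

public static class Program
{
    public static void Main()
    {
        // An in-memory IQueryable; with EF this would be a DbSet-backed query.
        IQueryable<MyDTO> input = new List<MyDTO>
        {
            new MyDTO { MyDTOProperty = "has someSubString inside" },
            new MyDTO { MyDTOProperty = "no match" },
        }.AsQueryable();

        foreach (MyDTO dto in FilterSketch.ApplyFilter(input, "MyDTOProperty", "someSubString"))
            Console.WriteLine(dto.MyDTOProperty);
    }
}
```

With LINQ to Objects this works because the expression is compiled and run in memory. Entity Framework instead has to translate the same tree to SQL, which only succeeds when the member maps to something the database knows about.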
Now, because input is an IQueryable<MyDTO> built on a DbSet, nothing is executed right away; it is evaluated lazily. The real work is done by Entity Framework: it takes the LINQ expression (.Where().OrderBy()), translates it to T-SQL, and sends it as a query.
And there is your problem: your MyDTO.MyDTOComputedField is not translatable to SQL.
Why is it not translatable?
Because your MyDTOComputedField is not a "computed column" but a "custom resolver". It exists only in your app, and SQL has no idea what it should contain. Maybe it is something as trivial as a + b * 42 (then a computed column would be great!), but maybe it is a request to another server's REST API (why not :) ); we don't know.
Then why not execute part of the query on the server and the rest locally?
Because this scales reeeeeeeeally badly. You did not show us the implementation of MyDTO.MyDTOComputedField, so let's assume it does something trivial, like cast((a + b * 42) as nvarchar(max));. Meaning it will always be some int, just cast to nvarchar. Meaning that if you ask for Contains("someSubString"), it will always return 0 results.
OK, now imagine your MyDTO table (by the way, I expect MyDTO to be an EF model even with DataTransferObject in its name...) has 10.000.000 rows (in an enterprise-scale app that's business as usual :) ).
Because you are a sane person (and because it makes this example much easier to understand :) ), you add pagination. Let's say 100 items per page.
In this example, you expect EF to do select top 100 * from MyDto where MyDTOComputedField like '%someSubString%'.
But that's not going to happen: SQL has no idea what MyDTOComputedField is.
So it has two options, both bad. Option 1: it executes select top 100, then filters locally, but there are zero results. So it takes another 100, and another 100, and another, and another (10.000.000/100 = 100.000 select queries!), only to find that there are 0 results.
Option 2: when EF finds that some part of the expression has to be executed locally, it executes the whole query locally. So it selects, fetches, and materialises 10.000.000 entities in one go and THEN filters them, just to see there are 0 results. Somewhat better, but still bad.
You just DDoSed yourself.
By the way, Option 2 is what Entity Framework Core did before version 3.0. It was the source of soooo many bugs, where you accidentally fetched a whole table, that the good folks from the EF team dropped support for it, and now they throw
"The LINQ expression 'DbSet()\n .Where(f => new MyDTO{ \r\n id = f.i, \r\n }\r\n.MyDTOProperty == __p_3' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See go.microsoft.com/fwlink/?linkid=2101038 for more information."
Ok... But what to do?
I know there will never be many rows.
Maybe MyDTO is just some list which will never explode (let's say, for example, VAT rates: there is pretty much Standard, Zero, and Reduced, plus some states have a few more. So we are hardly looking at a table greater than ~5 rows; more when your app is international, but still only a few).
Then you don't have to be afraid of local execution. Just add ".ToArray()" or ".ToList()" at the end of your DbSet call, as your exception told you.
But be aware that this can really bite you later if not done carefully.
Computed Column
If your implementation of MyDTOComputedField is trivial, you can move it to the database. Configure an EF computed column, run a migration, drop your resolver, and you are ready to go.
Database View
Another possible option is to make a database view.
This is a more robust solution than a computed column (at least, you can optimise your view really well: custom index(es), better joins, no inner query, etc.), but it takes more work and you have to know what you are doing. AFAIK EF can't generate a view for you; you have to write it by hand.
Just make an empty migration, add your view and an EF entity (make sure to use ToView() and not ToTable()), drop your resolver, and you are ready to go.
In both cases, your query (DTO?) model will be different from your mutation (domain?) model, but that's OK: you really do not want to let consumers of your API even try to mutate your MyDTOComputedField anyway.
It's not possible to translate it to SQL
Maybe your custom resolver does something not really under your control / not doable in SQL (= not doable in EF), for example an HTTP call to another server. Then it's up to you to do it right within your business logic. Maybe add a custom query argument. Maybe write your own implementation of [UseFiltering] (it's not THAT hard: HotChocolate is open source with great licensing, so you can basically go and [ctrl] + [c] [ctrl] + [v] the current implementation and add what you need to add).
I can't advise you; I don't know your business requirements for MyDTOComputedField.

Should I re-utilize my EF query method and if yes, how to do it

I am using EF to get data from my MySQL database.
Now, I have two tables: customers and project_codes.
The table project_codes has a FK to customers named id_customers. This way I am able to query which projects belong to a customer.
In my DAL, I have the following:
public List<customer> SelectAll()
{
    using (ubmmsEntities db = new ubmmsEntities())
    {
        var result = db.customers
            .Include(p => p.project_codes)
            .OrderBy(c => c.customer_name)
            .ToList();
        return result;
    }
}
That returns all my customers and their respective project_codes.
Recently I needed to get only the project codes, so I created a DAL method to get all the project codes from my database. But then I got to thinking: "Should I have done that? Wouldn't it be best to use my SelectAll() method and, from it, use LINQ to fetch the list of project_codes of all customers?"
So that was my first question. I mean, reusing methods as much as possible is a good thing from a maintainability perspective, right?
The second question is: how do I get all the project_codes into a List? Doing it directly is easy, but I failed to achieve that using SelectAll() as a base.
It worked all right if I had the customer_id, using:
ddlProject.DataSource = CustomerList.Where(x => x.id.Equals(customer_id))
    .Select(p => p.project_codes).FirstOrDefault();
That returned the project codes of that customer. I tried different approaches for all customers (foreach, a Where within a Where, and some others at random), but either the syntax fails or they don't output a list with all the project_codes. So that is another reason I went with a specific method to get the project codes.
Even if "common sense" or best practice says it is a bad idea to reuse the method as mentioned above, I would like some direction on how to get a list of project_codes from the return value of SelectAll()... you never know when it can come in handy.
Let me know your thoughts.
There's a trade-off here; you are either iterating a larger collection (and doing selects, etc) on an in-memory collection, or iterating a smaller collection but having to go to a database to do it.
You will need to profile your setup to determine which is faster, but it's entirely possible that the in-memory approach will be better (though stale if your data could have changed!).
To get all the project_codes, you should just need:
List<customer> customers; //Fill from DAL
List<project_code> allProjects = customers.SelectMany(c => c.project_codes).ToList();
Note that I used SelectMany to flatten the hierarchy of collections; SelectAll is not a LINQ method.
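Here is a small self-contained sketch of that flattening, plus the single-customer case from the question. The customer and project_code classes below are hypothetical, trimmed-down stand-ins for the generated entities (the code property in particular is assumed), and the list is built in memory instead of coming from the DAL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down versions of the generated entity classes.
public class project_code
{
    public string code { get; set; } = "";
}

public class customer
{
    public int id { get; set; }
    public List<project_code> project_codes { get; set; } = new List<project_code>();
}

public static class ProjectCodeQueries
{
    // Flattens every customer's project_codes into one list.
    public static List<project_code> AllProjectCodes(IEnumerable<customer> customers)
        => customers.SelectMany(c => c.project_codes).ToList();

    // Same idea for one customer; yields an empty list when the id is unknown.
    public static List<project_code> ProjectCodesFor(IEnumerable<customer> customers, int customerId)
        => customers.Where(c => c.id == customerId)
                    .SelectMany(c => c.project_codes)
                    .ToList();
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for the result of SelectAll().
        var customers = new List<customer>
        {
            new customer { id = 1, project_codes = { new project_code { code = "A-1" }, new project_code { code = "A-2" } } },
            new customer { id = 2, project_codes = { new project_code { code = "B-1" } } },
        };

        foreach (project_code p in ProjectCodeQueries.AllProjectCodes(customers))
            Console.WriteLine(p.code);
    }
}
```

Unlike Select(...).FirstOrDefault(), the Where(...).SelectMany(...) form never returns null, which is convenient when the result is bound directly to a control.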

Mark some child entities as "Never Load" in LINQ to Entities query

TL;DR
I want to write a query in LINQ to Entities and tell it that I'll never load the child entities of an entity. How do I do that without projecting?
Eg,
return (from a in this.Db.Assets
        join at in this.Db.AssetTypes on a.AssetTypeId equals at.AssetTypeId
        join ast in this.Db.AssetStatuses on a.AssetStatusId equals ast.AssetStatusId
        select new {
            a = a,
            typeDesc = at.AssetTypeDesc,
            statusDesc = ast.AssetStatusDesc
        }).ToList().Select(anon => new AssetViewModel(anon.a, anon.typeDesc, anon.statusDesc)).ToList();
I want the entity called Asset pulled into a on the anonymous type I'm defining, and when I call ToList(), I don't want the Assets' children, Status and Type, to lazy load.
EDIT: After some random Visual Studio autocomplete investigation, much of this can be accomplished by turning off lazy loading in the DbContext:
this.Db.Configuration.LazyLoadingEnabled = false;
Unfortunately, if your work with the query results does have a few child tables, even with LazyLoadingEnabled turned off, things may still "work" for some subset of them iff the data for those children has already been loaded earlier in this DbContext -- that is, if those children have already had their context cached -- which can make for some surprising and temporarily confusing results.
That is to say, I want to explicitly load some children at query time and completely sever any relationship to other child entities.
Best would be some way to actively load some entities and to ignore the rest. That is, I could call ToList() and not have to worry about throwing off lots of db connections.
Context
I have a case where I'm hydrating a view model with the results of a LINQ to Entities query from an entity called Asset. The Asset table has a couple of child tables, Type and Status. Both Type and Status have Description fields, and my view model contains both descriptions in it. Let's pretend that's as complicated as this query gets.
So I'd like to pull everything from the Asset table joined to Type and Status in one database query, during which I pull the Type and Status descriptions. In other words, I don't want to lazy load that info.
WET (Woeful Entity reTranscription?)
What we're doing now, which does exactly what I want from a connection standpoint, is the usual .Select into the view model, with a tedious field matchup.
return (from a in this.Db.Assets
        join at in this.Db.AssetTypes on a.AssetTypeId equals at.AssetTypeId
        join ast in this.Db.AssetStatuses on a.AssetStatusId equals ast.AssetStatusId
        select new AssetViewModel
        {
            AssetId = a.AssetId,
            // *** LOTS of fields from Asset removed ***
            AssetStatusDesc = ast.AssetStatusDesc,
            AssetTypeDesc = at.AssetTypeDesc
        }).ToList();
That's good in that the Status and Type child entities of Asset are never accessed, and there's no lazy load. The SQL is one join in one database hit for all the assets. Perfect.
The worry is all the repeated jive in // *** LOTS of fields from Asset removed ***. Currently, we've got that projection in every freakin query, which obviously isn't DRY. And it means that when the Asset table changes, it's rare that the new field is included in every projection (because humans), which stinks.
I don't see a quick way around the query, btw. If I want to do it in a single query, I have to have the joins. I could add wheres to it in separate methods, but I'm not sure how I'd skip the projection each time. Or I could add joins to the query in cascading methods, but then my projection is still "repository bound", which isn't best case if I'm using these sorts of queries elsewhere. But I'm betting I'm stupiding something here.
Dumb
I tried adding a cast to my view model from Asset and changing the query to something like this. It is beautiful from a code standpoint, but I get bitten by lazy loading: two extra database hits per Asset, one for Status and one for Type.
return (from a in this.Db.Assets
        select a).ToList().Select(asset => (AssetViewModel)asset).ToList();
Just as we would expect, since I'm using lines like...
AssetTypeDesc = a.AssetType.AssetTypeDesc,
... inside of the casting code. So that was dumb. Concise, reusable, but dumb. This is why we hate folks who use ORMs without checking the SQL. ;^)
Overly clever, sorta
But then I tried getting too clever, with a new constructor for the view model that took the asset entity & the two description values as strings, which ended up with the same lazy load issue (because, duh, the first ToList() before selecting the anonymous objects means we don't know how the Assets are going to be used, and we're stuck pulling back everything to be safe (I assume)).
// Use anon type to skirt "Only parameterless constructors
// and initializers are supported in LINQ to Entities,"
// issue.
return (from a in this.Db.Assets
        join at in this.Db.AssetTypes on a.AssetTypeId equals at.AssetTypeId
        join ast in this.Db.AssetStatuses on a.AssetStatusId equals ast.AssetStatusId
        select new {
            a = a,
            typeDesc = at.AssetTypeDesc,
            statusDesc = ast.AssetStatusDesc
        }).ToList().Select(anon => new AssetViewModel(anon.a, anon.typeDesc, anon.statusDesc)).ToList();
If only there was some way to say, "cast these anonymous objects to a List, but don't lazy load the Asset's children while you're doing it." <<< That's my question, natch.
I've read some about DataLoadOptions.LoadWith(), which probably provides an okay solution, and I might end up just doing that, but that's not precisely what I'm asking. I think that's a global-esque setting (? I think just for the life of the data context, which should be the single controller interaction), which I might not necessarily want to set. I may also want ObjectTrackingEnabled = false, but I'm not grokking yet.
I also don't want to use an automapper.
Painfully, after some random Visual Studio autocomplete investigation, this might be as easy as turning off lazy loading in your DbContext:
this.Db.Configuration.LazyLoadingEnabled = false;
The wacky thing is that if your work with the query results does have a few child tables, even with LazyLoadingEnabled turned off, things may still "work" for some subset of them iff the data for those children has already been loaded earlier in this DbContext -- that is, if those children have already had their context cached -- which can make for some surprising and temporarily confusing results.
Better would be to be able to cherry pick what children are "lazy-loading eligible".
I may need to update the question to make it cover this variation of the original question.

Data binding issues

I have started using the Entity Framework quite recently and think it is very good but I am a bit confused over a couple of things.
I'm writing a winforms-application where you are presented a list of persons in a list and if you click a particular person more information about that person appears in textboxes. Nothing fancy so far, but you are supposed to be able to edit the data about the person so data binding is nice here.
It works just fine, but I am a bit confused about what the correct way to do it is. First I did it like this:
var query = _context.Person.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = query.ToList();
Then I read a bit and tried:
var local = _context.Person.Local;
IEnumerable<Customer> enumerable = local.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = enumerable.ToList();
This one seems to work fine as well.
Then I saw that someone suggested something like:
_context.Person.Load();
this.personBindingSource.DataSource = _context.Person.Local.ToBindingList();
I am a bit confused right now about which approach is the correct one, and what the difference is between these three. Can anyone help me?
Depends on What You Want to do
I honestly never liked getting this answer, because it seems to be the answer to everything, but in this case it is really the only answer I can give.
What the Local Property Does
Local gives you a reference to the entities currently tracked by the data context that have not been marked as deleted. Essentially, you are asking for the objects that the context has already loaded into memory. You can read more about it in the DbSet(TEntity).Local Property documentation.
What The Load Method Does
The Load method runs the query against the database and loads the results into the context, without creating a list. You can read more about it in the DbExtensions.Load Method documentation.
What ToBindingList Does
Basically, this creates a two-way binding between the entities in the collection and the UI. That is, any changes made in the UI should be automatically reflected in the related entities within this collection. You can read more about it in the documentation for the BindingList(T) Class and DbExtensions.ToBindingList().
What Each Of Your Examples Do
First Example
var query = _context.Person.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = query.ToList();
Under the covers the following is going on
Creating a query to be processed by the server from your LINQ expression
Getting the content from the database and creating a list around it
Here you are grabbing any people with an Id of chosenId from the database, loading them into your application, and creating a list.
Second Example
var local = _context.Person.Local;
IEnumerable<Customer> enumerable = local.Where(c => c.Id == chosenId);
this.personBindingSource.DataSource = enumerable.ToList();
Under the covers the following is going on
Getting all of the objects currently tracked by the context that have been hit by a query but have not been marked as deleted
Getting all of the elements in local memory that have the Id chosenId
Here you are grabbing any people that have already been loaded into the context. This is not going to get all of your persisted data items; you must have hit them in other queries.
Third Example
_context.Person.Load();
this.personBindingSource.DataSource = _context.Person.Local.ToBindingList();
Under the covers the following is going on
You are loading all of the people into local memory
You create a binding list (which allows two-way data binding between the objects and the UI)
You bind the list to the UI element personBindingSource
Unless you want to load all of the items into memory, this is probably not what you want to do. If the dataset ever grows large enough, it will slow your program down and could possibly cause it to not work correctly (unlikely in the case of person in most scenarios, but possible).
When You Should Use Them
First: when you just want to get the data that matches your query into local memory and don't need a link between the UI and the entities
Second: when you have already run a query in the context and need to use its results someplace else, but don't need to rerun the query since they are currently in memory
Third: when you want to load all of the elements of an entity set into memory and create a two-way data binding between them and a control
The only difference between the first and second examples you have there is that one is deferred and one is not. The third one is a little different, as it creates a two-way binding between your database and the control's data source, effectively making the data context track changes to the list for you (adds and deletes).
In all your examples, so long as you keep your data context open, changes to the objects themselves will be tracked.
As for which way is correct, that depends on your application. Pick the approach that works best for you based on what you are trying to accomplish.
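To see what a binding list adds over a plain List, here is a minimal sketch that uses only BindingList<T> from System.ComponentModel, without Entity Framework or WinForms. The Person class and the BindingSketch helper are hypothetical; in the third example above, ToBindingList() hands you a BindingList<T> that is additionally kept in sync with the context's Local collection.

```csharp
using System;
using System.ComponentModel;

// Hypothetical entity, standing in for the EF Person class.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class BindingSketch
{
    // Adds one item and removes one, returning how many ListChanged events fired.
    public static int CountNotifications()
    {
        var people = new BindingList<Person> { new Person { Id = 1, Name = "Ann" } };

        int notifications = 0;
        // A bound control subscribes to this event to refresh itself.
        people.ListChanged += (sender, e) => notifications++;

        people.Add(new Person { Id = 2, Name = "Bob" }); // raises ItemAdded
        people.RemoveAt(0);                              // raises ItemDeleted

        return notifications;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(BindingSketch.CountNotifications());
    }
}
```

A List<Person> produced by ToList() raises no such events, so a grid bound to it will not notice rows being added or removed; that is the practical difference between the first two approaches and the third.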

sorting on related field in llblgen 2.6

I inherited an application that uses llblgen 2.6. I have a PersonAppointmentType entity that has an AppointmentType property (n:1 relation). Now I want to sort a collection of PersonAppointmentTypes on the name of the AppointmentType. I tried this so far in the Page_Load:
if (!Page.IsPostBack)
{
    var p = new PrefetchPath(EntityType.PersonAppointmentTypeEntity);
    p.Add(PersonAppointmentTypeEntity.PrefetchPathAppointmentType);
    dsItems.PrefetchPathToUse = p;
    // dsItems.SorterToUse = new SortExpression(new SortClause(PersonAppointmentTypeFields.StartDate, SortOperator.Ascending)); // This works
    dsItems.SorterToUse = new SortExpression(new SortClause(AppointmentTypeFields.Name, SortOperator.Ascending));
}
I'm probably just not getting it.
EDIT:
Phil put me on the right track, this works:
if (!Page.IsPostBack)
{
    dsItems.RelationsToUse = new RelationCollection(PersonAppointmentTypeEntity.Relations.AppointmentTypeEntityUsingAppointmentTypeId);
    dsItems.SorterToUse = new SortExpression(new SortClause(AppointmentTypeFields.Name, SortOperator.Ascending));
}
You'll need to share more code if you want an exact solution. You didn't post the code where you actually fetch the entity (or collection). This may not seem relevant, but it (probably) is, as I'm guessing you are making a common mistake that people make with prefetch paths when they first try to sort or filter on a related entity.
You have a prefetch path from PersonAppointmentType (PAT) to AppointmentType (AT). This basically tells the framework to fetch PATs as one query and then, after that query completes, to fetch ATs based on the results of the PAT query. LLBLGen takes care of all of this for you, and wires the objects together once the queries have completed.
What you are trying to do is sort the first query by the entity you are fetching in the second query. If you think in SQL terms, you need a join from PAT=>AT in the first query. To achieve this, you need to add a relation (join) via a RelationPredicateBucket and pass that as part of your fetch call.
It may seem counter-intuitive at first, but relations and prefetch paths are completely unrelated (although you can use them together). You may not even need the prefetch path at all; it may be that you ONLY need the relation and sort clause added to your fetch code (depending on whether you actually want the AT entity in your graph, vs. the ability to sort by its fields).
There is a very good explanation of prefetch paths and how they work here:
http://www.llblgening.com/archive/2009/10/prefetchpaths-in-depth/
Post the remainder of your fetch code and I may be able to give you a more exact answer.
