Dynamic Linq nested Select on collection property - c#

We are using GraphQL Dotnet in a dotnet core 5.0 API.
We have a relational data structure like this
Organisation --> (Zero or many) Employers --> (Zero or many) Departments.
In an attempt to optimise our database queries we are doing some parsing of the query and hoping to limit the fields returned by SQL using dynamic linq.
They have some pretty basic examples like this
var resultDynamic = context.Customers
.Select("new { City, CompanyName }")
.ToDynamicList();
So you can select Customers.City, and Customers.CompanyName to a new dynamic object list.
This is not far off what we want to do, however, some of the properties we want to query are nested in collections.
for instance the following graphQL query
{
organisations {
id,
name,
employers {
addressLine1
}
}
}
Should return a list of organisations with id and name populated, and their employers with only the addressline1 populated.
If our query is Database.Organisations.Include(x => x.Employers); graphql parses this fine, and we return the correct information. However, if you inspect the SQL, it's getting every field for organisation, and every field for employers, and GQL is basically trimming the data client side.
Using dynamic linq I can do this
Database.Organisations.Select("new Organisation {id, name}");
and it works great, returns the organisation with id and name populated, and those are the only SQL fields in the SQL select statement.
But as soon as I try to Select the addressLine1 field of employers I hit a bit of a brick wall.
I can do this
Database.Organisations.Select("new Organisation {id, name, employers}");
and it will only select the required fields from organisations, but all employer fields are selected (and GQL does the trimming client side) but I can't figure out a way to do employers.addressLine1 so that the subquery in sql only selects the one field.
Trying things like Organisations.Select("new Organisation {id, name, employers.name}"); just results in
No property or field 'name' exists in type 'List`1'
Which is obviously because Employers is a List property and I am basically trying to get (List).Name ... which doesn't exist.
Ideally I want to do this in the string-based select format, as I can build that up by parsing the GraphQL query and then select only the fields required from the child objects, but I have no idea what the format would be and can't find any examples.

It should be possible with employers.Select(name).
Example in LinqPad:
void Main()
{
Rooms.Dump();
(Rooms.Where(r => r.Id == 3) as IQueryable).Select("new { name, RoomReservations.Select(CheckinDate) as CheckinDates }").Dump();
}
The generated SQL looks like this (only 'Name' and 'CheckinDate' are selected in the query):
-- Region Parameters
DECLARE @p0 Int = 3
-- EndRegion
SELECT [t0].[Name], [t1].[CheckinDate], (
SELECT COUNT(*)
FROM [Reservations] AS [t2]
WHERE [t2].[RoomId] = [t0].[Id]
) AS [value]
FROM [Rooms] AS [t0]
LEFT OUTER JOIN [Reservations] AS [t1] ON [t1].[RoomId] = [t0].[Id]
WHERE [t0].[Id] = @p0
ORDER BY [t0].[Id], [t1].[Id]
Note: I tried to build a GraphQL library which uses DynamicLinq; however, not all logic is supported. You can take a look here:
https://github.com/StefH/GraphQL.EntityFrameworkCore.DynamicLinq
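Since the question wants to build the selector string from the parsed GraphQL query, here is a minimal sketch of that step. It only builds the string, following the `Collection.Select(...) as Alias` form shown in the LinqPad example. `FieldNode` and `SelectorBuilder` are hypothetical names, not part of Dynamic LINQ or GraphQL Dotnet, and the sketch assumes every field with children is a collection navigation property (a single-reference navigation would need different handling).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a requested field: a scalar has no children,
// a collection navigation carries the child fields requested on it.
public sealed record FieldNode(string Name, IReadOnlyList<FieldNode> Children)
{
    public static FieldNode Scalar(string name) => new(name, Array.Empty<FieldNode>());
}

public static class SelectorBuilder
{
    // Builds a Dynamic LINQ selector such as
    // "new { id, name, employers.Select(new { addressLine1 }) as employers }".
    public static string Build(IEnumerable<FieldNode> fields)
    {
        var parts = fields.Select(f => f.Children.Count == 0
            ? f.Name
            // Assumption: any field with children is a collection property.
            : $"{f.Name}.Select({Build(f.Children)}) as {f.Name}");

        return "new { " + string.Join(", ", parts) + " }";
    }
}
```

The resulting string would then be passed to Select(...) in place of the hand-written one. Note that it projects to an anonymous type rather than to new Organisation {...}, so the result is a dynamic list, as in the Customers example from the question.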

Related

How to use CTE in Linq C#?

I have designed a recursive SQL CTE expression that sorts a recordset according to its parent Id in a nested hierarchy. How can I execute this CTE query in my EF6 data context?
I was expecting to find a way to define CTEs in linq statements.
For background, this previous post helped me to identify the CTE:
Order By SQL Query based on Value another Column.
For the purposes of this post I am using a single table in the EF context.
This data model class has been generated from the database using Entity Framework 6 (ADO.NET Entity Data Model)
public partial class Something
{
public int Id{ get; set; }
public string Name{ get; set; }
public string Address { get; set; }
public string Email { get; set; }
public string PhoneNumber { get; set; }
public System.DateTime Date { get; set; }
public int IdParent{ get; set; }
}
And this is the sql query that I want to execute or translate to Linq
with cte (Id, Name, Address, Email, PhoneNumber, Date, IdParent, sort) as
(
select Id, Name, Address, Email, PhoneNumber, Date, IdParent,
cast(right('0000' + cast(row_number() over (order by Id) as varchar(5)), 5) as varchar(1024))
from Something
where Id = IdParent
union all
select t.Id, t.Name, t.Address, t.Email, t.PhoneNumber, t.Date, t.IdParent,
cast(c.sort + right('0000' + cast(row_number() over (order by t.Id) as varchar(5)), 5) as varchar(1024))
from cte c
inner join Something t on c.Id = t.IdParent
where t.Id <> t.IdParent
)
select *
from cte
order by sort
Writing hierarchical queries in Linq to SQL is always a mess. It can work in memory, but it doesn't translate to efficient SQL queries. This is a good discussion on SO about some hierarchical Linq techniques
There are a few options:
Don't use Linq at all and query from your CTE directly!
Convert your CTE to a View
Re-write the query so that you don't need the CTE
This is easier if you have a fixed or theoretical limit to the recursion.
Even if you don't want to limit it, if you review the data and find that the highest level of recursion is only 2 or 3, then you could support that depth with simple self joins
How to use a CTE directly in EF 6
DbContext.Database.SqlQuery<TElement>(string sql, params object[] parameters)
Creates a raw SQL query that will return elements of the given type (TElement). The type can be any type that has properties that match the names of the columns returned from the query
Database.SqlQuery on MS Docs
Raw SQL Queries (EF6)
Execute Raw SQL Queries in Entity Framework 6
NOTE: Do NOT use select * for this type (or any) of query, explicitly define the fields that you expect in the output to avoid issues where your query has more columns available than the EF runtime is expecting.
Perhaps of equal importance, if you want or need to apply filtering to this record set, you should implement the filtering in the raw SQL string value. The entire query must be materialized into memory before EF Linq filtering expressions can be applied
.SqlQuery does support passing through parameters, which comes in handy for filter expressions ;)
string cteQuery = @"
with cte (Id, Name, Address, Email, PhoneNumber, Date, IdParent, sort) as
(
select Id, Name, Address, Email, PhoneNumber, Date, IdParent,
cast(right('0000' + cast(row_number() over (order by Id) as varchar(5)), 5) as varchar(1024))
from Something
where Id = IdParent
union all
select t.Id, t.Name, t.Address, t.Email, t.PhoneNumber, t.Date, t.IdParent,
cast(c.sort + right('0000' + cast(row_number() over (order by t.Id) as varchar(5)), 5) as varchar(1024))
from cte c
inner join Something t on c.Id = t.IdParent
where t.Id <> t.IdParent
)
select Id, Name, Address, Email, PhoneNumber, Date, IdParent
from cte
order by sort
";
using (var ctx = new MyDBEntities())
{
var list = ctx.Database
.SqlQuery<Something>(cteQuery)
.ToList();
}
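As a sketch of the parameter support mentioned above, the CTE's anchor query can be filtered inside the raw SQL and the value passed as a SqlParameter. The `@rootId` filter and the `LoadTree` helper are assumptions added for illustration and are not part of the original query; the example needs the EntityFramework (EF6) package and a SQL Server database to run.

```csharp
using System.Collections.Generic;
using System.Data.Entity;      // EF6
using System.Data.SqlClient;
using System.Linq;

public static class SomethingQueries
{
    // Hypothetical helper: loads one root record and all of its descendants,
    // ordered by the same "sort" path as the original CTE.
    public static List<T> LoadTree<T>(DbContext ctx, int rootId)
    {
        const string sql = @"
with cte (Id, Name, Address, Email, PhoneNumber, Date, IdParent, sort) as
(
    select Id, Name, Address, Email, PhoneNumber, Date, IdParent,
           cast(right('0000' + cast(row_number() over (order by Id) as varchar(5)), 5) as varchar(1024))
    from Something
    where Id = IdParent
      and Id = @rootId      -- assumed filter, applied in SQL rather than in memory
    union all
    select t.Id, t.Name, t.Address, t.Email, t.PhoneNumber, t.Date, t.IdParent,
           cast(c.sort + right('0000' + cast(row_number() over (order by t.Id) as varchar(5)), 5) as varchar(1024))
    from cte c
    inner join Something t on c.Id = t.IdParent
    where t.Id <> t.IdParent
)
select Id, Name, Address, Email, PhoneNumber, Date, IdParent
from cte
order by sort";

        // The value travels as a parameter, never concatenated into the SQL text.
        return ctx.Database
            .SqlQuery<T>(sql, new SqlParameter("@rootId", rootId))
            .ToList();
    }
}
```

With the context from the example above, this would be called as SomethingQueries.LoadTree<Something>(ctx, 1).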
Understanding how and when to use .SqlQuery for executing raw SQL comes in handy when you want to squeeze the most performance out of SQL without writing complex Linq statements.
It is also useful if you move your CTE into a view, a table-valued function, or a stored procedure. Once the results have been materialized into the list in memory, you can treat these records like any other.
Convert your CTE to a View
If you are generating your EF model from the database, then you could create a view from your CTE to generate the Something class, however this becomes a bit disconnected if you also want to perform CRUD operations against the same table, having two classes in the model that represent virtually the same structure is a bit redundant IMO, perfectly valid if you want to work that way though.
Views cannot have ORDER BY statements, so you take this statement out of your view definition, but you still include the sort column in the output so that you can sort the results in memory.
Converting your CTE to a view will have the same structure as your current Something class, however it will have an additional column called sort.
How to write the same query without CTE
As I alluded to at the start, you can follow this post Hierarchical queries in LINQ to help process the data after bringing the entire list into memory. However, in my answer to OP's original post, I highlighted how simple self joins on the table can be used to produce the same results, and we can easily replicate the self join in EF.
Even when you want to support a theoretically infinitely recursive hierarchy, the reality of many datasets is that there is an observable or practical limit to the number of levels. If you can identify that practical limit, and it is a small enough number, then it might be simpler from a C# / Linq perspective to not bother with the CTE at all
Put it the other way around, ask yourself this question: "If I set a practical limit of X number of levels of recursion, how will that affect my users?"
Put 4 in for X. If the result is that users will not generally be affected, or this data scenario is not likely to occur, then let's try it out.
If a limit of 4 is acceptable, then this is your Linq statement:
I've used query syntax here to demonstrate the relationship to SQL
var list = from child in ctx.Somethings
join parent in ctx.Somethings on child.IdParent equals parent.Id
join grandParent in ctx.Somethings on parent.IdParent equals grandParent.Id
orderby grandParent.IdParent, parent.IdParent, child.IdParent, child.Id
select child;
I would probably use short hand aliases for this query in production, but the naming convention makes the intended query quickly human relatable.
If you set up a foreign key in the database linking IdParent to the Id of the same table, then the Linq side is much simpler
This should generate a navigation property to enable traversing the foreign key through Linq; in the following example this property is called Parent
var list = ctx.Somethings
.OrderBy(x => x.Parent.Parent.IdParent)
.ThenBy(x => x.Parent.IdParent)
.ThenBy(x => x.IdParent)
.ThenBy(x => x.Id);
You can see that, if we can limit the recursion level, the Linq required for sorting based on the recursive parent is quite simple, and syntactically easy to extrapolate to the number of levels you need.
You could do this for any number of levels, but there is a point where performance might become an issue, or where the number of lines of code needed to achieve this is more than using the SqlQuery option presented first.
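For comparison, the in-memory route mentioned earlier (load the whole table, then order it in C#) can be sketched as follows. This is not the code from the linked post, just one hedged way to reproduce the CTE's ordering with LINQ to Objects. `Node` is a hypothetical stand-in for the Something class, reduced to the columns that matter here, and the sketch assumes the data has no cycles other than roots pointing at themselves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Something entity.
public sealed record Node(int Id, string Name, int IdParent);

public static class HierarchySort
{
    // Depth-first ordering: each root (Id == IdParent) is followed by all of
    // its descendants, with siblings ordered by Id, like the CTE's sort path.
    public static List<Node> SortByHierarchy(IEnumerable<Node> rows)
    {
        var all = rows.ToList();

        // Index the non-root rows by parent so each lookup is cheap.
        var childrenByParent = all
            .Where(n => n.Id != n.IdParent)
            .ToLookup(n => n.IdParent);

        var result = new List<Node>();

        void Visit(Node node)
        {
            result.Add(node);
            foreach (var child in childrenByParent[node.Id].OrderBy(c => c.Id))
                Visit(child);
        }

        foreach (var root in all.Where(n => n.Id == n.IdParent).OrderBy(n => n.Id))
            Visit(root);

        return result;
    }
}
```

Unlike the self joins above, this handles any depth, but it only makes sense when materializing the full table is acceptable.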
I'd recommend you create a view using the SQL provided. Once you create a view, you can map the view to a C# DTO using Entity Framework. After that you can query the view using LINQ.
Don't forget to include the [sort] column in your DTO because you can't include (or at least shouldn't) the sort order in your view definition. You can sort the query using LINQ instead of SQL directly.
Use your CTE With A different ORM
One can call stored procedures from EF, and there are a number of posts on how to do that... but I recommend a hybrid EF and ADO.Net system, which will take advantage of your specialized SQL code.
ADO.Net can be used, though you will have to write that code by hand... but there is an ADO.Net based ORM which uses the principle of returning JSON from SQL Server as models. This ORM can be installed side by side with EF.
It is the Nuget package SQL-Json (of which I am the author), which can use your CTE and provide the data as an array of models for your code to use.
Steps
1. Have your final CTE output return JSON data by adding for json auto;.
2. Run the SQL and generate the JSON. Use that JSON to create a C# model using any website which converts JSON to C# classes. In this example let us call the model class CTEData.
3. Put your SQL into a stored procedure.
4. Include the SQL-Json package in your project.
5. In the model created in step #2, inherit from the base class JsonOrmModel.
6. In the model, also add this override: public override string GetStoredProcedureName => "[dbo].MyGreatCTE"; using the actual sproc created in step #3.
7. Get the models:
var connectionStr = @"Data Source=.\Jabberwocky;Initial Catalog=WideWorldImporters";
var jdb = new JsonOrmDatabase(connectionStr);
List<CTEData> ctes = jdb.Get<CTEData>();
Then you can use your cte data as needed.
On the project page it shows how to do what I described with a basic POCO model at SQL-Json.
LINQ for SQL CTE
Here is an approach that applies to many scenarios where LINQ shouldn't/can't/won't/will never produce the SQL you need. This example executes a raw SQL CTE driven by LINQ logic to return the primary key (PK) value's ordinal row number regardless of basic sorts and filters on the entity/table.
The goal is to apply the same constraints to differing requirements. One requirement is the PK row's position w/in those constraints. Another requirement may be the count of rows that satisfy those constraints, etc. Those statistics need to be based on a common constraint broker.
Here, an IQueryable, under the purview of an open DbContext, applies those constraints and is the constraint broker. An alternate approach outside of any DbContext purview is to build expression trees as the constraint broker and return them for evaluation once back under the DbContext umbrella. The shortcut to that is https://github.com/dbelmont/ExpressionBuilder
LINQ could not express Structured Query Language (SQL) Common Table Expressions (CTE) in previous .NET versions. LINQ still can't do that. But...
Hello .NET 5 and my new girlfriend, IQueryable.ToQueryString(). She's the beautiful but potentially lethal kind. Regardless, she gives me all the target row numbers I could ever want.
But, I digress...
/// <summary>
/// Get the ordinal row number of a given primary key value within filter and sort constraints
/// </summary>
/// <param name="TargetCustomerId">The PK value to find across a sorted and filtered record set</param>
/// <returns>The ordinal row number (where the 1st filtered & sorted row is #1 - NOT zero), amongst all other filtered and sorted rows, for further processing - like conversion to page number per a rows-per-page value</returns>
/// <remarks>Doesn't really support fancy ORDER BY clauses here</remarks>
public virtual async Task<int> GetRowNumber(int TargetCustomerId)
{
int rowNumber = -1;
using (MyDbContext context = new MyDbContext())
{
// Always require a record order for row number CTEs
string orderBy = "LastName, FirstName";
// Create a query with a simplistic SELECT but all required Where() criteria
IQueryable<Customer> qrbl = context.Customer
// .Includes are not necessary for filtered row count or the row number CTE
.Include(c => c.SalesTerritory)
.ThenInclude(sr => sr.SalesRegion)
.Where(c => c.AnnualIncome > 30000 && c.SalesTerritory.SalesRegion.SalesRegionName == "South")
.Select(c => c )
;
// The query doesn't necessarily need to be executed...
// ...but for pagination, the filtered row count is valuable for UI stuff - like a "page x of n" pagination control, accurate row sliders or scroll bars, etc.
// int custCount = Convert.ToInt32(await qrbl.CountAsync());
// Extract LINQ's rendered SQL
string sqlCustomer = qrbl.ToQueryString();
// Remove the 1st/outer "SELECT" clause from that extracted sql
int posFrom = sqlCustomer.IndexOf("FROM [schemaname].[Customer] ");
string cteFrom = sqlCustomer.Substring(posFrom);
/*
If you must get a row number from a more complex query, where LINQ nests SELECTs, this approach might be more appropriate.
string[] clauses = sqlCustomer.Split("\r\n", StringSplitOptions.TrimEntries);
int posFrom = clauses
.Select((clause, index) => new { Clause = clause, Index = index })
.First(ci => ci.Clause.StartsWith("FROM "))
.Index
;
string cteFrom = string.Join("\r\n", clauses, posFrom, clauses.Length - posFrom);
*/
// As always w/ all raw sql, prohibit sql injection, etc.
string sqlCte = "WITH cte AS\r\n"
+ $"\t(SELECT [CustomerId], ROW_NUMBER() OVER(ORDER BY {orderBy}) AS RowNumber {cteFrom})\r\n"
+ $"SELECT @RowNumber = RowNumber FROM cte WHERE [CustomerId] = {TargetCustomerId}"
;
SqlParameter paramRow = new SqlParameter("RowNumber", System.Data.SqlDbType.Int);
paramRow.Direction = System.Data.ParameterDirection.Output;
int rows = await context.Database.ExecuteSqlRawAsync(sqlCte, paramRow).ConfigureAwait(false);
// The output parameter stays DBNull when the target row is not in the filtered set
if (paramRow.Value is int foundRowNumber)
{
rowNumber = foundRowNumber;
}
}
return rowNumber;
}

Does the order of OrderBy, Select and Where clauses in Linq-to-entities matter

Suppose I have a table of students with tons of columns. I want the EF equivalent of
SELECT id,lastname,firstname
FROM students
WHERE coursename='Eurasian Nomads'
ORDER BY lastname,firstname
I just want a subset of the full Student model so I made a view model
public class StudentView{
public int ID{get;set;}
public string LastName{get;set;}
public string FirstName{get;set;}
}
and this EF code seems to work:
List<StudentView> students=context.Students
.Where(s=>s.CourseName=="Eurasian Nomads")
.OrderBy(s=>s.LastName)
.ThenBy(s=>s.FirstName)
.Select(s=>new StudentView(){ID=s.ID,LastName=s.LastName,FirstName=s.FirstName})
.ToList();
But my question is does the order of these clauses matter at all, and if so, what sort of rules should I follow for best performance?
For example, this also seems to work:
List<StudentView> students=context.Students
.Select(s=>new StudentView(){ID=s.ID,LastName=s.LastName,FirstName=s.FirstName})
.OrderBy(s=>s.LastName)
.ThenBy(s=>s.FirstName)
.Where(s=>s.CourseName=="Eurasian Nomads")
.ToList();
The order in which you create your query before it's executed against the server is not relevant in most cases.
Actually, one of the advantages is being able to build the query gradually by chaining where, order by, and other clauses.
But there are times when the order can affect the generated SQL.
Take the samples you provided. They both compile correctly (assuming StudentView also exposes a CourseName property; with the view model exactly as posted, the second would not compile), but the second does not actually get executed. If you try to run this query against an EF database you will get a NotSupportedException:
System.NotSupportedException: The specified type member 'CourseName' is not supported in LINQ to Entities.
The key here is that you are trying to filter the query by the CourseName property in the view model (StudentView) and not the property of the entity.
So you get this error.
In the case of the first query, it correctly generates this sql:
SELECT
[Extent1].[ID] AS [ID],
[Extent1].[LastName] AS [LastName],
[Extent1].[FirstName] AS [FirstName]
FROM [dbo].[Students] AS [Extent1]
WHERE N'Eurasian Nomads' = [Extent1].[CourseName]
ORDER BY [Extent1].[LastName] ASC, [Extent1].[FirstName] ASC
So, as you can see the order is critical sometimes.
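A minimal sketch of the gradual composition described in this answer: filter and sort while the query is still typed as the entity, and project to the view model last, so every Where and OrderBy refers to entity members. The `Student` class, the `BuildQuery` helper, and its optional arguments are assumptions for illustration; it works against any IQueryable<Student>, including a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with only the columns used here.
public class Student
{
    public int ID { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }
    public string CourseName { get; set; }
}

public class StudentView
{
    public int ID { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }
}

public static class StudentQueries
{
    public static IQueryable<StudentView> BuildQuery(
        IQueryable<Student> students, string courseName, bool sortByName)
    {
        var query = students;

        // Each clause is optional; nothing executes until the query is enumerated.
        if (courseName != null)
            query = query.Where(s => s.CourseName == courseName);

        if (sortByName)
            query = query.OrderBy(s => s.LastName).ThenBy(s => s.FirstName);

        // Project last, after all clauses that need entity properties.
        return query.Select(s => new StudentView
        {
            ID = s.ID,
            LastName = s.LastName,
            FirstName = s.FirstName
        });
    }
}
```

Called as StudentQueries.BuildQuery(context.Students, "Eurasian Nomads", true).ToList(), this composes the same clauses, in the same order, as the first query in the question.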

Single search box with multiple keywords

Can someone please share with me a good approach on how to build a query that uses one text box with multiple keywords that pick columns on several DB tables. Please see attached screen shot.
Requirement
I will need to define a format rule so the user "must" enter an input search with the following format: [category], [suburb] [postcode]. The code behind logic (web API) can then parse this input (this is where my search query will be parsed).
If you expect unified output then you can use UNION
INSERT INTO #resultTable
SELECT serviceid
FROM (
SELECT DISTINCT serviceid
FROM addresses s
WHERE s.suburb = @criteria
OR s.postal = @criteria
UNION
SELECT DISTINCT serviceid
FROM categories c
WHERE c.categotyName = @criteria
) AS matches
SELECT *
FROM services s INNER JOIN #resultTable ON ...
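Before a query like this runs, the Web API has to split the input into its parts. Here is a minimal sketch of a parser for the "[category], [suburb] [postcode]" format from the question. `SearchTerms` and `SearchInputParser` are hypothetical names, and the sketch assumes the postcode is the last space-separated token after the comma, so that multi-word suburbs still work.

```csharp
using System;

// Hypothetical container for the three parts of the search input.
public sealed record SearchTerms(string Category, string Suburb, string Postcode);

public static class SearchInputParser
{
    // Parses "[category], [suburb] [postcode]"; returns false when the
    // input does not follow that format.
    public static bool TryParse(string input, out SearchTerms terms)
    {
        terms = null;
        if (string.IsNullOrWhiteSpace(input)) return false;

        // Everything before the first comma is the category.
        int comma = input.IndexOf(',');
        if (comma < 0) return false;

        string category = input.Substring(0, comma).Trim();
        string rest = input.Substring(comma + 1).Trim();

        // Assumption: the postcode is the last space-separated token.
        int lastSpace = rest.LastIndexOf(' ');
        if (category.Length == 0 || lastSpace < 0) return false;

        string suburb = rest.Substring(0, lastSpace).Trim();
        string postcode = rest.Substring(lastSpace + 1);

        terms = new SearchTerms(category, suburb, postcode);
        return true;
    }
}
```

Each parsed part can then be passed as its own SQL parameter instead of one shared criteria value, which avoids matching a category name against the suburb column.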

Using the SQL "IsNull()" command in NHibernate criteria in C#

I need to be able to use with NHibernate criteria, the SQL's IsNull() function in C#.NET. I don't need to use it with LINQ.
Meaning that Table1 has the following columns:
Name | Description
Table2 has the following columns:
OriginalDescription | TranslatedDescription
And Table1.Description = Table2.OriginalDescription.
How would I write the following SQL statement with NHibernate criteria:
SELECT Table1.Model, IsNull(Table2.TranslatedDescription, Table1.Description)
FROM Table1
LEFT JOIN Table2 ON Table2.OriginalDescription = Table1.Description
The SQL statement above will give me the Names, and TranslatedDescriptions if the TranslatedDescriptions exist, otherwise it will return the Descriptions, for the records.
There cannot be duplicates of OriginalDescription in the Table2.
The solution of the ISNULL could be expressed like this:
// here is the criteria of the "Entity1" and the join to the "Entity2"
var criteria = session.CreateCriteria("Entity1", "table1");
criteria.CreateAlias("Entity2", "table2");
// here we drive the SELECT clause
criteria.SetProjection(
Projections.ProjectionList()
.Add(Projections.Property("Model"))
.Add(Projections.SqlFunction("COALESCE", NHibernateUtil.String
, Projections.Property("table2.TranslatedDescription")
, Projections.Property("table1.Description")
))
);
// just a list of object arrays
var list = criteria.List<object[]>();
So what we do here is call SqlFunction. In this case it is one of the functions mapped out of the box in many of the Dialects that ship with NHibernate (but we can even extend the dialect with custom ones; for an example, see: Nhibernate count distinct (based on multiple columns))
Note that the JOIN clause comes from the mapping. So this Table2.OriginalDescription = Table1.Description must come from a mapped many-to-one relation

OrderBy a Many To Many relationship with Entity Sql

I'm trying to better utilize the resources of the Entity Sql in the following scenario: I have a table Book which has a Many-To-Many relationship with the Author table. Each book may have from 0 to N authors. I would like to sort the books by the first author name, ie the first record found in this relationship (or null when no authors are linked to a book).
With T-SQL it can be done without difficulty:
SELECT
b.*
FROM
Book AS b
JOIN BookAuthor AS ba ON b.BookId = ba.BookId
JOIN Author AS a ON ba.AuthorId = a.AuthorId
ORDER BY
a.AuthorName;
But I cannot think of how to adapt my code below to achieve it. Indeed, I don't know how to write something equivalent directly with Entity Sql either.
Entities e = new Entities();
var books = e.Books;
var query = books.Include("Authors");
if (sorting == null)
query = query.OrderBy("it.Title asc");
else
query = query.OrderBy("it.Authors.Name asc"); // This isn't it.
return query.Skip(paging.Skip).Take(paging.Take).ToList();
Could someone explain to me how to modify my code to generate the Entity Sql for the desired result? Or even explain how to write a query by hand using CreateQuery<Book>() to achieve it?
EDIT
Just to elucidate, I'll be working with a very large collection of books (around 100k). Sorting them in memory would have a big impact on performance. I would like the answers to focus on how to generate the desired ordering using Entity Sql, so that the ordering happens in the database.
The OrderBy method expects you to give it a lambda expression (well, actually a Func delegate, but most people would use lambdas to make them) that can be run to select the field to sort by. Also, OrderBy always orders ascending; if you want descending order there is an OrderByDescending method.
var query = books
.Include("Authors")
.OrderBy(book => book.Authors.Any()
? book.Authors.FirstOrDefault().Name
: string.Empty);
This is basically telling the OrderBy method: "for each book in the sequence, if there are any authors, select the first one's name as my sort key; otherwise, select the empty string. Then return me the books sorted by the sort key."
You could put anything in place of the string.Empty, including for example book.Title or any other property of the book to use in place of the last name for sorting.
EDIT from comments:
As long as the sorting behavior you ask for isn't too complex, the Entity Framework's query provider can usually figure out how to turn it into SQL. It will try really, really hard to do that, and if it can't you'll get a query error. The only time the sorting would be done in client-side objects is if you forced the query to run (e.g. .AsEnumerable()) before the OrderBy was called.
In this case, the EF outputs a select statement that includes the following calculated field:
CASE WHEN ( EXISTS (SELECT
1 AS [C1]
FROM [dbo].[BookAuthor] AS [Extent4]
WHERE [Extent1].[Id] = [Extent4].[Books_Id]
)) THEN [Limit1].[Name] ELSE @p__linq__0 END AS [C1],
Then orders by that.
@p__linq__0 is a parameter, passed in as string.Empty, so you can see it converted the lambda expression into SQL pretty directly. Extent and Limit are just aliases used in the generated SQL for the joined tables etc. Extent1 is [Books] and Limit1 is:
SELECT TOP (1) -- Field list goes here.
FROM [dbo].[BookAuthor] AS [Extent2]
INNER JOIN [dbo].[Authors] AS [Extent3] ON [Extent3].[Id] = [Extent2].[Authors_Id]
WHERE [Extent1].[Id] = [Extent2].[Books_Id]
If you don't care where the sorting is happening (i.e. SQL vs In Code), you can retrieve your result set, and sort it using your own sorting code after the query results have been returned. In my experience, getting specialized sorting like this to work with Entity Framework can be very difficult, frustrating and time consuming.
