How can I implement Query Interception in a LINQ to Entities query? (C#)

I'm trying to implement encrypted columns in EF4, using the CTP5 features to allow simple use of POCOs to query the database. Sorry that this is a lot of words, but I hope the below gives enough to explain the need and the problem!
So, bit of background, and my progress so far:
The intention is that if you query the tables without using our DAL then the data is rubbish, but I don't want the developers to worry about if/when/how the data is encrypted.
For simplicity, at this stage I'm working on the assumption any string column will be encrypted.
Now, I have successfully implemented this for returning the data using the ObjectMaterialized event, and for data commits using the SavingChanges event.
So given the following class:
public class Thing
{
public int ID { get; set; }
[Required]
public string Name { get; set; }
public DateTime Date { get; set; }
public string OtherString { get; set; }
}
The below query returns all the required values and the POCO materialized has clear data in it.
var things = from t in myDbContext.Things
select t;
where myDbContext.Things is a DbSet<Thing>
Likewise, passing an instance of Thing to Things.Add()
(with clear string data in the Name and/or OtherString values)
and then calling myDbContext.SaveChanges() encrypts the strings before it gets to the data store.
Now, the problem I have is in this query:
var things = from t in myDbContext.Things
where t.Name == "Hairbrush"
select t;
This results in the unencrypted value being compared to the encrypted value in the DB. Obviously I don't want to get all the records from the database, materialize them, and then filter the results based on any supplied Where clause... so what I need to do is: intercept that query and rewrite it by encrypting any strings in the Where clause.
So I've looked at:
writing a query provider, but that doesn't seem like the right solution... (is it?)
writing my own IQueryable wrapper for the DbSet which will capture the expression, run over it using an expression tree visitor and then forward the new expression to the DbSet...
Attempts at both have left me somewhat lost! I prefer the second solution, I think, since it feels a bit neater and is probably clearer to other developers in future. But I'm happy to go with either, or another better option!
The main thing I am struggling with is when/how the LINQ expression is applied to the object... I think I've got myself a bit confused as to where the expression executes in the IQueryable object, so I'm not sure which method I need to implement in my wrapper to grab and manipulate the expression being passed in...
I'm sure I'm missing something fairly obvious here and I'm waiting for that light bulb moment... but it's not coming!
Any help will be very gratefully received!

Thought I'd let you know what my final solution was.
In the end I have gone with a wrapper class which implements a Where method, but without going to the extent of implementing IQueryable entirely. LINQ will still execute against the class (at least to the extent that I want/need it to) and will call the Where method with the expression from the LINQ query.
I then traverse this expression tree and replace my strings with encrypted values before forwarding the new expression tree to the internal DbSet and returning the result.
It's pretty crude and has its limitations, but it works for our particular circumstance without problem.
Thanks,
Ben
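
The wrapper described above might look roughly like the following. This is only a sketch under stated assumptions, not the original poster's code: the names StringEncryptingVisitor and EncryptedSet are hypothetical, the "encryption" is a string reversal used purely as a stand-in, and an in-memory list plays the role of the DbSet. It also assumes the real encryption is deterministic, since an equality comparison in the database can only match if the same clear text always produces the same cipher text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Thing
{
    public string Name { get; set; }
}

// Hypothetical visitor: replaces every string literal in a predicate
// with its encrypted form.
public sealed class StringEncryptingVisitor : ExpressionVisitor
{
    private readonly Func<string, string> _encrypt;

    public StringEncryptingVisitor(Func<string, string> encrypt)
    {
        _encrypt = encrypt;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // Only literals are handled here. Captured local variables appear
        // as MemberExpression nodes and would need additional handling.
        if (node.Type == typeof(string) && node.Value != null)
        {
            return Expression.Constant(_encrypt((string)node.Value), typeof(string));
        }
        return base.VisitConstant(node);
    }
}

// Hypothetical wrapper that exposes only Where, as described above.
public sealed class EncryptedSet<T>
{
    private readonly IQueryable<T> _inner;
    private readonly Func<string, string> _encrypt;

    public EncryptedSet(IQueryable<T> inner, Func<string, string> encrypt)
    {
        _inner = inner;
        _encrypt = encrypt;
    }

    public IQueryable<T> Where(Expression<Func<T, bool>> predicate)
    {
        // Rewrite the tree, then forward it to the wrapped queryable.
        var rewritten = (Expression<Func<T, bool>>)
            new StringEncryptingVisitor(_encrypt).Visit(predicate);
        return _inner.Where(rewritten);
    }
}

public static class Demo
{
    public static void Main()
    {
        // Stand-in for real encryption: reverse the string.
        Func<string, string> fakeEncrypt = s => new string(s.Reverse().ToArray());

        // Stand-in for the DbSet: the "stored" names are already encrypted.
        var stored = new List<Thing>
        {
            new Thing { Name = fakeEncrypt("Hairbrush") },
            new Thing { Name = fakeEncrypt("Comb") }
        }.AsQueryable();

        var things = new EncryptedSet<Thing>(stored, fakeEncrypt);
        var matches = things.Where(t => t.Name == "Hairbrush").ToList();
        Console.WriteLine(matches.Count); // 1
    }
}
```

Because the C# compiler binds query syntax to whatever Where method it finds on the type, a "from t in things where ... select t" query should also reach this method without the class implementing IQueryable.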

You should use the QueryInterceptor attribute; search here on SO or on Google and you will find examples of how to use it.
A snippet:
[QueryInterceptor("Orders")]
public Expression<Func<Order, bool>> FilterOrders()
{
return o => o.Customer.Name == /* Current principal name. */;
}
// Ensures that the user accessing the customer(s) has the appropriate
// rights as defined in the QueryRules object to access the customer
// resource(s).
[QueryInterceptor ("Customers")]
public Expression<Func<Customer, bool>> FilterCustomers()
{
return c => c.Name == /* Current principal name. */ &&
this.CurrentDataSource.QueryRules.Contains(
rule => rule.Name == c.Name &&
rule.CustomerAllowedToQuery == true
);
}

You can use David Fowler's Query Interceptor:
https://github.com/davidfowl/QueryInterceptor
One example of its use:
IQueryable q = ...;
IQueryable modified = q.InterceptWith(new MyInterceptor());
And in the MyInterceptor class, which derives from ExpressionVisitor:
class MyInterceptor : ExpressionVisitor {
protected override Expression VisitBinary(BinaryExpression node) {
if (node.NodeType == ExpressionType.Equal) {
// Change == to !=
return Expression.NotEqual(node.Left, node.Right);
}
return base.VisitBinary(node);
}
}

Related

Translate/Separate IQueryable expressions?

consider the following scenario:
public class DBEntry {
public string Id;
}
public class ComputedEntry {
public string Id;
public int ComputedIndex;
}
IQueryable<DBEntry> databaseQueryable; // Somewhere hidden behind the API
IQueryable<ComputedEntry> entryQueryable; // Usable with the API
Let's assume each DBEntry has a unique Id and not much else. A ComputedEntry has a 1:n relationship with DBEntry, meaning that a DBEntry can be expanded into more than a single ComputedEntry during execution of the query.
Now, I am trying to query entryQueryable to get a range of computed indices, e.g:
entryQueryable.Where(dto => dto.ComputedIndex < 10 && dto.Id == "some-id");
What I'm looking for is a way of separating the given query expression so that only the relevant parts of the query are pushed down to the databaseQueryable. In the example above something like this should happen (probably in the implementation of IQueryProvider.Execute when using the entryQueryable):
var results = databaseQueryable.Where(e => e.Id == "some-id").ToList();
int i = 0;
return results.Select(e => new ComputedEntry { Id = e.Id, ComputedIndex = i++ });
So basically I'd like the query to be separated and the relevant/compatible parts should be pushed down to the databaseQueryable.
The obvious question would be: How should I approach this? I tried to figure out a way of separating the expression with an ExpressionVisitor, but haven't been very successful here and it seems like this is a rather complex task.
Any ideas? Maybe there is an already existing method of optimizing/translating the query I am not aware of? I have looked through the documentation but couldn't find anything useful here.
Many thanks for your suggestions!
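
One way the separation described in the question could be approached is to split the predicate on its top-level && operators and keep only the conjuncts that touch members the database type also has. The sketch below is an assumption-laden illustration, not an existing API: PredicateSplitter and its helpers are hypothetical names, only the Id member is treated as pushable, and conjuncts that use the parameter itself (rather than one of its members) are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class DBEntry { public string Id; }
public class ComputedEntry { public string Id; public int ComputedIndex; }

public static class PredicateSplitter
{
    // Flattens "a && b && c" into the sequence a, b, c.
    public static IEnumerable<Expression> Conjuncts(Expression e)
    {
        var b = e as BinaryExpression;
        if (b != null && b.NodeType == ExpressionType.AndAlso)
        {
            return Conjuncts(b.Left).Concat(Conjuncts(b.Right));
        }
        return new[] { e };
    }

    // Collects the names of members read directly from the lambda parameter.
    private sealed class MemberCollector : ExpressionVisitor
    {
        public readonly HashSet<string> Names = new HashSet<string>();

        protected override Expression VisitMember(MemberExpression node)
        {
            if (node.Expression is ParameterExpression) Names.Add(node.Member.Name);
            return base.VisitMember(node);
        }
    }

    // Rewrites "dto.Id" into "e.Id" on the DBEntry parameter.
    private sealed class Retargeter : ExpressionVisitor
    {
        private readonly ParameterExpression _target;

        public Retargeter(ParameterExpression target) { _target = target; }

        protected override Expression VisitMember(MemberExpression node)
        {
            if (node.Expression is ParameterExpression)
            {
                return Expression.PropertyOrField(_target, node.Member.Name);
            }
            return base.VisitMember(node);
        }
    }

    // Builds the part of the predicate that can run against DBEntry.
    public static Expression<Func<DBEntry, bool>> PushablePart(
        Expression<Func<ComputedEntry, bool>> predicate)
    {
        var e = Expression.Parameter(typeof(DBEntry), "e");
        Expression body = null;
        foreach (var conjunct in Conjuncts(predicate.Body))
        {
            var collector = new MemberCollector();
            collector.Visit(conjunct);
            // Assumption: only Id exists on both types.
            if (!collector.Names.All(n => n == "Id")) continue;

            var rewritten = new Retargeter(e).Visit(conjunct);
            body = body == null ? rewritten : Expression.AndAlso(body, rewritten);
        }
        return Expression.Lambda<Func<DBEntry, bool>>(body ?? Expression.Constant(true), e);
    }
}

public static class SplitDemo
{
    public static void Main()
    {
        var pushed = PredicateSplitter.PushablePart(
            dto => dto.ComputedIndex < 10 && dto.Id == "some-id");
        Console.WriteLine(pushed); // shows the DBEntry-only predicate
    }
}
```

The conjuncts that are skipped (here the ComputedIndex comparison) would still have to be applied in memory after the ComputedEntry objects have been produced.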

Getting all dates between two dates using datepickers and Entity Framework 6

I have two datetime pickers on my form. I want a function that will return all datetimes from a specific table (which are values of a specific column) between those two dates.
My method looks like this:
public DateTime[] GetAllArchiveDates(string username = null)
{
var result = new DateTime[0];
if (username != null)
{
result = this._context.archive.OrderBy(s => s.IssuingDate).Where(s => s.insertedBy == username).Select(s => s.issuing_date).Distinct().ToArray();
}
else
{
result = this._context.archive.OrderBy(s => s.IssuingDate).Select(s => s.issuing_date).Distinct().ToArray();
}
return result;
}
But I am getting this error:
System.NotSupportedException: 'The specified type member 'IssuingDate' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.'
How to do this?
The cause of your error message
You should be aware of the differences between IEnumerable and IQueryable.
An object of a class that implements IEnumerable holds everything to enumerate over the sequence of items it represents. You can ask for the first item of the sequence, and once you've got one, you can ask for the next item, until there are no more items.
On the other hand, an object of a class that implements IQueryable holds everything to ask another process to provide data to create an IEnumerable sequence. To do this, it holds an Expression and a Provider.
The Expression is a generic representation of what kind of IEnumerable must be created once you start enumerating the IQueryable.
The Provider knows who must execute the query, and it knows how to translate the Expression into a format that the executor understands, for instance SQL.
There are two kinds of LINQ statements. Those that use deferred execution, and those that don't. The deferred functions can be recognized, because they return IQueryable<TResult> (or IEnumerable). Examples are Where, Select, GroupBy, etc.
The non-deferred functions return a TResult: ToList, ToDictionary, FirstOrDefault, Max.
As long as you concatenate deferred LINQ functions, the query is not executed, only the Expression is changed. Once you start enumerating, either explicitly using GetEnumerator and MoveNext, or implicitly using foreach, ToList, Max, etc, the Expression is sent to the Provider who will translate it to SQL and execute the query. The result is represented as an IEnumerable, on which the GetEnumerator is performed.
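
A small illustration of the deferred execution described above, as a sketch that uses an in-memory list wrapped with AsQueryable instead of a real database (an assumption made so that the example runs on its own):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static void Main()
    {
        var source = new List<int> { 1, 2, 3 };

        // Only the Expression is built here; nothing is enumerated yet.
        IQueryable<int> query = source.AsQueryable().Where(n => n > 1);

        // Added after the query was defined, but before it is executed.
        source.Add(4);

        // ToList is not deferred: the query runs now and sees 2, 3 and 4.
        List<int> result = query.ToList();
        Console.WriteLine(result.Count); // 3
    }
}
```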
What has this to do with my question?
Because the Expression must be translated into SQL, it can't hold anything that you invented. After all, SQL doesn't know your functions. In fact, there are a lot of standard functions that can't be used in an IQueryable. See Supported and unsupported LINQ functions
Alas you forgot to give us the archive class definition, but I think that it is not a POCO: It contains functions and properties that do more than just get / set. I think that IssuingDate is not just get / set.
For IQueryables you should keep your classes simple: use only {get; set;} properties in your query, nothing more. Other functions can be called after you've materialized your IQueryable into an IEnumerable, which is then processed within your local process.
Back to your question
So you have a database with a table Archive with at least columns IssuingDate and InsertedBy. It seems that InsertedBy is just a string. It could be a foreign key to a table with users. This won't influence the answer very much.
Following the entity framework code first conventions this leads to the following classes
class Archive
{
public int Id {get; set;}
public DateTime IssuingDate {get; set;}
public string InsertedBy {get; set;}
...
}
public class MyDbContext : DbContext
{
public DbSet<Archive> Archives {get; set;}
}
By the way, is there a proper reason you deviate so often from Microsoft standards about naming identifiers, especially pluralization and camel casing?
Anyway, your requirement
I have two datetime pickers on my form. I want a function that will return all datetimes from a specific table (which are values of a specific column) between those two dates.
Your code seems to do a lot more, but let's first write an extension function that meets your requirement. I'll write it as an extension method of your archive class. This will keep your archive class simple (only {get; set;}), yet it adds functionality to the class. Writing it as an extension function also enables you to use these functions as if they were any other LINQ function. See Extension methods demystified
public static IQueryable<Archive> BetweenDates(this IQueryable<Archive> archives,
DateTime startDate,
DateTime endDate)
{
return archives.Where(archive => startDate <= archive.IssuingDate
&& archive.IssuingDate <= endDate);
}
If I look at your code, you don't do anything of selecting archives between dates. You do something with a userName, ordering, select distinct... It is a bit strange that you first Order all your million archives, and then decide to keep only the ten archives that belong to userName, and if you have several same issuing dates you decide to remove the duplicates. Wouldn't it be more efficient to first limit the number of issuing dates before you start ordering them?
public static IQueryable<DateTime> ToIssuingDatesOfUser(this IQueryable<Archive> archives,
string userName)
{
// first limit the number of archives, depending on userName,
// then select the IssuingDate, remove duplicates, and finally order
var archivesOfUser = (userName == null) ? archives :
archives.Where(archive => archive.InsertedBy == userName);
return archivesOfUser.Select(archive => archive.IssuingDate)
.Distinct()
.OrderBy(issuingDate => issuingDate);
}
Note: until now, I have only created IQueryables. So only the Expression is changed, which is fairly efficient. The database has not been contacted yet.
Example of usage:
Requirement: given a userName, a startDate and an endDate, give me the unique issuingDates of all archives that are issued by this user, in ascending order
public ICollection<DateTime> GetIssuingDatesOfUserBetweenDates(string userName,
DateTime startDate,
DateTime endDate)
{
using (var dbContext = new MyDbContext(...))
{
return dbContext.Archives
.BetweenDates(startDate, endDate)
.ToIssuingDatesOfUser(userName)
.ToList();
}
}

LINQ to Entity Any() with related Object Collection

First, let me say that I've researched this problem and read the following stack overflow articles, but none of them really address this situation.
How can I use Linq to join between objects and entities?
inner join in linq to entities
Situation
I have two classes
public class Section{
public string SchoolId{get;set;}
public string CourseId{get;set;}
public string SectionId{get;set;}
}
public class RelatedItem{
public string SchoolId{get;set;}
public string CourseId{get;set;}
public string SectionId{get;set;}
//..
}
I have an array of Section coming from one source and is an actual collection of Objects.
RelatedItem I'm getting via a LINQ to Entities call against a DbContext.
My goal is to get all of the RelatedItems based on the Sections I have from the other source.
I'm writing a query like this
Section[] mySections = GetSections(); //Third Party Source
IQueryable<RelatedItem> relatedItems = DbContext.RelatedItems
.Where(r=>
mySections.Any(s=> s.SchoolId == r.SchoolId &&
s.CourseId == r.CourseId &&
s.SectionId == r.SectionId)
);
Problem
At runtime, I receive the following error
Unable to create a constant value of type
'ProjectNamespace.Section'. Only primitive types or
enumeration types are supported in this context.
I found a workaround, shown below, but it doesn't take advantage of any of my table indexes.
var sectionIds = mySections.Select(s=>string.Concat(s.SchoolId, "|",s.CourseId, "|",s.SectionId));
IQueryable<RelatedItem> relatedItems = DbContext.RelatedItems
.Where(r=>
sectionIds.Contains(string.Concat(r.SchoolId, "|",r.CourseId, "|",r.SectionId))
);
This block of code works, and currently is pretty fast (but this is dev, and my record count is small). Aside from converting my related items to a collection in memory, does anyone have any other suggestions?
Try using Contains instead:
Section[] mySections = GetSections(); //Third Party Source
IQueryable<RelatedItem> relatedItems = DbContext.RelatedItems.Where(r=>
mySections.Select(s => s.SchoolId).Contains(r.SchoolId) &&
mySections.Select(s => s.CourseId).Contains(r.CourseId) &&
mySections.Select(s => s.SectionId).Contains(r.SectionId)
);
Contains should translate to WHERE IN clauses in SQL.
This won't work if using .NET 3.5 and LINQ to Entities, as it wasn't implemented in that version.
The proper way to solve this is to implement IEquatable. Here is an example of how to do it: Does LINQ to Entities support IEquatable in a 'where' clause predicate?
One tip when implementing Equals() and GetHashCode(): do not call any .NET methods (like GetType()); only compare the primitives SchoolId, CourseId and SectionId. It should then get converted to an expression tree and work just fine.

How to define anonymous method types to build dynamic queries with LINQ?

I'm busy with a LINQ to SQL project that basically creates multiple threads for each entity type in my database, which constantly queries information from the DB in a thread.
Here's a pseudo example:
streamer.DefineDataExpression<Contacts>(x => x.FirstName == "Bob");
while(true)
{
List<Contacts> MyContactsResult = streamer.ResultList;
// do whatever with MyContactsResult
}
The above code doesn't exist, but this is what I have so far for the 'streamer' class (it obviously doesn't work, but you can see what I'm trying to achieve above):
public void DefineExpression(System.Linq.Expressions.Expression<System.Func<T, bool>> expression)
{
using (var db = new LINQDataContext())
{
ResultList = db.GetTable<T>().Where(expression);
}
}
How do I go about creating a method like 'DefineExpression' that will allow me to query a LINQ type dynamically?
Why not use the Dynamic LINQ provider, as mentioned by Scott Guthrie? I think that would give you everything you are looking for, because you can define the query as a string. You can therefore build a string representation of your query more easily and execute it on the fly.
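
If a string-based query is not required, the method from the question can also stay strongly typed by storing the Expression and applying it each time the results are read. The following is a minimal sketch under assumptions: the Streamer class shape is hypothetical, and an in-memory list wrapped with AsQueryable stands in for db.GetTable<T>() so that the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Contact
{
    public string FirstName { get; set; }
}

// Hypothetical generic streamer: keeps the filter as an expression tree
// and re-runs the query whenever ResultList is read.
public sealed class Streamer<T>
{
    private readonly Func<IQueryable<T>> _source;
    private Expression<Func<T, bool>> _filter = item => true;

    public Streamer(Func<IQueryable<T>> source)
    {
        _source = source;
    }

    public void DefineExpression(Expression<Func<T, bool>> expression)
    {
        _filter = expression;
    }

    public List<T> ResultList
    {
        // ToList forces execution while the source is still available.
        get { return _source().Where(_filter).ToList(); }
    }
}

public static class StreamerDemo
{
    public static void Main()
    {
        var contacts = new List<Contact>
        {
            new Contact { FirstName = "Bob" },
            new Contact { FirstName = "Alice" }
        };

        // Stand-in for a LINQ to SQL table.
        var streamer = new Streamer<Contact>(() => contacts.AsQueryable());
        streamer.DefineExpression(x => x.FirstName == "Bob");

        Console.WriteLine(streamer.ResultList.Count); // 1
    }
}
```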

What's the point of a lambda expression?

After reading this article, I can't figure out why lambda expressions are ever used. To be fair, I don't think I have a proper understanding of what delegates and expression tree types are, but I don't understand why anyone would use a lambda expression instead of a declared function. Can someone enlighten me?
First: brevity and locality:
Which would you rather write, read and maintain? This:
var addresses = customers.Select(customer=>customer.Address);
or:
static private Address GetAddress(Customer customer)
{
return customer.Address;
}
... a thousand lines later ...
var addresses = customers.Select(GetAddress);
What's the point of cluttering up your program with hundreds or thousands of four-line functions when you could just put the code you need where you need it as a short expression?
Second: lambdas close over local scopes
Which would you rather read, write and maintain, this:
var currentCity = GetCurrentCity();
var addresses = customers.Where(c=>c.City == currentCity).Select(c=>c.Address);
or:
static private Address GetAddress(Customer customer)
{
return customer.Address;
}
private class CityGetter
{
public string currentCity;
public bool DoesCityMatch(Customer customer)
{
return customer.City == this.currentCity;
}
}
....
var currentCityGetter = new CityGetter();
currentCityGetter.currentCity = GetCurrentCity();
var addresses = customers.Where(currentCityGetter.DoesCityMatch).Select(GetAddress);
All that vexing code is written for you when you use a lambda.
Third: Query comprehensions are rewritten to lambdas for you
When you write:
var addresses = from customer in customers
where customer.City == currentCity
select customer.Address;
it is transformed into the lambda syntax for you. Many people find this syntax pleasant to read, but we need the lambda syntax in order to actually make it work.
Fourth: lambdas are optionally type-inferred
Notice that we don't have to give the type of "customer" in the query comprehension above, or in the lambda versions, but we do have to give the type of the formal parameter when declaring it as a static method. The compiler is smart about inferring the type of a lambda parameter from context. This makes your code less redundant and more clear.
Fifth: Lambdas can become expression trees
Suppose you want to ask a web server "send me the addresses of the customers that live in the current city." Do you want to (1) pull down a million customers from the web site and do the filtering on your client machine, or (2) send the web site an object that tells it "the query contains a filter on the current city and then a selection of the address"? Let the server do the work and send you only the results that match.
Expression trees allow the compiler to turn the lambda into code that can be transformed into another query format at runtime and sent to a server for processing. Little helper methods that run on the client do not.
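
To make the difference concrete, here is a short sketch showing the same lambda assigned once to a delegate and once to an expression tree. The variable names are made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    public static void Main()
    {
        // As a delegate: compiled code that can only be invoked.
        Func<int, bool> asDelegate = n => n > 5;
        Console.WriteLine(asDelegate(7)); // True

        // As an expression tree: data that can be inspected or translated.
        Expression<Func<int, bool>> asTree = n => n > 5;
        Console.WriteLine(asTree.Body.NodeType); // GreaterThan

        // A tree can still be turned into a delegate at runtime.
        Func<int, bool> compiled = asTree.Compile();
        Console.WriteLine(compiled(7)); // True
    }
}
```

A query provider walks a tree like asTree to produce its own query format, which is not possible with the already compiled asDelegate.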
The primary reason you'd use a lambda over a declared function is when you need to use a piece of local information in the delegate expression. For example
void Method(IEnumerable<Student> students, int age) {
var filtered = students.Where(s => s.Age == age);
...
}
Lambdas allow for the easy capture of local state to be used within the delegate expression. To do this manually requires a lot of work because you need to declare both a function and a containing type to hold the state. For example here's the above without a lambda
void Method(IEnumerable<Student> students, int age) {
var c = new Closure() { Age = age };
var filtered = students.Where(c.WhereDelegate);
...
}
class Closure {
public int Age;
public bool WhereDelegate(Student s) {
return s.Age == Age;
}
}
Typing this out is tedious and error prone. Lambda expressions automate this process.
Let's leave expression trees out of the equation for the moment and pretend that lambdas are just a shorter way to write delegates.
This is still a big win in the realm of statically typed languages like C# because such languages require lots of code to be written in order to achieve relatively simple goals. Do you need to sort an array of strings by string length? You need to write a method for that. And you need to write a class to put the method into. And then good practice dictates that this class should be in its own source file. In any but the smallest project, all of this adds up. When we're talking about small stuff, most people want a less verbose path to the goal and lambdas are about as terse as it can get.
Furthermore, lambdas can easily create closures (capture variables from the current scope and extend their lifetime). This isn't magic (the compiler does it by creating a hidden class and performing some other transformations that you can do yourself), but it's so much more convenient than the manual alternative.
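
A tiny sketch of the lifetime extension mentioned above (MakeCounter is a made-up name): the local variable count keeps living after the method has returned, because the lambda captured it.

```csharp
using System;

public static class ClosureDemo
{
    public static Func<int> MakeCounter()
    {
        int count = 0;          // local variable captured by the lambda
        return () => ++count;   // the closure keeps count alive
    }

    public static void Main()
    {
        Func<int> next = MakeCounter();
        Console.WriteLine(next()); // 1
        Console.WriteLine(next()); // 2
    }
}
```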
And then there are expression trees: a way for you to write code and have the compiler transform this code into a data structure that can be parsed, modified and even compiled at runtime. This is an extremely powerful feature that opens the door to impressive functionality (which I definitely consider LINQ to be). And you get it "for free".
http://msdn.microsoft.com/en-us/magazine/cc163362.aspx
Great article on what lambdas are, and why you can/should use them.
Essentially, the lambda expression provides a shorthand for the compiler to emit methods and assign them to delegates; this is all done for you. The benefit you get with a lambda expression that you don't get from a delegate/function combination is that the compiler performs automatic type inference on the lambda arguments.
They are heavily used with LINQ; actually, LINQ would be pretty bad without them. You can do stuff like:
Database.Table.Where(t => t.Field == "Hello");
They make it easy to pass a simple piece of functionality to another function. For example, I may want to perform an arbitrary, small function on every item in a list (perhaps I want to square it, or take the square root, or so on). Rather than writing a new loop and function for each of these situations, I can write it once, and apply my arbitrary functionality defined later to each item.
Lambda makes code short and sweet. Consider the following two examples:
using System;
using System.Collections.Generic;

public delegate bool isFailed(Student myStudent);

public class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
    public float grade { get; set; }

    // Prints a message for every student for whom the given test returns true.
    public static void failed(List<Student> studentList, isFailed fail)
    {
        foreach (Student student in studentList)
        {
            if (fail(student))
            {
                Console.WriteLine("Sorry " + student.Name + ", you failed this exam!");
            }
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        List<Student> studentsList = new List<Student>();
        studentsList.Add(new Student { ID = 101, Name = "Rita", grade = 99 });
        studentsList.Add(new Student { ID = 102, Name = "Mark", grade = 48 });
        Student.failed(studentsList, std => std.grade < 60); // with lambda
        Student.failed(studentsList, isFailedMethod);        // without lambda
    }

    private static bool isFailedMethod(Student myStudent) // without lambda
    {
        if (myStudent.grade < 60)
        {
            return true;
        }
        else
        {
            return false;
        }
    }
}
