Scala collection-like SQL support as in LINQ

Scala collection-like SQL support as in LINQ - c#

As far as I understand the only thing LINQ supports, which Scala currently doesn't with its collection library, is the integration with a SQL Database.
As far as I understand LINQ can "accumulate" various operations and can give "the whole" statement to the database when queried to process it there, preventing that a simple SELECT first copies the whole table into data structures of the VM.
If I'm wrong, I would be happy to be corrected.
If not, what is necessary to support the same in Scala?
Wouldn't it possible to write a library which implements the collection interface, but doesn't have any data structures backing it but a String which gets assembled with following collection into the required Database statement?
Or am I completely wrong with my observations?

As the author of ScalaQuery, I don't have much to add to Stilgar's explanation. The part of LINQ which is missing in Scala is indeed the expression trees. That is the reason why ScalaQuery performs all its computations on Column and Table types instead of the basic types of those entities.
You declare a table as a Table object with a projection (tuple) of its columns, e.g.:
class User extends Table[(Int, String)] {
def id = column[Int]("id", O.PrimaryKey, O.AutoInc)
def name = column[String]("name")
def * = id ~ name
}
User.id and User.name are now of type Column[Int] and Column[String] respectively. All computations are performed in the Query monad (which is a more natural representation of database queries than the SQL statements that have to be created from it). Take the following query:
val q = for(u <- User if u.id < 5) yield u.name
After some implicit conversions and desugaring this translates to:
val q:Query[String] =
Query[User.type](User).filter(u => u.id < ConstColumn[Int](5)).map(u => u.name)
The filter and map methods do not have to inspect their arguments as expression trees in order to build the query, they just run them. As you can see from the types, what looks superficially like "u.id:Int < 5:Int" is actually "u.id:Column[Int] < u.id:Column[Int]". Running this expression results in a query AST like Operator.Relational("<", NamedColumn("user", "id"), ConstColumn(5)). Similarly, the "filter" and "map" methods of the Query monad do not actually perform filtering and mapping but instead build up an AST that describes these operations.
The QueryBuilder then uses this AST to construct the actual SQL statement for the database (with a DBMS-specific syntax).
An alternative approach has been taken by ScalaQL which uses a compiler plugin to work directly with expression trees, ensure that they only contain the language subset which is allowed in database queries, and construct the queries statically.

I should mention that Scala does have experimental support for expression trees. If you pass an anonymous function as an argument to a method expecting a parameter of type scala.reflect.Code[A], you get an AST.
scala> import scala.reflect.Code
import scala.reflect.Code
scala> def codeOf[A](code: Code[A]) = code
codeOf: [A](code:scala.reflect.Code[A])scala.reflect.Code[A]
scala> codeOf((x: Int) => x * x).tree
res8: scala.reflect.Tree=Function(List(LocalValue(NoSymbol,x,PrefixedType(ThisType(Class(scala)),Class(scala.Int)))),Apply(Select(Ident(LocalValue(NoSymbol,x,PrefixedType(ThisType(Class(scala)),Class(scala.Int)))),Method(scala.Int.$times,MethodType(List(LocalValue(NoSymbol,x$1,PrefixedType(ThisType(Class(scala)),Class(scala.Int)))),PrefixedType(ThisType(Class(scala)),Class(scala.Int))))),List(Ident(LocalValue(NoSymbol,x,PrefixedType(ThisType(Class(scala)),Class(scala.Int)))))))
This has been used in the bytecode generation library 'Mnemonics', which was presented by its author Johannes Rudolph at Scala Days 2010.

With LINQ the compiler checks to see if the lambda expression is compiled to IEnumerable or to IQueryable. The first works like Scala collections. The second compiles the expression to an expression tree (i.e. data structure). The power of LINQ is that the compiler itself can translate the lambdas to expression trees. You can write a library that builds expression trees with interface similar to what you have for collection but how are you goning to make the compiler build data structures (instead of JVM code) from lambdas?
That being said I am not sure what Scala provides in this respect. Maybe it is possible to build data structures out of lambdas in Scala but in any case I believe you need a similar feature in the compiler to build support for databases. Mind you that databases are not the only underlying data source that you can build providers for. There are numerous LINQ providers to stuff like Active Directory or the Ebay API for example.
Edit:
Why there cannot be just an API?
In order to make queries you do not only use the API methods (filter, Where, etc...) but you also use lambda expressions as arguments of these methods .Where(x => x > 3) (C# LINQ). The compilers translate the lambdas to bytecode. The API needs to build data structures (expression trees) so that you can translate the data structure to the underlying data source. Basically you need the compiler to do this for you.
Disclaimer 1:
Maybe (just maybe) there is some way to create proxy objects that execute the lambdas but overload the operators to produce data structures. This would result in slightly worse performance than the actual LINQ (runtime vs compile time). I am not sure if such a library is possible. Maybe the ScalaQuery library uses similar approach.
Disclaimer 2:
Maybe the Scala language actually can provide the lambdas as an inspectable objects so that you can retrieve the expression tree. This would make the lambda feature in Scala equivalent to the one in C#. Maybe the ScalaQuery library uses this hypothetical feature.
Edit 2:
I did a bit of digging. It seems like ScalaQuery uses the library approach and overloads a bunch of operators to produce the trees at runtime. I am not entirely sure about the details because I am not familiar with Scala terminology and have hard time reading the complex Scala code in the article:
http://szeiger.de/blog/2008/12/21/a-type-safe-database-query-dsl-for-scala/
Like every object which can be used in or returned from a query, a table is parameterized with the type of the values it represents. This is always a tuple of individual column types, in our case Integers and Strings (note the use of java.lang.Integer instead of Int; more about that later). In this respect SQuery (as I’ve named it for now) is closer to HaskellDB than to LINQ because Scala (like most languages) does not give you access to an expression’s AST at runtime. In LINQ you can write queries using the real types of the values and columns in your database and have the query expression’s AST translated to SQL at runtime. Without this option we have to use meta-objects like Table and Column to build our own AST from these.
Very cool library indeed. I hope in the future it gets the love it deserves and becomes real production ready tool.

You probably want something like http://scalaquery.org/. It does exactly what #Stilgar answer suggests, except it's only SQL.

check out http://squeryl.org/

Related

Is it possible to implement .Equals for a Guid data type with EF 6?

I don't want to use '==', I want to use .Equals, this will allow me to do some more complicated stuff with generics.
At bare minimum I want to be able to execute this query without getting a NotSupportedException. I'd like to implement .Equals to work with a Guid just like it works with primitive types such as int, bool, and string. I've done similar stuff with NHibernate for implementing custom methods to build queries, and am hoping to be able to do the same with EF.
var id = Guid.NewGuid();
_dbContext.MyEntityType.Single(x => x.Id.Equals(id));

Guid does implement an Equals method already. In this case it's EF that doesn't support transforming your specific usage into SQL, so unless you plan on writing your own query provider, you can't make it understand how to translate that code. All you can do not write expressions that it doesn't know how to turn into SQL, which in this case means writing == in your expression (if that's what EF knows how to translate) rather than Equals.

What am I using? (Linq/sql) [duplicate]

This question already has answers here:
Linq query or Lambda expression?
(5 answers)
Closed 7 years ago.
My project currently uses for following syntax for "linq"
var le = domainModel.EntityDataContext.Table_name_here.Query()
.WithFullReadCheck(domainModel.Security)
.Where(x => x.Metadata.AdvisorID == advisorId).FirstOrDefault();
It is said the above code is linq, being unaware about linq I decided to learn it. However, on https://msdn.microsoft.com/en-us/library/gg509017.aspx , things are pretty different.
what is being used in my code? Is it a version of linq? Is it something else?

What you are using is LINQ. There are 2 different notations you can use for writing LINQ - Lambda Syntax and Query Syntax
The difference is explained here: LINQ: Dot Notation vs Query Expression
There is more information on this on MSDN here: https://msdn.microsoft.com/en-us/library/bb308959.aspx
The MSDN articles starts explaining LINQ with SQL-like syntax ("query syntax") and then goes on to explain Lambda Expressions and Expression Trees ("lamba syntax").

That is extension method syntax for a query. The .Query().WithFullReadCheck() are much less common, but the rest .Where and .FirstOrDefault are pretty common query extensions. LINQ is a special syntax for expressing the same thing that can be done with chaining the query extension methods. Behind the scenes, they are identical.
See this question which gives an example of both LINQ and extension method syntax and has a good discussion of the differences between them: Extension methods syntax vs query syntax

Linq (language integrated query) is the language you use to query the data. You don't care where do they come from, that's the whole point of using it. You can use the same code to query arrays, collections, databases, xml files, directories, whatever... Linq translates your code in the background, so if you use it for instance to get the data from a database, it produces SQL to be sent to the database.
There are two syntax versions available:
1.) lambda syntax
2.) chainable methods
It doesn't matter which one you chose, just try to be consistent with it and use what makes you more comfortable / makes more sense in your situation.

LINQ is 'Language INtegrated Query'. In practical terms it means two things to most people:
The way that inline delegates are converted by the compiler into expression trees, not anonymous methods. That means that x => x.Metadata.AdvisorID == advisorId is not compiled into IL at compile time: instead it's compiled into code which builds an equivalent Expression object, which can be passed to a provider such as Entity Framework or LINQ to SQL, in order to generate a database query.
The other part it means to most people is the so-called "syntactic sugar", which effectively calls .Where(), .OrderBy() etc on your behalf:
from a in something
where a.Name == "Bob"
select a;
In both cases, the Queryable class in .NET provides extension methods for building a chain of Expression objects, for example .Where(), .OrderBy(), .Select() etc - these take an IQueryable (either untyped or with a generic parameter) and return another. They lightly wrap an Expression object at each point which represents the query as a whole.
Meaning:
someQueryable.Where(x => x.Id > 3).OrderBy(x => x.Date).Select(x => x.Ref)
Returns an object implementing IQueryable which holds an Expression which looks like:
Select(OrderBy(Where(someQueryable, x => x.Id > 3), x => x.Date), x => x.Ref)
... which is readable by LINQ providers to produce something like:
SELECT Ref FROM someTable WHERE Id > 3 ORDER BY Date
Finally, note that .Where(), .OrderBy() and the like are NOT limited to queryable/LINQ objects. They exist for IEnumerable (and IEnumerable<>), too - but this is not LINQ, and simply performs the operations at the instant that the method is called.

remove a condition from ef expression tree

I'm using entity framework to handle the database in a MVC application: I created a search engine for the application using several fields from start form, and setting privileges that comes from the user role, so the routine that create the select statement is growing, so sometimes I have this situation.
before i make a selection:
var orders = db.Orders.Where( ord => ord.Channel == 1 || ord.Channel == 2);
after in the source code, in some cases, type of users, filter combinations, and so on, I have to change this filter or remove this condition
orders = db.Orders.Where( ord => ord.Channel == 1 );
this is just a simple example, because doing this way i loose the other filters I have done in the Linq expression tree, and cause i have request to add features very often, it's very difficult to reorganize all the code to test all the prerequisite before to add conditions to the expression tree so i would like to know if is possible to remove a statement in the linq expression tree after has been added and before the tree is parsed and converted in a sql query
luca

Assuming I understand your question correctly, I believe the simplest possible approach is to defer the creation of the query until you have collected all the data needed to know what the query should look like, i.e. in this case it would mean you build the query only after you already know whether the condition actually needs to be applied.
If it is not possible for you to defer the creation of the query (it seems to me that is what you are saying), but you can still anticipate which conditions may need to be removed at a later point, I would consider introducing a Boolean sentinel in the query, e.g. based on your query snippet:
var orders = db.Orders.Where( ord =>
(isFirstConditionRelevant && ord.Channel == 1)
|| (isSecondConditionRelvant && ord.Channel == 2));
The variables 'isFirstConditionRelevant' and 'isSecondConditionRelevant' would be initially be set to true and would be captured by the query expression, but you could set to false at a later point (e.g. just before executing the query) if you needed the corresponding condition to have no effect. Notice that with this approach the SQL translation of the query will still contain the condition, but it will also contain a parameter for each sentinel that will short circuit the Boolean logic when the query is executed by the server.
Note also that any constants in your conditions (e.g. the '1' in 'ord.Channel == 1') will be translated to constant in the SQL. I would suggest using variables as this will introduce parameters in the SQL query that could increase the chances that the query plan can be reused on the server.
Another approach that could be useful is to take advantage of support for references to expression variables in the query, e.g. you could pass a variable of type Expression> to the Where clause and at a later point replace the value of the variable to the right predicate. If I remember correctly, LINQ to Entities supports using expression variables as predicate conditions over nested collections by applying the AsQueryable operator to the collection in the query. Otherwise you can take advantage of the support for AsExpandable in provided by the LINQ kit. You will find more information about this, as well as examples of using references to expression variables in the LINQ Kit home page: http://www.albahari.com/nutshell/linqkit.aspx.
The fourth and most complicated approach I would consider is to use my own visitor (e.g. derived from System.Linq.Expressions.ExpressionVisitor) to rewrite the LINQ expression tree (in this case to remove the condition from the predicate) before the query is executed. I don't have an expression visitor that does exactly that handy. Instead, I can provide a few pointers to articles that solve different parts of the problem:
This thread in StackOverflow describes how to use an expression visitor to do some query rewriting: Using a LINQ ExpressionVisitor to replace primitive parameters with property references in a lambda expression.
This great blog post by Alex James describes how to write an intercepting query provider that will call your expression rewriting visitor just be fore the query is executed: http://blogs.msdn.com/b/alexj/archive/2010/03/01/tip-55-how-to-extend-an-iqueryable-by-wrapping-it.aspx.
The following article describes a library that can be used to rewrite expressions based on rules: http://www.codeproject.com/Articles/24454/Modifying-LINQ-Expressions-with-Rewrite-Rules.
Hope this helps!

Error, method not supported by LINQ to Entities

Why I am getting this error:
The method 'Single' is not supported by LINQ to Entities. Consider using the method 'First' instead.
public ActionResult Details(int id)
Line 27: {
var result = (from d in _db.MovieSet
Line 29: where d.Id == id
Line 30: select d).Single();
//
//
}
Code compiles safe, but only breaks if call is made to the respective section. I am new to LINQ, therefore do not know which methods are for LINQtoSQL or LINQtoEntities. This means more errors! We cannot remember all methods this way.
My question is, if there are limitations to the methods applicable to certain types / scenarios, why do they appear in Intellisense?
EDIT: Any work-around / technique helpful to have an idea if one is supported ?

Microsoft has a complete list of supported and unsupported methods in Linq to Entities. That's where to go to find out this information.
You'll notice that the Single and SingleOrDefault methods are in fact listed as "not supported" in the section on Paging Methods.
As Jared pointed out, the compiler does not know at compile time which provider you are using, so it has no way to enforce compile-time safety of extension methods that the provider may or may not implement. You'll have to rely on the documentation instead.

In the case of LINQtoSQL / Entities, the queries are all broken down into expression trees which are then passed to the provider APIs. The providers cannot provide compile time information about the trees they do or do not support because there is no syntactic difference. The only choice is for them to provide runtime data.
For example once in expression tree form, both the Single and First appear as a MethodCallExpression instance.

Unfortunately, it's yet another indication of both the relative immaturity of EF and the Object Relational Impedance Mismatch.
Documentation is your friend if you choose to go this route.

Does Entity Framework/LINQ to SQL Data Binding use reflection?

Forgive me if this has been asked before; I did a search but couldn't find anything that specifically answered this question, but I'd be happy to be referred.
I'm taking a peek at both the Entity Framework and LINQ to SQL, and while I like the systems (and of course the LINQ integration) I'm a little skeptical on the data-binding aspect. I've taken query results and inspected them, and they don't appear to implement the standard .NET list-binding interfaces (IBindingList, and more importantly ITypedList), leading me to believe that binding them to a grid (or anything else) is going to use reflection to get/set my entity properties. This seems like a comparatively expensive operation, especially when all of the code is generated anyway and could implement the interfaces.
Does anyone have any insight into this? Is reflection used to get/set the value of the properties? If yes, does this have a performance impact that's noticeable, and does anyone have any idea why they went this route?
Edit
I'm actually concentrating on whether or not ITypedList is implemented somewhere along the way, as that's what has the capability to provide an explicit mechanism for defining and interacting with properties without resorting to reflection. While I didn't have a LINQ to SQL project up, I did inspect a simple Entity Framework entity query result, and it did not appear to implement ITypedList. Does anyone know if I'm just looking at something incorrectly, or if not why this is the case?
Edit 2
After accepting Marc's answer, I thought it might be helpful for others if I posted some simple code I used to seamlessly implement his solution. I put the following in the static constructor for my class:
public static ContextName()
{
foreach(Type type in System.Reflection.Assembly.GetExecutingAssembly()
.GetTypes())
{
if (type.GetCustomAttributes(typeof(System.Data.Linq.Mapping
.TableAttribute), true) != null)
{
System.ComponentModel.TypeDescriptor.AddProvider(
new Hyper.ComponentModel.HyperTypeDescriptionProvider(
System.ComponentModel.TypeDescriptor.GetProvider(
type)),
type);
}
}
}
While this is for LINQ to SQL, I'm sure an equivalent version could be written for the Entity Framework. This will scan the assembly for any types with the Table attribute and add a custom provider for each of them.

The Expression API that underpins LINQ etc is founded on reflection (MemberInfo), not the component-model (PropertyDescriptor etc) - so there is not a huge requirement for ITypedList. Instead, the contents are inferred from the T in IQueryable<T>, IEnumerable<T> and IList<T> etc.
The closest you might get is IListSource, but that will still just be a shallow wrapper around a proper typed list.
If performance of runtime binding (to PropertyDescriptor) is key, you might want to look at HyperDescriptor - which uses Reflection.Emit and TypeDescriptionProvider to wrap the component-model.
Re the "why" etc; note that in almost all cases with LINQ-to-SQL and EF the Expression (or the "create and set members" part) is going to be compiled to a delegate before it is invoked - so at runtime there is no huge reflection cost. And likewise, with LINQ-to-Objects everything is already compiled (by the C# compiler).

In my experience with LINQ to SQL, I have concluded for a few reasons that LINQ is using reflection to set and get field values in entity class instances. I don't know if I can remember all the reasons, but here's what I can tell you.
If I recall correctly, LINQ does not call any property procedures, but directly sets and reads all the private field values directly, which can only be done via reflection, as far as I know.
The names provided by the MetaData (in the attributes on the entity class properties) provide field name information in string form (if the database column and property name are different, for example). You can conclude from this that LINQ must be using reflection to look up the member to access it.
I think it does this to enforce simplicity -- you can rely on the values in the fields in the entity class directly mirroring the database column values, and there's not much overhead in retrieving the object from the database (populating a newly created entity to correspond to values retrieved from the database should not be routed through property procedures). One thing I have done to represent enumerated values retrieved from the database, for example, is to make some of the LINQ-generated properties private and wrap them in a manually coded property in a partial class that reads and writes the LINQ property.

From looking at what MSDN has to say on LINQ to SQL data binding, it does seem to use IListSource or IBindingList.
Btw, LINQ uses Expression Trees, and that is not reflection as much as meta-programming. Performance should be much better than reflection. Charlie Calvert covers this a bit in this article.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.