Using Entity Framework and LINQ, how might I achieve this TSQL:
SELECT Children.ChildCount, Parent.*
FROM Parent
LEFT JOIN (SELECT ParentID, COUNT(ChildID) AS ChildCount FROM Child GROUP BY ParentID) AS Children
ON Parent.ID = Children.ParentID
Note that this is just a small part of what is already a larger LINQ query that includes other related entities so using a RawSQL query is not an option. Also the Parent table has around 20 columns and I'm hoping not to have to specify each individually to keep the code maintainable.
EDIT: To clarify a couple of things, the model for the output of the query (very simplified) looks something like this:
public class MyEntity
{
public int ID {get; set;}
public string Name {get; set;}
public int ChildCount {get; set;}
// many other properties here including related records
}
So what I'm trying to do is get the ChildCount included in the result of the query so it is included in the EF entity.
You can use a Select query to project the database rows onto your model, something like:
var entity = db.Parent
    .Select(x => new MyEntity
    {
        ID = x.ID,
        Name = x.Name,
        // x.Children is already scoped to this parent, so no extra filter is needed
        ChildCount = x.Children.Count()
    })
    .SingleOrDefault(x => x.ID == IDYouNeedToQuery);
What this should do is return you 1 instance of your MyEntity class with the Name, ID, and ChildCount properties filled in. Your SQL won't quite match what is generated but this should get you what you want. BTW you can also sub the SingleOrDefault line with a filter of another type, or no filter in which case the entity variable becomes a collection of MyEntity.
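To see the shape of the result without a database, here is a minimal in-memory sketch of the same idea using LINQ to Objects. The Parent and Child classes, the ChildCountSketch helper and any sample data are assumptions made up for illustration; against a real DbContext the projection would be translated to SQL by EF rather than run in memory.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the EF entities; not taken from the original model.
public class Parent { public int ID { get; set; } public string Name { get; set; } }
public class Child { public int ChildID { get; set; } public int ParentID { get; set; } }
public class MyEntity
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int ChildCount { get; set; }
}

public static class ChildCountSketch
{
    // GroupJoin plays the role of the LEFT JOIN onto the grouped subquery:
    // every parent is kept, and parents without children get a count of 0
    // (the T-SQL version would yield NULL for those rows instead).
    public static List<MyEntity> Project(IEnumerable<Parent> parents, IEnumerable<Child> children)
    {
        return parents
            .GroupJoin(children,
                p => p.ID,
                c => c.ParentID,
                (p, cs) => new MyEntity { ID = p.ID, Name = p.Name, ChildCount = cs.Count() })
            .ToList();
    }
}
```

Calling ChildCountSketch.Project with one parent that has two children and one that has none gives counts of 2 and 0 respectively.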
For further reading on this technique and how to use AutoMapper to make it super easy to set up, check out this post from Jon P Smith, who literally wrote the book on Entity Framework Core.
Hope this helps.
For anyone who comes across this at a later date, I ended up using AutoMapper as suggested by Nik P (I was already using AutoMapper anyway) to map from db entities to DTOs.
In effect, in my AutoMapper mapping I have:
CreateMap<MyEntity, MyEntityDTO>()
.ForMember(d => d.ChildCount, opt => opt.MapFrom(src => src.ChildEntity.Count()))
Then in my Service layer I call:
IEnumerable<MyEntityDTO> results = await dbContext.MyEntities.ProjectTo<MyEntityDTO>(mapper.ConfigurationProvider).ToListAsync();
I have the following code and would like to know if there is a way to refactor in order to remove duplicated logic.
This returns the current user with eager loading:
var currentEmployee = RosterContext.Employees
.Where(e => e.User.Id == id)
.Include(e => e.Job.Department).FirstOrDefault();
var job = RosterContext.Employees.Where(e=>e.Job.Department.Id == currentEmployee.Job.DepartmentId).ToList();
I then run a second query against the same set (Employees), using the result of the first one to return all employees who work in the same department. My question is: since both LINQ expressions query the same set, can I combine them into one?
It may become a long LINQ expression, but it should fetch the current user and then use that user's department ID to get all employees in the same department.
It makes sense to try an ORM framework, such as Entity Framework or NHibernate.
An ORM framework will model a database FK relationship as a scalar property on one side and a collection property on the other side of the relationship.
For instance Department would have a collection property Jobs, and a Job entity would have a scalar Department property.
DB queries with joins on FKs become simple navigation-property access. For example, to get the list of employees in the current department you would just return something like employee.Department.Employees, assuming your entities are all loaded (which is simple to achieve in EF using Include).
In Entity Framework you use the Include method to load children along with the parent. So for example in pure EF you could do:
var department = context.Department.Include(d => d.Jobs).First(d => d.DepartmentId == departmentId);
https://msdn.microsoft.com/en-us/library/gg671236%28v=vs.103%29.aspx#Anchor_1
With a repository, you may need to do something like this:
EF Including Other Entities (Generic Repository pattern)
EF Code First supports relationships out of the box. You can either use the conventions or explicitly specify the relationship (for example, if the foreign key property is named something weird). See here for example: https://msdn.microsoft.com/en-us/data/hh134698.aspx
When you've configured your models right, you should be able to access department like so:
var currentUser = _unitOfWork.Employee.GetEmployeeByID(loggedInUser.GetUser(user).Id);
var job = currentUser.Job;
var department = job.Department;
// or
var department = _unitOfWork.Employee.GetEmployeeByID(loggedInUser.GetUser(user).Id).Job.Department;
To show all employees that work in the same department:
var coworkers = department.Jobs.SelectMany(j => j.Employees);
Update
To use eager loading with a single repository class (you shouldn't need multiple repository classes in this instance, and therefore don't need to use Unit of Work):
public class EmployeeRepository {
    private readonly MyContext _context = new MyContext(); // Or whatever...

    public IList<Employee> GetCoworkers(int userId) {
        var currentEmployee = _context.Employees
            .Where(e => e.UserId == userId)
            .Include(e => e.Job.Department) // Use eager loading; this will fetch Job and Department rows for this user
            .FirstOrDefault();
        if (currentEmployee == null) {
            return new List<Employee>(); // No such user, so no coworkers
        }
        var department = currentEmployee.Job.Department;
        // Materialize so the result matches the IList<Employee> return type
        return department.Jobs.SelectMany(j => j.Employees).ToList();
    }
}
And call it like so...
var repo = new EmployeeRepository();
var coworkers = repo.GetCoworkers(loggedInUser.GetUser(user).Id);
You probably would be able to make the repository query more efficient by selecting the job and department of the current user (like I've done) and then the related jobs and employees when coming back the other way. I'll leave that up to you.
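One possible way to collapse the two lookups into a single query is to compose the "department of the current user" lookup as a subquery. The sketch below is only an illustration: the flattened Employee class (with DepartmentId directly on it instead of going through Job) and the CoworkerSketch helper are assumptions, and the in-memory AsQueryable() source stands in for a DbSet. With a real EF IQueryable, a composition like this is normally sent to the database as one statement.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened entity used only for this sketch.
public class Employee
{
    public int UserId { get; set; }
    public string Name { get; set; }
    public int DepartmentId { get; set; }
}

public static class CoworkerSketch
{
    public static List<Employee> GetCoworkers(IQueryable<Employee> employees, int userId)
    {
        // Not executed here: this is just a query definition for the
        // department(s) of the given user.
        var departmentIds = employees
            .Where(e => e.UserId == userId)
            .Select(e => e.DepartmentId);

        // The subquery is embedded in the outer query, so only this
        // ToList() call triggers execution.
        return employees
            .Where(e => departmentIds.Contains(e.DepartmentId))
            .ToList();
    }
}
```

Note that the result includes the current user as well; add a further Where(e => e.UserId != userId) if only the colleagues are wanted.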
In EF6, I have an entity Customer, with a navigation property to entity Address. The Address entity contains a property "City".
I can eager load the Address entity while getting all Customers like this:
_dbSet.Customers.Include(customer => customer.Address);
This gives me all the customers, with all the Address properties eager loaded.
Of course this works fine, but the only thing I need from the Address table is the field "City", and it does not feel right to fetch all the address properties from the persistent data store (SQL Server) when I don't need them.
I tried the following:
_dbSet.Customers.Include(customer => customer.Address.City);
...but this gives me a runtime exception:
An unhandled exception of type 'System.InvalidOperationException' occurred in mscorlib.dll
Additional information: A specified Include path is not valid. The EntityType 'MyModel.Address'does not declare a navigation property with the name 'City'.
I understand this, since City is just a field, and not a relation to another table / entity.
But is there another way to accomplish what I want, or is it best practice to just include the whole Address entity, even if I only need the City field?
What I want is to be able to use myCustomer.Address.City without an extra query to the database, whereas if I use, for example, myCustomer.Address.Street, the Street property is not eager loaded and would be fetched from the database on demand.
Select only the properties you want; EF will only load what's needed.
var query = _dbSet.Customers.Include(customer => customer.Address);
var data = query.Select(c => new { Customer = c, City = c.Address.City });
If you are really set on using the same entity throughout your code base, then you could get around the issue using something similar to what Stef proposed:
var query = _dbSet.Customers.Include(customer => customer.Address);
var data = query
    .Select(c => new { Customer = c, City = c.Address.City })
    .ToList(); // executes the IQueryable, and fetches the Customer and City (only) from the DB
// List<T>.ForEach returns void, so it cannot be chained; run it as its own statement
data.ForEach(x => x.Customer.Address = new Address { City = x.City });
var customers = data.Select(x => x.Customer).ToList();
I am very much in favour of DTOs and not using entity objects in the whole code base, but the above will give you a list of Customers which have Address objects with only the City field populated. Obviously, I make the assumption that your objects have public setters, which entity objects typically do have.
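For the DTO route mentioned above, a minimal sketch might look like the following. CustomerCityDto, CustomerProjection and the exact Customer/Address properties are assumptions for illustration only. The point is that the projection names just the columns that are needed, so a query provider such as EF only has to select those.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes; the real model will have more properties.
public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Address Address { get; set; }
}

// Hypothetical DTO carrying only what the caller needs.
public class CustomerCityDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
}

public static class CustomerProjection
{
    public static List<CustomerCityDto> ToCityDtos(IQueryable<Customer> customers)
    {
        // Only Id, Name and Address.City are referenced, so Street and the
        // other address fields never need to be read.
        return customers
            .Select(c => new CustomerCityDto { Id = c.Id, Name = c.Name, City = c.Address.City })
            .ToList();
    }
}
```

When run in memory as here, a customer with a null Address would throw on c.Address.City, so the sketch assumes every customer has an address.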
I have an ASP.net MVC controller action that is retrieving a list of items from my entity model. For this entity, there are a few properties that aren't in the entity itself. I created a partial class to add these properties:
public partial class Person
{
public int Extra
{
get
{
using( var ctx = new DBEntities() )
{
return ctx.OtherTable.Count(p => p.PersonID == this.PersonID);
}
}
}
}
As you can see, the property that I need access to comes from another table. In my MVC page...I need to return a large number of people (100+ per page). In order to display this Extra field, each Person entity is going to be hitting the database separately...which has been extremely inefficient. I have one query to return all the people, and then for each person it has a query for each property I have like this. This can end up being 300 calls to the database, which is taking a long time to execute.
What is a better way to do this? I would ideally like to execute one query that returns all the People and the extra data, but I would also like the extra data to be part of the Person entity, even if it is in a separate table in the database.
Update
To add a little more context from the comments.
I am returning the People from a repository class. I was told in another question that the repository should only be dealing with the entities themselves. So the code that retrieves the people is like:
class PersonRepository
{
public IQueryable<Person> GetPeople() {
return from p in db.People
where p ...
select p;
}
}
I don't really have the option to join in that case.
You could do joins:
var result =
    from p in ctx.Persons
    join a in ctx.OtherTable on p.PersonID equals a.PersonID
    select new SomeViewModel
    {
        Name = p.Name,
        OtherTableValue = a.OtherValue
    };
I'm not sure how your database design is done. But why can't you join the data from the two related tables and hit the database once rather than multiple times?
Even if that is slow for you, you can also cache this data and be able to access it during the lifetime of the session.
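If the repository has to keep returning plain Person entities, another option is to fetch the counts for the whole page in one grouped query and attach them afterwards. The following is a rough sketch under assumed names (OtherRow, PersonListItem and ExtraCountSketch are hypothetical). It runs in memory here, but the grouped Select is written in a form that query providers such as EF can usually translate into a single GROUP BY.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for illustration.
public class Person { public int PersonID { get; set; } public string Name { get; set; } }
public class OtherRow { public int PersonID { get; set; } }
public class PersonListItem
{
    public int PersonID { get; set; }
    public string Name { get; set; }
    public int Extra { get; set; }
}

public static class ExtraCountSketch
{
    public static List<PersonListItem> Build(IEnumerable<Person> people, IEnumerable<OtherRow> otherTable)
    {
        var page = people.ToList();
        var ids = page.Select(p => p.PersonID).ToList();

        // One grouped lookup for the whole page instead of one count per person.
        var counts = otherTable
            .Where(o => ids.Contains(o.PersonID))
            .GroupBy(o => o.PersonID)
            .Select(g => new { PersonID = g.Key, Count = g.Count() })
            .ToDictionary(x => x.PersonID, x => x.Count);

        // People without any rows in the other table get 0.
        return page
            .Select(p => new PersonListItem
            {
                PersonID = p.PersonID,
                Name = p.Name,
                Extra = counts.TryGetValue(p.PersonID, out var n) ? n : 0
            })
            .ToList();
    }
}
```

This turns the per-person calls into two round trips per page: one for the people and one for the counts.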
This is easy for me to perform in TSQL, but I'm just sitting here banging my head against the desk trying to get it to work in EF4!
I have a table, lets call it TestData. It has fields, say: DataTypeID, Name, DataValue.
DataTypeID, Name, DataValue
1,"Data 1","Value1"
1,"Data 1","Value2"
2,"Data 1","Value3"
3,"Data 1","Value4"
I want to group on DataTypeID/Name, and concatenate DataValue into a CSV string. The desired result should contain -
DataTypeID, Name, DataValues
1,"Data 1","Value1,Value2"
2,"Data 1","Value3"
3,"Data 1","Value4"
Now, here's how I'm trying to do it -
var query = (from t in context.TestData
             group t by new { DataTypeID = t.DataTypeID, Name = t.Name } into g
             select new
             {
                 DataTypeID = g.Key.DataTypeID,
                 Name = g.Key.Name,
                 DataValues = (string)g.Aggregate("", (a, b) => a + (a != "" ? "," : "") + b.DataValue),
             }).ToList();
The problem is that LINQ to Entities does not know how to convert this into SQL. This is part of a union of 3 LINQ queries, and I'd really like it to keep it that way. I imagine that I could retrieve the data and then perform the aggregate later. For performance reasons, that wouldn't work for my app. I also considered using a SQL server function. But that just doesn't seem "right" in the EF4 world.
Anyone care to take a crack at this?
If the ToList() is part of your original query and not just added for this example, then use LINQ to Objects on the resulting list to do the aggregation:
var query = (from t in context.TestData
group t by new { DataTypeID = t.DataTypeID, Name = t.Name } into g
select new { DataTypeID = g.Key.DataTypeID, Name = g.Key.Name, Data = g.AsEnumerable()})
.ToList()
.Select (q => new { DataTypeID = q.DataTypeID, Name = q.Name, DataValues = q.Data.Aggregate ("", (acc, t) => (acc == "" ? "" : acc + ",") + t.DataValue) });
Tested in LINQPad; it produces the desired result.
Some of the answers suggest calling ToList() and then performing the calculation in LINQ to Objects. That's fine for a small amount of data, but if I have a huge amount of data that I do not want to load into memory too early, ToList() may not be an option.
So, the better idea would be to process/format the data in the presentation layer and let the Data Access layer do only loading or saving raw data that SQL likes.
Moreover, in your presentation layer, most probably you are filtering the data by paging, or maybe you are showing one row in the details page, so, the data you will load into the memory is likely smaller than the data you load from the database. (Your situation/architecture may be different,.. but I am saying, most likely).
I had a similar requirement. My problem was to get the list of items from the Entity Framework object and create a formatted string (comma separated value)
I created a property in my View Model which will hold the raw data from the repository and when populating that property, the LINQ query won't be a problem because you are simply querying what SQL understands.
Then, I created a get-only property in my ViewModel which reads that Raw entity property and formats the data before displaying.
public class MyViewModel
{
    // Raw values loaded by the query; typed as strings because the getter below calls Substring on each item
    public IEnumerable<string> RawChildItems { get; set; }
    public string AnotherRegularProperty { get; set; }

    public string FormattedData
    {
        get
        {
            if (this.RawChildItems == null)
                return string.Empty;
            string[] theItems = this.RawChildItems.ToArray();
            return theItems.Length > 0
                ? string.Format("{0} ( {1} )", this.AnotherRegularProperty, String.Join(", ", theItems.Select(z => z.Substring(0, 1))))
                : string.Empty;
        }
    }
}
OK, that way I loaded the data from LINQ to Entities into this View Model easily without calling .ToList().
Example:
IQueryable<MyEntity> myEntities = _myRepository.GetData();
IQueryable<MyViewModel> viewModels = myEntities.Select(x => new MyViewModel() { RawChildItems = x.MyChildren });
Now, I can call the FormattedData property of MyViewModel anytime when I need and the Getter will be executed only when the property is called, which is another benefit of this pattern (lazy processing).
An architecture recommendation: I strongly recommend keeping the data access layer away from all formatting or view logic or anything that SQL does not understand.
Your Entity Framework classes should be simple POCOs that map directly to database tables and columns without any special mapper. And your Data Access layer (say a Repository that fetches data from your DbContext using LINQ to Entities) should get only the data that is directly stored in your database. No extra logic.
Then, you should have a dedicated set of classes for your Presentation Layer (say ViewModels) which will contain all logic for formatting data the way your user likes to see it. That way, you won't have to struggle with the limitations of Entity Framework LINQ. I will never pass my Entity Framework model directly to the View, nor will I let my Data Access layer create the ViewModel for me. Creating the ViewModel can be delegated to your domain service layer or application layer, which sits above your Data Access layer.
Thanks to moi_meme for the answer. What I was hoping to do is NOT POSSIBLE with LINQ to Entities. As others have suggested, you have to use LINQ to Objects to get access to string manipulation methods.
See the link posted by moi_meme for more info.
Update 8/27/2018 - Updated Link (again) - https://web.archive.org/web/20141106094131/http://www.mythos-rini.com/blog/archives/4510
And since I'm taking flak for a link-only answer from 8 years ago, I'll clarify just in case the archived copy disappears some day. The basic gist of it is that you cannot access String.Join in EF queries. You must create the LINQ query, then call ToList() in order to execute the query against the db. Then you have the data in memory (aka LINQ to Objects), so you can access String.Join.
The suggested code from the referenced link above is as follows -
var result1 = (from a in users
               from b in roles
               where a.RoleCollection.Any(x => x.RoleId == b.RoleId)
               select new
               {
                   UserName = a.UserName,
                   RoleNames = b.RoleName
               });
var result2 = (from a in result1.ToList()
group a by a.UserName into userGroup
select new
{
UserName = userGroup.FirstOrDefault().UserName,
RoleNames = String.Join(", ", (userGroup.Select(x => x.RoleNames)).ToArray())
});
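The author further suggests replacing String.Join with Aggregate, citing better performance (although Aggregate with string concatenation allocates a new string at every step, so String.Join is usually the safer choice for long lists), like so -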
RoleNames = (userGroup.Select(x => x.RoleNames)).Aggregate((a,b) => (a + ", " + b))
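Applied to the TestData example from the question, the same two-step pattern might look like the sketch below. TestDataRow, GroupedRow and CsvGrouping are hypothetical names, and the AsQueryable() source used to try it out is an in-memory stand-in for the EF set. AsEnumerable() marks the point where the query stops being translated and the rest runs as LINQ to Objects, which is what makes String.Join available.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row and result types mirroring the columns in the question.
public class TestDataRow
{
    public int DataTypeID { get; set; }
    public string Name { get; set; }
    public string DataValue { get; set; }
}

public class GroupedRow
{
    public int DataTypeID { get; set; }
    public string Name { get; set; }
    public string DataValues { get; set; }
}

public static class CsvGrouping
{
    public static List<GroupedRow> Concatenate(IQueryable<TestDataRow> source)
    {
        return source
            // Still part of the provider query: select only the needed columns.
            .Select(t => new { t.DataTypeID, t.Name, t.DataValue })
            // From here on everything runs in memory (LINQ to Objects).
            .AsEnumerable()
            .GroupBy(t => new { t.DataTypeID, t.Name })
            .Select(g => new GroupedRow
            {
                DataTypeID = g.Key.DataTypeID,
                Name = g.Key.Name,
                DataValues = string.Join(",", g.Select(x => x.DataValue))
            })
            .ToList();
    }
}
```

With the four sample rows from the question this yields three groups whose DataValues are "Value1,Value2", "Value3" and "Value4". All matching rows are still pulled into memory, so any filtering should happen before AsEnumerable().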
You are so very close already. Try this:
var query = (from t in context.TestData
             group t by new { DataTypeID = t.DataTypeID, Name = t.Name } into g
             select new
             {
                 DataTypeID = g.Key.DataTypeID,
                 Name = g.Key.Name,
                 DataValues = String.Join(",", g.Select(x => x.DataValue)),
             }).ToList();
Alternatively, you could do this, if EF doesn't allow the String.Join (which Linq-to-SQL does):
var qs = (from t in context.TestData
          group t by new { DataTypeID = t.DataTypeID, Name = t.Name } into g
          select new
          {
              DataTypeID = g.Key.DataTypeID,
              Name = g.Key.Name,
              DataValues = g
          }).ToArray();
var query = (from q in qs
             select new
             {
                 q.DataTypeID,
                 q.Name,
                 // Join the DataValue strings, not the entities themselves
                 DataValues = String.Join(",", q.DataValues.Select(x => x.DataValue)),
             }).ToList();
Maybe it's a good idea to create a view for this on the database (which concatenates the fields for you) and then make EF use this view instead of the original table?
I'm quite sure it's not possible in a LINQ statement or in the Mapping Details.