Inefficient entity framework queries - c#

I have the following foreach statement:
foreach (var articleId in cleanArticlesIds)
{
var countArt = context.TrackingInformations.Where(x => x.ArticleId == articleId).Count();
articleDictionary.Add(articleId, countArt);
}
Database looks like this
TrackingInformation(Id, ArticleId --some stuff
Article(Id, --some stuff
What I want is the number of rows in the TrackingInformations table for each article id.
For example:
ArticleId:1 Count:1
ArticleId:2 Count:8
ArticleId:3 Count:5
ArticleId:4 Count:0
so I can have a dictionary<articleId, count>
Context is the Entity Framework DbContext. The problem is that this solution is very slow: it runs one query per article, there are more than 10k articles in the db, and that number should grow rapidly.

Try the following query to gather the grouped data, then add the missing entries. You can try skipping the Select clause; I don't know whether EF handles ToDictionary well.
If you run into the Select N+1 problem (a huge number of database requests), add a ToList() step between Select and ToDictionary so that all the required data is brought into memory first.
This all depends on your mapping configuration and environment, so to get good performance you need to experiment with different queries. The main approach is to aggregate as much data as possible at the database level, using few queries.
var articleDictionary =
    context.TrackingInformations.Where(trackInfo => cleanArticlesIds.Contains(trackInfo.ArticleId))
        .GroupBy(trackInfo => trackInfo.ArticleId)
        .Select(grp => new { grp.Key, Count = grp.Count() })
        // Key by the article id itself so the lookup below uses the same key type
        .ToDictionary(info => info.Key,
                      info => info.Count);
// Articles with no tracking rows are absent from the grouping; add them with a count of 0
foreach (var missingArticleId in cleanArticlesIds)
{
    if (!articleDictionary.ContainsKey(missingArticleId))
        articleDictionary.Add(missingArticleId, 0);
}
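To see the group-then-fill-zeros idea in isolation, here is a minimal sketch that runs against in-memory collections instead of a DbContext. The class and method names are made up for illustration, and `trackedArticleIds` simply stands in for the ArticleId column of TrackingInformations.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ArticleCounts
{
    // trackedArticleIds: one entry per tracking row (hypothetical stand-in for the table).
    // cleanArticlesIds: the article ids we want a count for.
    public static Dictionary<int, int> CountPerArticle(
        IEnumerable<int> trackedArticleIds, IEnumerable<int> cleanArticlesIds)
    {
        var wanted = new HashSet<int>(cleanArticlesIds);

        // One grouped pass instead of one Count() call per article.
        var counts = trackedArticleIds
            .Where(id => wanted.Contains(id))
            .GroupBy(id => id)
            .ToDictionary(grp => grp.Key, grp => grp.Count());

        // Ids without any tracking row never show up in the grouping, so add them with 0.
        foreach (var id in wanted)
        {
            if (!counts.ContainsKey(id))
                counts.Add(id, 0);
        }
        return counts;
    }
}
```

Against a real context the grouping would run in the database and only the (id, count) pairs would come back.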

If TrackingInformation is a navigation property of Article, then you can do this:
var result=context.Article.Select(a=>new {a.id,Count=a.TrackingInformation.Count()});
Putting it into a dictionary is simple as well:
var result=context.Article
.Select(a=>new {a.id,Count=a.TrackingInformation.Count()})
.ToDictionary(a=>a.id,a=>a.Count);
If TrackingInformation isn't a navigation property, then you can do:
var result=context.Article.GroupJoin(
    context.TrackingInformation,
    foo => foo.id,
    bar => bar.ArticleId, // join on the foreign key, not on the tracking row's own id
    (x,y) => new { id = x.id, Count = y.Count() })
    .ToDictionary(a=>a.id,a=>a.Count);
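A GroupJoin behaves like a left outer join followed by a grouping, so articles without tracking rows still get an entry with a count of 0. The sketch below shows that on in-memory data; the tuple shape and the names are assumptions for illustration only.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinCounts
{
    // articleIds must be unique, since they become dictionary keys.
    public static Dictionary<int, int> CountByArticle(
        IEnumerable<int> articleIds,
        IEnumerable<(int Id, int ArticleId)> trackingRows)
    {
        return articleIds
            .GroupJoin(
                trackingRows,
                articleId => articleId,   // key on the article side
                row => row.ArticleId,     // foreign key on the tracking side
                (articleId, rows) => new { Id = articleId, Count = rows.Count() })
            .ToDictionary(a => a.Id, a => a.Count);
    }
}
```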

Related

LINQ troubles in C# using Entity Framework

I have a few tables and this is what I need to achieve.
This gets all the rows from one table
var FRA = from prod in _cctDBContext.Fra
          where prod.ActTypeId == 1
          select prod;
That gives me all the rows where ActTypeId is 1.
Then I need to query another table using the IDs I get from that:
foreach (var item in FRA)
{
var FRSA = _cctDBContext.Frsa
.Select(p => new { p.Fraid, p.Frsa1,
p.Frsaid, p.CoreId,
p.RelToEstId, p.ScopingSrc,
p.Mandatory })
.Where(p => p.Fraid == item.Fraid)
.ToList();
}
I then need to push each one of these to Entity Framework. I usually do it this way:
foreach (var item in FRA)
{
var financialReportingActivity = new FinancialReportingActivity { FinancialReportingActivityId = item.Fraid, ScopingSourceType = item.ScopingSrc, Name = item.Fra1, MandatoryIndicator = item.Mandatory, WorkEffortTypeId = 0 };
_clDBContext.FinancialReportingActivity.Add(financialReportingActivity);
}
But because I have used two foreach loops, I cannot get the variables to work: I cannot find a way to use the local variables from the inner loop when building the entity.
Can anyone think of a better way to code this?
Thanks
It looks like you can do this as a single join:
var query =
from prod in _cctDBContext.Fra
where prod.ActTypeId == 1
join p in _cctDBContext.Frsa on prod.Fraid equals p.Fraid
select new
{
p.Fraid,
p.Frsa1,
p.Frsaid,
p.CoreId,
p.RelToEstId,
p.ScopingSrc,
p.Mandatory
};
It looks like you are loading data from one set of entities from one database and want to create matching similar entities in another database.
Navigation properties would help considerably here. Frsa appears to be a child collection under a Fra, so this could be (if not already) wired up as a collection within the Fra entity.
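As a rough sketch of what such a collection could look like, the classes below are hypothetical: only the property names Fraid, ActTypeId, Frsaid and Frsa1 come from the question, while the types and everything else are assumed. The helper works on in-memory objects to show how SelectMany flattens the children of the matching parents.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes; the real mapping may differ.
public class Fra
{
    public int Fraid { get; set; }
    public int ActTypeId { get; set; }
    // Collection navigation property holding the child rows of this Fra.
    public List<Frsa> Frsa { get; set; } = new List<Frsa>();
}

public class Frsa
{
    public int Frsaid { get; set; }
    public int Fraid { get; set; }
    public string Frsa1 { get; set; } = "";
}

public static class FraQueries
{
    // Returns the Frsa1 value of every child whose parent has the given ActTypeId.
    public static List<string> ChildNamesForActType(IEnumerable<Fra> fras, int actTypeId)
    {
        return fras
            .Where(f => f.ActTypeId == actTypeId)
            .SelectMany(f => f.Frsa)
            .Select(child => child.Frsa1)
            .ToList();
    }
}
```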
Then you only need to conduct a single query and have access to each Fra and its associated Frsa details. In your case you look to be more interested in the associated FRSA details to populate this ReportingActivity:
var details = _cctDBContext.Fra
    .Where(x => x.ActTypeId == 1)
    .SelectMany(x => x.Frsa.Select(p => new
    {
        p.Fraid,
        p.Frsa1,
        p.Frsaid,
        p.CoreId,
        p.RelToEstId,
        p.ScopingSrc,
        p.Mandatory
    })) // closes both Select and SelectMany
    .ToList();
though if the relationship is bi-directional where a Fra contains Frsas, and a Frsa contains a reference back to the Fra, then this could be simplified to:
var details = _cctDBContext.Frsa
.Where(x => x.Fra.ActTypeId == 1)
.Select(p => new
{
p.Fraid,
p.Frsa1,
p.Frsaid,
p.CoreId,
p.RelToEstId,
p.ScopingSrc,
p.Mandatory
}).ToList();
Either of those should give you the details from the FRSA to populate your reporting entity.

EF6 IN clause on specific properties

Good morning,
I'm having trouble with an EF query. This is what I am trying to do.
First I am pulling a list of IDs like so (the list of IDs is found in the included x.MappingAccts entity):
Entities.DB1.Mapping mapping = null;
using (var db = new Entities.DB1.DB1Conn())
{
mapping = db.Mappings.Where(x => x.Code == code).Include(x => x.MappingAccts).FirstOrDefault();
}
Later, I'm trying to query a different DB against the list of IDs I pulled above (essentially an IN clause):
using (var db = new Entities.DB2.DB2Conn())
{
var accounts = db.Accounts.Where(mapping.MappingAccts.Any(y => y.Id == ?????????)).ToList();
}
As you can see, I only got part way with this.
Basically, what I need to do is query the Accounts table on its ID column and pull all records that match the mapping.MappingAccts.Id column.
Most of the examples I am finding explain nicely how to do this against a one-dimensional array, but I'm looking to compare specific columns.
Any assistance would be awesome.
Nugs
An IN clause is generated by calling Contains on an in-memory collection (Enumerable.Contains).
From the first DB1 context, materialize the list of ids:
var idList = mapping.MappingAccts.Select(m => m.Id).ToList();
Then in the second context query against the materialized list of id's
var accounts = db.Accounts
.Where(a => idList.Contains(a.Id))
.ToList();
The only problem you may have is with the number of ids in the first list. Each id ends up in the generated SQL, so a very large list can produce a very large statement and you may hit a limit on the SQL query.
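If the id list does get large, one common way around it is to split the list into batches and run one Contains query per batch. The helper below is a minimal sketch of the splitting part only; the class name and the batch size you pick are assumptions, not something EF prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdBatching
{
    // Splits ids into consecutive batches of at most batchSize elements.
    public static List<List<int>> Split(IReadOnlyList<int> ids, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batches = new List<List<int>>();
        for (int start = 0; start < ids.Count; start += batchSize)
        {
            batches.Add(ids.Skip(start).Take(batchSize).ToList());
        }
        return batches;
    }
}

// Hypothetical usage against the second context:
//   foreach (var batch in IdBatching.Split(idList, 1000))
//       accounts.AddRange(db.Accounts.Where(a => batch.Contains(a.Id)));
```

Each batch becomes its own IN query, so this trades one huge statement for several smaller round trips.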
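This will give the list of Accounts whose Ids are contained in MappingAccts. Note that EF6 can generally translate only collections of primitive values inside a query, so passing the MappingAccts entities themselves may throw a NotSupportedException; projecting the ids into a list first avoids that.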
using (var db = new Entities.DB2.DB2Conn())
{
var accounts = db.Accounts.Where(s => mapping.MappingAccts.Any(y => y.Id == s.Id)).ToList();
}

Read most recently inserted rows by Date using linq to Entity framework

I have a log table in my db and want to fetch only the most recently added records, based on the column RowCreateDate. This is how I currently do it; it returns the right rows, but I feel there may be a better way to achieve the same thing.
using (var context = new DbEntities())
{
// get date
var latestDate = context.Logs.Max(o => o.RowCreateDate);
if(latestDate!=null)
{
lastDate = new DateTime(latestDate.Value.Year, latestDate.Value.Month, latestDate.Value.Day,00,00,00);
logs = context.Logs.Where(o => o.RowCreateDate >= lastDate).ToList();
}
}
What I need to know is whether I am doing this right, or whether there is a better way.
Yet another option:
context.Logs.Where(c => DbFunctions.TruncateTime(c.RowCreateDate) == DbFunctions.TruncateTime(context.Logs.Max(o => o.RowCreateDate)))
This reads explicitly like what you want (get all rows with date equals max date) and will also result in one query (not two as you might have expected).
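The same "rows whose date equals the max date" logic can be checked on plain DateTime values, where the Date property plays the role that DbFunctions.TruncateTime plays inside the query. This is only an in-memory sketch with made-up names, not what EF executes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatestDayLogs
{
    // Returns every timestamp that falls on the same calendar day as the newest one.
    public static List<DateTime> OnLatestDay(IReadOnlyCollection<DateTime> rowCreateDates)
    {
        if (rowCreateDates.Count == 0)
            return new List<DateTime>();

        // Truncate the newest timestamp to midnight, then compare day against day.
        DateTime latestDay = rowCreateDates.Max().Date;
        return rowCreateDates.Where(d => d.Date == latestDay).ToList();
    }
}
```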
You can't simplify this code because LINQ to Entities does not support the TakeWhile method.
You can use
using (var context = new DbEntities())
{
// get date
var latestDate = context.Logs.Max(o => o.RowCreateDate);
if(latestDate!=null)
{
lastDate = new DateTime(latestDate.Value.Year, latestDate.Value.Month, latestDate.Value.Day,00,00,00);
logs = context.Logs
    .OrderByDescending(o => o.RowCreateDate) // newest first, so TakeWhile stops at the first older row
    .AsEnumerable()
    .TakeWhile(o => o.RowCreateDate >= lastDate)
    .ToList();
}
}
BUT it takes all your data from DB, which is not very good and I do not recommend it.
I think this will do (if we assume you want to get the top 3 most recent records):
var topDates = context.Logs.OrderByDescending(x => x.RowCreateDate).Take(3);
First, I think that your code is fine. I don't see the problem with the two queries. But if you want to simplify it, you can use TruncateTime, like this:
IGrouping<DateTime?, Logs> log =
context.Logs.GroupBy(x => DbFunctions.TruncateTime(x.RowCreateDate))
.OrderByDescending(x => x.Key).FirstOrDefault();
It will return a grouped result with the logs created during the last day for RowCreateDate.

Join to an in-memory list efficiently

In EF, if I have a list of primitives (List), "joining" that against a table is easy:
var ids = new[] { 1, 4, 6 }; // some random values
var rows = context.SomeTable.Where(r => ids.Contains(r.id));
This gets much more complicated the instant you want to join on multiple columns:
var keys = something.Select(s => new { s.Field1, s.Field2 });
var rows = context.SomeTable.Where(r => keys.Contains(new { r.Field1, r.Field2 })); // this won't work
I've found two ways to join it, but neither is great:
1. Suck in the entire table and filter it based on the other data (this gets slow if the table is really large).
2. For each key, query the table (this gets slow if you have a decent number of rows to pull in).
Sometimes, the compromise I've been able to make is a modified #1: pulling in subset of the table based on a fairly unique key
var keys = something.Select(s => s.Field1);
var rows = context.SomeTable.Where(r => keys.Contains(r.Field1)).ToList();
foreach (var sRow in something)
{
var joinResult = rows.Where(r => r.Field1 == sRow.Field1 && r.Field2 == sRow.Field2);
//do stuff
}
But even this could pull back too much data.
I know there are ways to coax table valued parameters into ADO.Net, and ways I can build a series of .Where() clauses that are OR'd together. Does anyone have any magic bullets?
Instead of a .Contains(), how about you use an inner join and "filter" that way:
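from s in context.SomeTable
// the outer range variable (s) goes on the left of equals, the joined one (k) on the right
join k in keys on new { s.Field1, s.Field2 } equals new { k.Field1, k.Field2 }
select s;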
There may be a typo in the above, but you get the idea...
I got exactly the same problem, and the solutions I came up with were:
Naive: do a separate query for each local record
Smarter: create two lists, one of the unique Field1 values and one of the unique Field2 values, and query using two Contains expressions. You then have to filter the result a second time, because it can contain combinations of Field1 and Field2 that were not in the original keys.
Looks like this:
var unique1 = something.Select(x => x.Field1).Distinct().ToList();
var unique2 = something.Select(x => x.Field2).Distinct().ToList();
var priceData = rows.Where(x => unique1.Contains(x.Field1) && unique2.Contains(x.Field2));
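The second filtering step is easy to overlook, so here is a small in-memory sketch of both steps together. The tuple shapes and names are assumptions for illustration; with EF, step 1 would be the part that runs in the database and step 2 would run on the materialized rows.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CompositeKeyFilter
{
    public static List<(int Field1, int Field2)> Match(
        IEnumerable<(int Field1, int Field2)> tableRows,
        IEnumerable<(int Field1, int Field2)> keys)
    {
        var keyList = keys.ToList();
        var unique1 = keyList.Select(k => k.Field1).Distinct().ToList();
        var unique2 = keyList.Select(k => k.Field2).Distinct().ToList();

        // Step 1: coarse filter with two Contains checks. This can let through
        // pairs such as (1, 1) when the keys are (1, 2) and (2, 1).
        var candidates = tableRows
            .Where(r => unique1.Contains(r.Field1) && unique2.Contains(r.Field2));

        // Step 2: exact filter on the full pair.
        var exact = new HashSet<(int, int)>(keyList);
        return candidates
            .Where(r => exact.Contains((r.Field1, r.Field2)))
            .ToList();
    }
}
```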
Next one is my own solution which I called BulkSelect, the idea behind it is like this:
Create temp table using direct SQL command
Upload data for SELECT command to that temp table
Intercept and modify SQL which was generated by EF.
I did it for Postgres, but this could be ported to MSSQL if needed. This is nicely described here and the source code is here
You can try flattening your keys and then using the same Contains pattern. This will probably not perform great on large queries, although you could use function indexes to store the flattened key in the database...
I have table Test with columns K1 int, K2 int, Name varchar(50)
var l = new List<Tuple<int, int>>();
l.Add(new Tuple<int, int>(1, 1));
l.Add(new Tuple<int, int>(1, 2));
var s = l.Select(k => k.Item1.ToString() + "," + k.Item2.ToString());
var q = Tests.Where(t => s.Contains(t.K1.ToString() + "," + t.K2.ToString()));
foreach (var y in q) {
Console.WriteLine(y.Name);
}
I've tested this in LinqPad with Linq to SQL
First attempt that didn't work:
I think the way to write it as a single query is something like this
var keys = something.Select(s => new { s.Field1, s.Field2 })
var rows = context.SomeTable.Where(r => keys.Any(k => r.Field1 == k.Field1 && r.Field2 == k.Field2));
Unfortunately I don't have EF on this laptop and can't even test if this is syntactically correct.
I've also no idea how performant it is if it works at all...
var rows =
    from key in keys
    join thingy in context.SomeTable
    on 1 equals 1 // constant join condition, i.e. a cross join
    where thingy.Field1 == key.Field1 && thingy.Field2 == key.Field2
    select thingy;
should work, and generate reasonable SQL
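The question also mentions building a series of Where clauses that are OR'd together. One way to sketch that is with expression trees, producing a single predicate of the form (Field1 == a1 && Field2 == b1) || (Field1 == a2 && Field2 == b2) and so on. SomeRow and the helper below are hypothetical, and the example is exercised here only against an in-memory IQueryable; how well a provider translates a long OR chain is something you would need to measure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical row type with the two key columns from the question.
public class SomeRow
{
    public int Field1 { get; set; }
    public int Field2 { get; set; }
}

public static class OrPredicate
{
    // Builds r => (r.Field1 == k1.Field1 && r.Field2 == k1.Field2) || ... for all keys.
    public static Expression<Func<SomeRow, bool>> ForKeys(
        IEnumerable<(int Field1, int Field2)> keys)
    {
        ParameterExpression row = Expression.Parameter(typeof(SomeRow), "r");
        Expression body = null;

        foreach (var key in keys)
        {
            Expression pair = Expression.AndAlso(
                Expression.Equal(
                    Expression.Property(row, nameof(SomeRow.Field1)),
                    Expression.Constant(key.Field1)),
                Expression.Equal(
                    Expression.Property(row, nameof(SomeRow.Field2)),
                    Expression.Constant(key.Field2)));

            body = body == null ? pair : Expression.OrElse(body, pair);
        }

        // With no keys at all, the predicate matches nothing.
        return Expression.Lambda<Func<SomeRow, bool>>(
            body ?? Expression.Constant(false), row);
    }
}
```

The resulting expression can be passed straight to Where on an IQueryable<SomeRow>.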

What is the recommended practice to update or delete multiple entities in EntityFramework?

In SQL one might sometimes write something like
DELETE FROM table WHERE column IS NULL
or
UPDATE table SET column1=value WHERE column2 IS NULL
or any other criterion that might apply to multiple rows.
As far as I can tell, the best EntityFramework can do is something like
foreach (var entity in db.Table.Where(row => row.Column == null))
db.Table.Remove(entity); // or entity.Column2 = value;
db.SaveChanges();
But of course that will retrieve all the entities, and then run a separate DELETE query for each. Surely that must be much slower if there are many entities that satisfy the criterion.
So, cut a long story short, is there any support in EntityFramework for updating or deleting multiple entities in a single query?
EF doesn't have support for batch updates or deletes but you can simply do:
db.Database.ExecuteSqlCommand("DELETE FROM ...", someParameter);
Edit:
People who really want to stick with LINQ queries sometimes use a workaround where they first create the SELECT SQL query from the LINQ query:
string query = db.Table.Where(row => row.Column == null).ToString();
and after that find the first occurrence of FROM and replace the beginning of the query with DELETE and execute result with ExecuteSqlCommand. The problem with this approach is that it works only in basic scenarios. It will not work with entity splitting or some inheritance mapping where you need to delete two or more records per entity.
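The string manipulation described above could look roughly like the sketch below. It is deliberately naive and the names are made up: it assumes a simple single-table SELECT, and it would misfire if a column name containing "from" appears before the FROM keyword. SQL generated by EF usually aliases the table (for example AS [Extent1]), which may need extra handling before the statement is valid on your database.

```csharp
using System;

public static class DeleteRewrite
{
    // Replaces everything before the first FROM with DELETE.
    public static string ToDelete(string selectSql)
    {
        int fromIndex = selectSql.IndexOf("FROM", StringComparison.OrdinalIgnoreCase);
        if (fromIndex < 0)
            throw new ArgumentException("No FROM clause found.", nameof(selectSql));

        return "DELETE " + selectSql.Substring(fromIndex);
    }
}
```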
Take a look at Entity Framework Extensions (Multiple entity updates). This project allows set operations using lambda expressions. Samples from the docs:
this.Container.Devices.Delete(o => o.Id == 1);
this.Container.Devices.Update(
o => new Device() {
LastOrderRequest = DateTime.Now,
Description = o.Description + "teste"
},
o => o.Id == 1);
Digging into the EFE project source code, you can see how it automates @Ladislav Mrnka's second approach, also adding SET operations:
public override string GetDmlCommand()
{
//Recover Table Name
StringBuilder updateCommand = new StringBuilder();
updateCommand.Append("UPDATE ");
updateCommand.Append(MetadataAccessor.GetTableNameByEdmType(
typeof(T).Name));
updateCommand.Append(" ");
updateCommand.Append(setParser.ParseExpression());
updateCommand.Append(whereParser.ParseExpression());
return updateCommand.ToString();
}
Edited 3 years later
Take a look at this great answer: https://stackoverflow.com/a/12751429
Entity Framework Extended Library helps to do this.
Delete
//delete all users where FirstName matches
context.Users.Delete(u => u.FirstName == "firstname");
Update
//update all tasks with status of 1 to status of 2
context.Tasks.Update(
t => t.StatusId == 1,
t2 => new Task {StatusId = 2});
//example of using an IQueryable as the filter for the update
var users = context.Users.Where(u => u.FirstName == "firstname");
context.Users.Update(users, u => new User {FirstName = "newfirstname"});
https://github.com/loresoft/EntityFramework.Extended
