How to cancel asynchronous action in LINQ query? - c#

This question extends my previous one, "Asynchronous binding and LINQ query hangs". Assume I have a LINQ query such as:
var query = from item in items where item.X == 1 select item;
I can iterate through the query asynchronously and dispatch each item to the UI (or I could use IProgress):
foreach(var item in query)
{
Application.Current.Dispatcher.BeginInvoke(
new Action(() => source.Add(item)));
}
Now I would like to cancel the query. I can simply declare a CancellationTokenSource cts, pass its token to the task, and then:
foreach(var item in query)
{
cts.Token.ThrowIfCancellationRequested();
Application.Current.Dispatcher.BeginInvoke(
new Action(() => source.Add(item)));
}
The trouble is that I can only cancel when a new result appears. If there is a long run of items that don't meet the query condition, my cancel request is ignored until the next match.
How can I build cancellation into LINQ (to Objects) so that the cancellation token is checked for each item?

I haven't tested this, so I'm not sure, but I think you could do it as a side effect inside your LINQ query by wrapping the where condition in a method that also checks the token:
Change this:
var query = from item in items where item.X == 1 select item;
To:
var query = from item in items where CancelIfRequestedPredicate(item, i => i.X == 1) select item;
And create a method:
private bool CancelIfRequestedPredicate<T>(T item,Predicate<T> predicate)
{
cts.Token.ThrowIfCancellationRequested();
return predicate(item);
}
Since LINQ uses deferred execution, I think it will run your method for each item as the query is iterated.
Just as an observation, I don't know how this behaves if you're not using LINQ to Objects (you didn't mention whether you're using LINQ to SQL or something similar), but it probably won't work there, since such providers generally cannot translate an arbitrary method call into SQL.
Hope it helps.
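A variation on the same idea, as a minimal sketch for LINQ to Objects only: instead of wrapping the predicate, put the token check in a small extension method that sits in front of the filter. CheckCancellation is a hypothetical helper name, not part of LINQ, and the numbers in Main are made up for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class CancellationExtensions
{
    // Hypothetical helper (not part of LINQ): checks the token before
    // passing each source element on, so items that the later filter
    // rejects still trigger a cancellation check.
    public static IEnumerable<T> CheckCancellation<T>(
        this IEnumerable<T> source, CancellationToken token)
    {
        foreach (var item in source)
        {
            token.ThrowIfCancellationRequested();
            yield return item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var cts = new CancellationTokenSource();
        var items = Enumerable.Range(0, 1000);

        // The check runs before the filter, once per source element.
        var query = items.CheckCancellation(cts.Token).Where(x => x == 999);

        cts.Cancel(); // simulate the user cancelling before iteration
        try
        {
            foreach (var item in query)
                Console.WriteLine(item);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("Query cancelled");
        }
    }
}
```

Because the check sits upstream of Where, it does not matter how many consecutive items fail the condition. Like the predicate version, this only makes sense when the query runs in memory.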

For my Entity Framework-specific version of this problem:
I spent a while today digging for an answer and finally found one (on one of the 30-odd pages I visited) that clearly addressed my issue, which is specifically a LINQ query running against Entity Framework.
Later versions of Entity Framework (as of this writing) provide a ToListAsync extension method with an overload that takes a cancellation token.
Task.Run accepts a token as well (in my case the query ran inside a task), but it was the data query I was most concerned about.
var sourceToken = new System.Threading.CancellationTokenSource();
var t2 = System.Threading.Tasks.Task.Run(() =>
{
var token = sourceToken.Token;
return context.myTable.Where(s => s.Price == "Right").Select(i => i.ItemName).ToListAsync(token);
}
, sourceToken.Token
);

Related

Eager loading is not retrieving related records with the first query

I have this EF query:
IQueryable<Repositories.Activo> activos = db.Activo.Include(a => a.ValorCampo)
.Where(a1 => a1.Carga.ClienteId == ClienteID && !hiddenWorkflow.Any(e => e.EstadoId == a1.EstadoId));
Where ValorCampo is a child table.
Well.... then, later in the program, I have this:
var dashboard = activos.GroupBy(a => a.Localidad);
foreach (var activo in dashboard.Skip(model.start).Take(model.length))
{
foreach(var campo in campos)
{
var totalEstado = activo.Where(a => a.ValorCampo.Any(vc => vc.ValorCampoCaracter == vc.Campo.CampoDashboard && vc.Campo.CampoDashboard == campo.CampoDashboard)).Count();
}
}
well, by adding "Include" to the query, I thought the actual executed query will include "ValorCampo" table, but that was not true. Or, doesn't it work when grouping? I don't think so.
The query to the related table is run for every iteration in the foreach loop. That is inefficient. I need the related information data to be loaded into memory with the first query.
How can I do it?
Regards
Jaime
According to another Stack Overflow question, the problem is the GroupBy.
The same issue is explained there: GroupBy and Include
That question gives two workarounds: one is to use ToList, as described here, and the other is to place the Include after the GroupBy.
Try something like this:
var dashboard = activos.ToList().GroupBy(a => a.Localidad);
Notice how I called ToList() to materialize the query....
To clarify: GroupBy uses deferred execution, which means it won't materialize the query. Skip and Take also use deferred execution, so they will NOT materialize the query either:
Here is some documentation from Microsoft:
GroupBy
Skip
Take
All of them use deferred execution.
So you need to call ToList, FirstOrDefault, Single... etc one of the methods that materializes the query.
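To see the deferred behaviour without a database, here is a small sketch using LINQ to Objects. The Source method is a made-up stand-in for the EF query; it just counts how often it gets enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // How many times the source sequence has been walked so far.
    public static int Enumerations;

    // Stand-in for a database query (an assumption for this demo):
    // the counter is bumped each time enumeration actually starts.
    public static IEnumerable<int> Source()
    {
        Enumerations++;
        for (int i = 0; i < 6; i++)
            yield return i;
    }

    public static void Main()
    {
        // Composing GroupBy/Skip/Take does not touch the source.
        var groups = Source().GroupBy(x => x % 2).Skip(0).Take(2);
        Console.WriteLine(Enumerations); // 0

        // ToList materializes: the source is walked exactly once.
        var materialized = groups.ToList();
        Console.WriteLine(Enumerations); // 1

        // Looping over the list does not touch the source again...
        foreach (var g in materialized) { }
        Console.WriteLine(Enumerations); // 1

        // ...but looping over the deferred query re-runs everything.
        foreach (var g in groups) { }
        Console.WriteLine(Enumerations); // 2
    }
}
```

With a real EF query, each of those "walks" would be a round trip to the database.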

IEnumerable takes too long to process when filtering on it

I have a feeling I know what the reason for this behavior is, but I don't know the best way of resolving it.
I have built a LinqToSQL query:
public IEnumerable<AllConditionByCountry> GenerateConditions(int paramCountryId)
{
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
}).OrderBy(x => x.CountryID).AsEnumerable<AllConditionByCountry>();
return AllConditionsByCountry;
}
This query returns about 9500+ rows of data.
I'm calling this from my Controller like so:
svcGenerateConditions generateConditions = new svcGenerateConditions(db);
IEnumerable<AllConditionByCountry> AllConditionsByCountry;
AllConditionsByCountry = generateConditions.GenerateConditions(1);
Which then I'm looping through:
foreach (var record in AllConditionsByCountry)
{
...
...
...
This is where I think the issue is:
var rList = AllConditionsByCountry
.Where(x => x.ConditionID == conditionID)
.Select(x => x)
.AsEnumerable();
I'm doing a nested loop based on the data that I'm gathering from the above query (utilizing the original data I'm getting from AllConditionByCountry). I think this is where my issue lies. When it filters the data, it SLOWS down greatly.
Basically this process writes out a bunch of files (.json, .html)
I've tested this at first using just ADO.Net and to run through all of these records it took about 4 seconds. Using EF (stored procedure or LinqToSql) it takes minutes.
Is there anything I should do with my types of lists that I'm using or is that just the price of using LinqToSql?
I've tried to return List<AllConditionByCountry>, IQueryable, IEnumerable from my GenerateConditions method. List took a very long time (similar to what I'm seeing now). IQueryable I got errors when I tried to do the 2nd filter (Query results cannot be enumerated more than once).
I have run this same Linq statement in LinqPad and it returns in less than a second.
I'm happy to add any additional information.
Please let me know.
Edit:
foreach (var record in AllConditionsByCountry)
{
...
...
...
var rList = AllConditionsByCountry
.Where(x => x.ConditionID == conditionID)
.Select(x => x)
.AsEnumerable();
conditionDescriptionTypeID = item.ConditionDescriptionTypeId;
id = conditionDescriptionTypeID + "_" + count.ToString();
...
...
}
TL;DR: You're making roughly 9,500 queries against the database instead of one. You need to rewrite your code so that only one is executed. Look into how IEnumerable works for some hints on doing this.
Ah, yeah, that foreach loop is your problem.
foreach (var record in AllConditionsByCountry)
{
...
...
...
var rList = AllConditionsByCountry.Where(x => x.ConditionID == conditionID).Select(x => x).AsEnumerable();
conditionDescriptionTypeID = item.ConditionDescriptionTypeId;
id = conditionDescriptionTypeID + "_" + count.ToString();
...
...
}
Linq-to-SQL works similarly to Linq in that it (loosely speaking) appends functions to a chain that is executed when the enumerable is iterated. For example:
var q = Enumerable.Repeat(1, 1).Select<int, int>(x => throw new Exception());
This doesn't actually cause your code to crash, because the enumerable is never iterated. Linq-to-SQL operates on a similar principle. So, when you define this:
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
}).OrderBy(x => x.CountryID).AsEnumerable<AllConditionByCountry>();
You're not executing anything against a database, you're just instructing C# to build a query that does this when it is iterated. That's why just declaring this query is fast.
Your problem comes when you get to your foreach loop. When you hit the loop, you signal that you want to start iterating the AllConditionsByCountry iterator. This causes .NET to go off and execute the initial query, which takes time.
When you call AllConditionsByCountry.Where(x => x.ConditionID == conditionID) inside the loop, you're constructing another iterator that doesn't do anything by itself. Presumably you actually use the result of rList within that loop, though, so you're essentially constructing N queries to be executed against the database (where N is the size of AllConditionsByCountry).
This leads to a scenario where you are effectively executing approximately 9501 queries against the database: 1 for your initial query and then one for each element it returns. The drastic slowdown compared to ADO.NET is because you're probably making 9500 more queries than you were originally.
Ideally you should change the code so that one and only one query is executed against the database. You have a few options:
Rewrite the Linq-to-SQL query so that all of the legwork is done by the SQL database.
Materialize the query once and loop over the in-memory result, like this:
var conditions = AllConditionsByCountry.ToList();
foreach (var record in conditions)
{
var rList = conditions.Where(....);
}
Note that in that example I am searching conditions rather than AllConditionsByCountry - .ToList() will return a list that has already been iterated so you do not create any more database queries. This will still be slow (since you're doing O(N^2) over 9500 records), but it will still be faster than creating 9500 queries since it will all be done in memory.
Just rewrite the query in ADO.NET if you're more comfortable with raw SQL than Linq-to-SQL. There's nothing wrong with this.
I think I should point out what methods cause an IEnumerable to be iterated and what ones don't.
Any method named As* (such as AsEnumerable<T>()) does not cause the enumerable to be iterated. It's essentially a way of casting from one type to another.
Any method named To* (such as ToList<T>()) will cause the enumerable to be iterated. In the case of Linq-to-SQL this will also execute the database query. Any method that gets a value out of the enumerable (First, Count and so on) will also cause iteration. You can use this to your advantage by creating a query, forcing iteration with ToList(), and then searching that list. The comparisons are then done in memory, which is what I demo above.
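If the repeated in-memory Where is still too slow, one further step is to index the materialized rows by ConditionID with ToLookup, which also iterates its source immediately. This is only a sketch under assumptions: the Condition class below is a made-up, cut-down stand-in for AllConditionByCountry, and the sample rows are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AllConditionByCountry, reduced to two fields.
public sealed class Condition
{
    public int ConditionID { get; set; }
    public string Description { get; set; }
}

public static class LookupDemo
{
    // ToLookup iterates the source once, right away, and groups the rows
    // by key so later lookups do not rescan the whole sequence.
    public static ILookup<int, Condition> IndexByConditionId(
        IEnumerable<Condition> query)
    {
        return query.ToLookup(c => c.ConditionID);
    }

    public static void Main()
    {
        var rows = new[]
        {
            new Condition { ConditionID = 1, Description = "first" },
            new Condition { ConditionID = 2, Description = "second" },
            new Condition { ConditionID = 1, Description = "third" },
        };

        var byId = IndexByConditionId(rows);

        // Replaces AllConditionsByCountry.Where(x => x.ConditionID == 1).
        foreach (var c in byId[1])
            Console.WriteLine(c.Description); // first, third

        // A key that is not present yields an empty sequence, not an error.
        Console.WriteLine(byId[99].Count()); // 0
    }
}
```

Building the lookup costs one pass over the rows, after which each per-record lookup no longer scans all ~9500 of them.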
// First: return List<> rather than IEnumerable<>, because you need to work with the result afterwards
public List<AllConditionByCountry> GenerateConditions(int paramCountryId)
{
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
})
.OrderBy(x => x.CountryID)
.ToList(); // materialize into a list, so only one query is executed
// .AsEnumerable<AllConditionByCountry>() is no longer needed.
return AllConditionsByCountry;
}
about this part:
foreach (var record in AllConditionsByCountry) // you can use AllConditionsByCountry.ForEach(record=>{...});
{
...
// AllConditionsByCountry will not query the db again, because it's now a list, no longer a query
var rList = AllConditionsByCountry.Where(x => x.ConditionID == conditionID); // .Select(x => x).AsEnumerable() is unnecessary; only use AsXXX when compilation requires it
...
}
By the way:
You should page your results. No page needs 100+ rows, and returning 10K rows is an issue in itself:
GenerateConditions(int paramCountryId, int page = 0, int pagesize = 50)
It's odd that you need a sub-query at all. Usually that means GenerateConditions does not return the data structure you need; change it to return the right shape and the sub-query goes away.
Use a compiled query for a further improvement: https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/compiled-queries-linq-to-entities
We can't see your full query, but that is usually the part to improve, especially when it has many conditions, joins and groupings. A small change there can make all the difference.

Advanced filtering in linq

I'd like to ask what the code should look like for advanced filtering in C# with LINQ. I have experience with LINQ, but this is beyond my understanding.
Let's say we have a class Item with the properties (string)Name, (bool)New and (int)Price, and users enter their filters to get the results they need.
Let's say we put 5 objects into a List of Items called list:
new Item("Pen",true,12);
new Item("PostIt",false,35);
new Item("Phone",true,140);
new Item("Watch",true,5);
new Item("Lavalamp",false,2);
Now I would like to process this information to get, say, all new items that cost over 10. I know I can do this with
List<Item> Results = list.Where(item => item.Price > 10 && item.New).ToList();
but what if a user wants all items that cost over 10, regardless of whether they are new? I can't change the query at runtime to fit the user's needs, and I don't think writing a query for every possible combination of input parameters is the right way to do this. Can someone give me an example of how this should be done?
You can define base query
var result = list.Where(item=> item.Price > 10); //DON'T Call ToList() here
if(someCondition)
result = result.Where(item=> item.New);
//in the end you are calling
return result.ToList();
As @MikeEason said, you don't want to call ToList() on your first result, because that would execute the query. Your goal is to build the complete query and execute it only once, which is why ToList() is called only when you return the result.
If you only have those three conditions then you can build your query in several steps:
IEnumerable<Item> result=list;
int Price=10;
bool FilterByPrice = true, FilterByNew = true; // set these variables from your environment (e.g. user input)
if(FilterByPrice)
result=result.Where(item => item.Price> Price);
if(FilterByNew)
result=result.Where(item => item.New);
Thanks to deferred execution, your query will be executed when you call the ToList method or when you iterate over the query result.
So let's say your items exist in your database and you want to query them. The user has a checkbox, if he wants to see only new items or all of them. If the box is checked you set a bool value for it.
// Compose the query
var results = _db.Where(item => item.Price > 10 );
// Still composing
if (onlyNewItems)
{
results = results.Where(item => item.New);
}
// ToList() executes the query, data is returned;
return results.ToList();
This does not run the query twice. In fact, until you materialize your query, you are still composing it. If you would return it now, it would be of type IQueryable<T>. Only after you call .ToList(), is your query actually executed and you get an IEnumerable<T> in this case a List<T> back.
List<Item> Results = list.Where(item => item.Price > 10
&& (condition ? item.New : true)).ToList();
You can extend it this way: when your condition is false, the expression just evaluates to true, so that filter has no effect.
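Putting the step-by-step composition together, here is a self-contained sketch over the five sample items from the question. The Item constructor shape and the ItemFilter.Apply helper with nullable parameters are my own assumptions for illustration; null stands for "the user did not set this filter".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Item
{
    public Item(string name, bool isNew, int price)
    {
        Name = name;
        New = isNew;
        Price = price;
    }

    public string Name { get; }
    public bool New { get; }
    public int Price { get; }
}

public static class ItemFilter
{
    // Hypothetical helper: each filter is applied only if it was set.
    public static List<Item> Apply(
        IEnumerable<Item> items, int? minPrice, bool? isNew)
    {
        var query = items;

        if (minPrice.HasValue)
            query = query.Where(i => i.Price > minPrice.Value);

        if (isNew.HasValue)
            query = query.Where(i => i.New == isNew.Value);

        // Nothing has been filtered yet; ToList runs the composed chain once.
        return query.ToList();
    }

    public static void Main()
    {
        var list = new List<Item>
        {
            new Item("Pen", true, 12),
            new Item("PostIt", false, 35),
            new Item("Phone", true, 140),
            new Item("Watch", true, 5),
            new Item("Lavalamp", false, 2),
        };

        // New items over 10: Pen, Phone
        foreach (var item in Apply(list, 10, true))
            Console.WriteLine(item.Name);

        // Everything over 10, new or not: Pen, PostIt, Phone
        foreach (var item in Apply(list, 10, null))
            Console.WriteLine(item.Name);
    }
}
```

Adding another optional filter is then one more if block rather than another combination of hand-written queries.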

Entity Framework: Update inside LINQ query

I came across the idea of updating a table inside a LINQ query, instead of running the query first and then updating each object it returns.
For instance, it is possible to change the value of any property associated with x inside of this query:
var Query = from x in EFContext.SomeTable
where x.id == 1
// SET X = Model or x.Name = "NewName"
select SaveChanges();
Could something like this be done at all?
From MSDN:
In a query that returns a sequence of values, the query variable itself
never holds the query results and only stores the query commands.
Execution of the query is deferred until the query variable is
iterated over in a foreach or for loop. This is known as deferred
execution; that is, query execution occurs some time after the query
is constructed. This means that you can execute a query as frequently
as you want to. This is useful when, for example, you have a database
that is being updated by other applications. In your application, you
can create a query to retrieve the latest information and repeatedly
execute the query, returning the updated information every time.
So, your query will be executed when you do the foreach to update your entities. As @recursive said, LINQ is useful when you need to query a collection, not specifically to update data.
As additional info, you can also force immediate execution. This is useful when you want to cache the results of a query, or when you want to use functionality that Linq to Entities doesn't support. To force immediate execution of a query that does not produce a singleton value, you can call the ToList method, the ToDictionary method, or the ToArray method on a query or query variable.
I believe the best way to do this would be to write extension methods in a static class:
public static class Extensions
{
public static IEnumerable<T> Remove<T>(this DbSet<T> Input, Func<T, Boolean> Objects) where T : class
{
var I = Input.Where(Objects).ToList();
for (int i = 0; i < I.Count; i++)
{
Input.Remove(I[i]);
}
return Input;
}
public static IEnumerable<T> Update<T>(this DbSet<T> Input, Func<T, Boolean> Objects, Action<T> UpdateAction) where T : class
{
var I = Input.Where(Objects).ToList();
I.ForEach(UpdateAction);
return I;
}
}
Then you can do:
var Context = new EFContext();
Context.YourTable.Remove(x=> x.Id == 1);
Context.SaveChanges();
// OR
Context.YourTable.Update(x => x.Id == 1, y => { y.Title = "something"; });
Context.SaveChanges();
You could use method calls, and write a ForEach or ForEachWithContinue method that lets you modify each element, but EF wouldn't know what to do with it anyway, and you'd have to use ToList to pull the items out of EF before you could do anything to them.
Example of ForEach (functional purists won't like this of course):
public static void ForEach<T>(this IEnumerable<T> pEnumerable, Action<T> pAction) {
foreach (var item in pEnumerable)
pAction(item);
}
public static IEnumerable<T> ForEachWithContinue<T>(
this IEnumerable<T> pEnumerable,
Action<T> pAction
) {
foreach (var item in pEnumerable)
pAction(item);
return pEnumerable;
}
Then:
EFContext
.SomeTable
.Where(x => x .id == 1)
.ToList() // come out of EF
.ForEach(x => x.Name = "NewName");
EFContext.SaveChanges();
(Actually, List<T> even already has a ForEach method, too, so writing the IEnumerable extensions is not strictly necessary in this case.)
Basically, EF needs to pull the data into memory to know that you have changed anything, to know what your changes are, and to know what to save to back to the DB. I would also consider what it is you're trying to do, where you are overwriting data that neither the user nor the program has even looked at. How did you determine that this was data you wanted to overwrite in the first place?
Also, you can write direct SQL queries straight to the DB as well, using the ExecuteStoreCommand method, which would be the "normal" way of accomplishing this. Something like:
EFContext.ExecuteStoreCommand(
"UPDATE SomeTable SET Name = {0} WHERE ID = {1};",
"NewName",
1
);

Complex object in grid view

I have a gridview, the datasource of which is the following function:
public static List<Train> GetTrainsByIDs(int [] ids) {
using (var context = new MyEntities())
{
return ids.Select(x => context.Trains.Single(y => y.TrainID ==x)).AsQueryable().Include(x=>x.Station).ToList();
}
}
The grid view has an ItemTemplate of <%# Eval("Station.Name") %>.
This causes the error The ObjectContext instance has been disposed and can no longer be used for operations that require a connection despite the fact that I used the include method.
When I change the function to
public static List<Train> GetTrainsByIDs(int [] ids) {
using (var context = new MyEntities())
{
return context.Trains.Where(x => ids.Contains(x.TrainID)).Include(x=>x.Station).ToList();
}
}
it works fine, but then they come out in the wrong order, and also if I have 2 ids the same I would like 2 identical trains in the list.
Is there anything I can do other than create a new viewmodel? Thank you for any help
As for the first query: that's deferred execution. You created an IEnumerable of Trains, noticed that it did not have the Include method, so you cast it to IQueryable, added the Include, and added the ToList() to prevent lazy loading.
But as per MSDN on DbExtensions.Include:
This extension method calls the Include(String) method of the IQueryable source object, if such a method exists. If the source IQueryable does not have a matching method, then this method does nothing.
(emphasis mine)
The result of the select is an IEnumerable converted to IQueryable, but now implemented by EnumerableQuery which does not implement Include. And nothing happens.
Now the data enters the grid which tries to display the station, which triggers lazy loading while the context is gone.
Apart from that, this design has another flaw: it fires a query for each id separately.
So the second query is much better. It is one query, including the Stations. But now the order is dictated by the order the database pleases to return. You could use Concat to solve this:
IQueryable<Train> qbase = context.Trains.Include(x=>x.Station);
IQueryable<Train> q = null;
foreach (var id in ids)
{
var id1 = id; // Prevent modified closure.
if (q == null)
q = qbase.Where(t => t.TrainID == id1);
else
q = q.Concat(qbase.Where(t => t.TrainID == id1));
}
The generated query is not very elegant (to say the least) but after all it is one query as opposed to many.
After reading @Gert Arnold's answer and getting the idea of doing it in two stages, I managed it very simply by adapting the first query like this:
using (var context = new MyEntities())
{
var trns = context.Trains.Include(x => x.Station);
return ids.Select(x => trns.Single(y => y.TrainID == x)).ToList();
}
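Another way to get both the order and the duplicates, sketched here with in-memory data only: run the single Contains query with the Include, then re-order the result in memory by the ids array. The Train class below is a cut-down, made-up stand-in, and InRequestedOrder is a hypothetical helper; in the real code its first argument would be the materialized result of the Contains query. It assumes TrainID is unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the EF entity.
public sealed class Train
{
    public int TrainID { get; set; }
    public string StationName { get; set; }
}

public static class TrainOrdering
{
    // 'fetched' stands in for the rows returned by the one Contains query.
    // Each requested id is mapped back to its train, so the output follows
    // the order of 'ids' and repeats a train when its id is repeated.
    // An id with no matching train throws KeyNotFoundException, much as
    // Single() would throw in the original per-id version.
    public static List<Train> InRequestedOrder(
        IEnumerable<Train> fetched, int[] ids)
    {
        var byId = fetched.ToDictionary(t => t.TrainID);
        return ids.Select(id => byId[id]).ToList();
    }

    public static void Main()
    {
        var fetched = new List<Train>
        {
            new Train { TrainID = 1, StationName = "North" },
            new Train { TrainID = 2, StationName = "South" },
        };

        var ordered = InRequestedOrder(fetched, new[] { 2, 1, 2 });
        foreach (var t in ordered)
            Console.WriteLine(t.TrainID + " " + t.StationName);
        // 2 South
        // 1 North
        // 2 South
    }
}
```

Repeated ids return the same object instance twice, which should be fine for binding to a grid but is worth knowing if the rows are edited afterwards.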
