how to handle source data changes in LINQ? - c#

I have a collection of items where each item has a "date" field (the code is below).
I am trying to fill in any gaps in dates in the collection using LINQ. In particular, I want the resulting sequence to contain all days between the first and the last day in the original sequence.
In addition, the resulting LINQ query should be able to handle any modifications of the original sequence. That is, I cannot calculate the minimum and maximum dates ahead of time.
I tried the code below, but it fails when it calculates Min and Max of the sequence, because the collection is still empty when the query is defined. I am looking for a "lazy" alternative.
thanks for any help
konstantin
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;
namespace consapp
{
class C
{
public DateTime date;
public int? value;
}
static class Program
{
static IEnumerable<C> dates(DateTime d0, DateTime d1)
{
for (var d = d0; d <= d1; d = d.AddDays(1))
{
yield return new C { date = d };
}
}
static void Main(string[] args)
{
var xs = new ObservableCollection<C>();
var q = from d in dates(xs.Min(y => y.date), xs.Max(y => y.date))
join x in xs on d.date equals x.date into js
from j in js.DefaultIfEmpty()
orderby d.date
select new { date = d.date, value = j != null ? j.value : null };
xs.Add(new C { date = DateTime.Parse("11/10/11") });
xs.Add(new C { date = DateTime.Parse("02/02/11") });
xs.Add(new C { date = DateTime.Parse("11/24/11") });
xs.Add(new C { date = DateTime.Parse("09/09/11") });
xs.Add(new C { date = DateTime.Parse("11/10/11") });
foreach (var x in q)
{
Console.WriteLine(x.date.ToShortDateString());
}
}
}
}

I'm not absolutely positive, but:
var q = from d in dates(xs.Min(y => y.date), xs.Max(y => y.date))
I believe that the "dates" method will be called immediately, and the rest of the LINQ query (including the iterator from dates() itself) will be built up around the result from that method. So you are going to have to pre-populate xs with the data you are interested in.
This is because LINQ essentially works by wrapping enumerables in other enumerables. In order for it to do this, it must start with an enumerable. In order to do that, it must call your dates() method, which requires supplying its arguments immediately, so that it can receive the enumerable object that it will be wrapping in other enumerables. So the xs.Min and xs.Max methods will be called when that line of code is reached, but nothing else in the query will actually be processed.
A workaround would be to have your dates() method actually receive the ObservableCollection and call Min/Max itself. Because this will happen in the generated iterator, execution of those calls will be deferred as you expect.
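The workaround described above can be sketched as follows. This is only an illustration under some assumptions: the names Dates and FillGaps, the KeyValuePair result type, and the guard for an empty collection are not from the question. Because Dates is an iterator method, its body (including Min and Max) does not run until the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public class C
{
    public DateTime date;
    public int? value;
}

public static class DeferredDates
{
    // Iterator method: nothing in the body runs until the result is enumerated,
    // so Min and Max see whatever xs contains at that moment.
    public static IEnumerable<C> Dates(ObservableCollection<C> xs)
    {
        if (xs.Count == 0)
            yield break; // assumption: an empty source yields no dates (Min/Max would throw)

        var first = xs.Min(x => x.date);
        var last = xs.Max(x => x.date);
        for (var d = first; d <= last; d = d.AddDays(1))
            yield return new C { date = d };
    }

    // Hypothetical helper: left-joins the generated days against the source items.
    public static IEnumerable<KeyValuePair<DateTime, int?>> FillGaps(ObservableCollection<C> xs)
    {
        return from d in Dates(xs)
               join x in xs on d.date equals x.date into js
               from j in js.DefaultIfEmpty()
               orderby d.date
               select new KeyValuePair<DateTime, int?>(d.date, j != null ? j.value : null);
    }

    public static void Main()
    {
        var xs = new ObservableCollection<C>();
        var q = FillGaps(xs); // defining the query on an empty collection is safe here

        xs.Add(new C { date = new DateTime(2011, 11, 10), value = 1 });
        xs.Add(new C { date = new DateTime(2011, 11, 13), value = 2 });

        foreach (var row in q) // Min and Max are evaluated now
            Console.WriteLine("{0:yyyy-MM-dd} {1}", row.Key, row.Value);
    }
}
```

Each enumeration of q recomputes the range, so items added later are picked up the next time the query is iterated.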

The standard LINQ implementation based on IEnumerable<T> does not track changes in data sources such as ObservableCollection. Your example fails because the source of the first from clause (the call to the dates function, together with the Min and Max operators) is evaluated when you define the query, and the collection doesn't contain any data at that point.
One option is to use an alternative LINQ implementation that works with ObservableCollection and can automatically update the result when the source changes. As far as I know, the Bindable LINQ project should be able to do that.
Another (simpler) option is to turn your query into a method and call the method repeatedly (to update the result) when you know that the data source has changed. You'd have to make ObservableCollection a private field and the method would simply run using the data currently stored in the collection:
private ObservableCollection<C> source;
void UpdateResults() {
var q = /* The query goes here */
// Do something with the result of the query
}
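One way to trigger the update automatically is to subscribe to the collection's CollectionChanged event. The sketch below is an assumption about how that could look, not the answerer's code: the Item and DateRangeView types and the Results property are made up for illustration, and the query simply produces every day between the earliest and latest date.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public class Item
{
    public DateTime Date;
}

public class DateRangeView
{
    private readonly ObservableCollection<Item> source = new ObservableCollection<Item>();

    // Latest query result; replaced after every change to the source.
    public List<DateTime> Results = new List<DateTime>();

    public DateRangeView()
    {
        // CollectionChanged is raised when items are added, removed, replaced, moved, or cleared.
        source.CollectionChanged += (sender, e) => UpdateResults();
    }

    public void Add(DateTime date)
    {
        source.Add(new Item { Date = date });
    }

    private void UpdateResults()
    {
        if (source.Count == 0)
        {
            Results = new List<DateTime>();
            return;
        }

        // Min and Max run against the data currently stored in the collection.
        var first = source.Min(x => x.Date);
        var last = source.Max(x => x.Date);
        int days = (last - first).Days;
        Results = Enumerable.Range(0, days + 1)
                            .Select(i => first.AddDays(i))
                            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var view = new DateRangeView();
        view.Add(new DateTime(2011, 1, 1));
        view.Add(new DateTime(2011, 1, 3));
        foreach (var day in view.Results)
            Console.WriteLine(day.ToShortDateString());
    }
}
```

Note that CollectionChanged does not fire when a property of an existing item changes, so edits to an item's date would need separate handling.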


LINQ querying using an entity object not in the database - Error: LINQ to Entities does not recognize the method

I have an Entity Framework v5 model created from a database. The table Season has a corresponding entity called Season. I need to calculate the Season's minimum start date and maximum end date for each year for a Project_Group. I then need to be able to JOIN those yearly min/max Season values in other LINQ queries. To do so, I have created a SeasonLimits class in my Data Access Layer project. (A SeasonLimits table does not exist in the database.)
public partial class SeasonLimits : EntityObject
{
public int Year { get; set; }
public DateTime Min_Start_Date { get; set; }
public int Min_Start_Date_ID { get; set; }
public DateTime Max_End_Date { get; set; }
public int Max_End_Date_ID { get; set; }
public static IQueryable<SeasonLimits> QuerySeasonLimits(MyEntities context, int project_Group_ID)
{
return context
.Season
.Where(s => s.Locations.Project.Project_Group.Any(pg => pg.Project_Group_ID == project_Group_ID))
.GroupBy(x => x.Year)
.Select(sl => new SeasonLimits
{
Year = sl.Key,
Min_Start_Date = sl.Min(d => d.Start_Date),
Min_Start_Date_ID = sl.Min(d => d.Start_Date_ID),
Max_End_Date = sl.Max(d => d.End_Date),
Max_End_Date_ID = sl.Max(d => d.End_Date_ID)
});
}
}
// MVC Project
var seasonHoursByYear =
from d in context.AuxiliaryDateHours
from sl in SeasonLimits.QuerySeasonLimits(context, pg.Project_Group_ID)
where d.Date_ID >= sl.Min_Start_Date_ID
&& d.Date_ID < sl.Max_End_Date_ID
group d by new
{
d.Year
} into grp4
orderby grp4.Key.Year
select new
{
Year = grp4.Key.Year,
HoursInYear = grp4.Count()
};
In my MVC project, whenever I attempt to use the QuerySeasonLimits method in a LINQ query JOIN, I receive the message,
"LINQ to Entities does not recognize the method
'System.Linq.IQueryable`1[MyDAL.SeasonLimits]
QuerySeasonLimits(MyDAL.MyEntities, MyDAL.Project_Group)' method, and
this method cannot be translated into a store expression."
Is this error being generated because SeasonLimits is not an entity that exists in the database? If this can't be done this way, is there another way to reference the logic so that it can be used in other LINQ queries?
EF is trying to translate your query to SQL and as there is no direct mapping between your method and the generated SQL you're getting the error.
First option would be not to use the method and instead write the contents of the method directly in the original query (I'm not sure at the moment if this would work, as I don't have a VS running). In the case this would work, you'll most likely end up with a very complicated SQL with a poor performance.
So here comes the second option: don't be afraid to use multiple queries to get what you need. Sometimes it also makes sense to send a simpler query to the DB and continue with the modifications (aggregation, selection, ...) in C# code. The query is translated to SQL and executed every time you enumerate it, when you call one of the ToList, ToDictionary, ToArray, or ToLookup methods, or when you use an operator such as First, FirstOrDefault, Single, or SingleOrDefault (see the LINQ documentation for the specifics).
One possible example that could fix your query (but most likely is not the best solution) is to start your query with:
var seasonHoursByYear =
from d in context.AuxiliaryDateHours.ToList()
[...]
and continue with all the rest. This minor change has a fundamental impact:
By calling ToList, the DB is queried immediately and the whole AuxiliaryDateHours table is loaded into the application (this will be a performance problem if the table has too many rows).
A second query is generated when your QuerySeasonLimits method is called (you could/should also include a ToList call for that).
The rest of the seasonHoursByYear query (where, grouping, ...) happens in memory.
There are a couple of other points that might be unrelated at this point.
I haven't really investigated the intent of your code - as this could lead to further optimizations - even total reworks that could bring you more gains in the end...
I eliminated the SeasonLimits object and the QuerySeasonLimits method, and wrote the contents of the method directly in the original query.
// MVC Project
var seasonLimits =
from s in context.Season
.Where(s => s.Locations.Project.Project_Group.Any(pg => pg.Project_Group_ID == Project_Group_ID))
group s by new
{
s.Year
} into grp
select new
{
grp.Key.Year,
Min_Start_Date = grp.Min(x => x.Start_Date),
Min_Start_Date_ID = grp.Min(x => x.Start_Date_ID),
Max_End_Date = grp.Max(x => x.End_Date),
Max_End_Date_ID = grp.Max(x => x.End_Date_ID)
};
var seasonHoursByYear =
from d in context.AuxiliaryDateHours
from sl in seasonLimits
where d.Date_ID >= sl.Min_Start_Date_ID
&& d.Date_ID < sl.Max_End_Date_ID
group d by new
{
d.Year
} into grp4
orderby grp4.Key.Year
select new
{
Year = grp4.Key.Year,
HoursInYear = grp4.Count()
};

Is a statement recalculated in every iteration when used in LINQ?

For instance
myEnumerable.Where(v => v != myDictionary["someKey"])
When this query runs, is the myDictionary["someKey"] expression evaluated for every element (meaning that the dictionary is queried for the key each time), or is the result from the first iteration reused?
The result of myDictionary["someKey"] will not be cached (*see edit below); it is accessed for every item of myEnumerable. However, you can still cache it manually:
var someValue = myDictionary["someKey"];
myEnumerable.Where(v => v != someValue)
Also note that if you plan to iterate over that IEnumerable multiple times, it is best to materialize it via ToList(). Otherwise, the deferred query is executed again on every enumeration.
var query = myEnumerable.Where(v => v != myDictionary["someKey"]);
foreach (var item in query) { /* ... */}
foreach (var item in query) { /* ... */}
In the above example, the Where clause is executed twice.
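The difference can be made visible by counting predicate calls. The following is a minimal sketch for LINQ to Objects; the class and method names are invented for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how many times the Where predicate ran while the
    // sequence was enumerated twice.
    public static int CountPredicateCalls(bool materializeFirst)
    {
        int calls = 0;
        var numbers = new List<int> { 1, 2, 3, 4 };
        IEnumerable<int> query = numbers.Where(n => { calls++; return n % 2 == 0; });

        if (materializeFirst)
            query = query.ToList(); // runs the predicate once per element and stores the result

        foreach (var n in query) { /* first pass */ }
        foreach (var n in query) { /* second pass */ }
        return calls;
    }

    public static void Main()
    {
        // Deferred query: the predicate runs again on each pass.
        Console.WriteLine(CountPredicateCalls(false));
        // Materialized list: the predicate runs only while ToList builds the list.
        Console.WriteLine(CountPredicateCalls(true));
    }
}
```

With four source elements, the deferred version invokes the predicate eight times across the two loops, while the materialized version invokes it four times.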
EDIT: As @LucasTrzesniewski has pointed out, this only holds true for LINQ to Objects, where the query is evaluated in memory. LINQ to Entities behaves differently: the query is converted into a SQL query and executed in the database, which avoids round trips.
Here's a really simple demo (and please, don't try this at home):
var myDictionary = new Dictionary<string,string>() { { "someKey", "someValue" } };
var myEnumerable = new List<string> { "someValue", "someOtherValue" };
var test = myEnumerable.Where(v => v == myDictionary["someKey"]);
foreach (var t in test)
{
Console.WriteLine(t);
myDictionary["someKey"] = "someOtherValue";
}
If myDictionary["someKey"] was only evaluated once, then changing the value of myDictionary["someKey"] wouldn't change anything. But if you run the code, you will see that it echoes both someValue and someOtherValue. If you comment out the line that changes the dictionary value, you will only see someValue.
As @Lucas Trzesniewski points out in the comments to the other answer, this applies to LINQ to Objects. There are a number of important differences between LINQ to Objects and LINQ to SQL.
The lambda expression you supply to the LINQ Where extension is simply a Func<> delegate. The method is executed for each item in the IEnumerable<T>, receiving the current item as a parameter. It doesn't do anything special other than that. Your code is roughly equivalent to:
var myTempCollection = new List<MyClass>();
foreach(MyClass item in myEnumerable)
{
if (item != myDictionary["someKey"])
{
myTempCollection.Add(item);
}
}
var result = myTempCollection;
It depends on the LINQ provider. LINQ to Objects (the Enumerable extension methods over IEnumerable<T>) runs the lambda as a delegate, so the dictionary is accessed on every iteration. A provider such as LINQ to Entities evaluates the value once when it translates the query and then sends that value to the database server.

Return Timespan ticks difference in LINQ queries

I'm querying a list of objects, and I want to return a TimeSpan's ticks list of the difference between the time registered in the object, and now.
I wanted all in one expression, so:
var list = (from r in res where r.Site == "CNIS"
select Math.Abs((r.Timestamp.Value - DateTime.Now).Ticks)).ToList();
But I get the following error:
Exception Details: DbArithmeticExpression arguments must have a numeric common type
I already managed to do a workaround. For example, my code looks like this now:
var list = new List<long>();
foreach(var item in (from r in res where r.Site == "CNIS" select r))
list.Add(Math.Abs((item.Timestamp.Value - DateTime.Now).Ticks));
But what I really wanted to know is if it is possible to get the Timespan diff from a DateTime value to now, in a single LINQ query
It seems the error is related to the translation of your select statement into SQL. If fetching the results from the DB is not a problem, you can do it using AsEnumerable and then project the items:
var now = DateTime.Now;
var list = res.Where(r => r.Site == "CNIS")
.AsEnumerable()
.Select(x => Math.Abs((x.Timestamp.Value - now).Ticks))
.ToList();
And since the value of DateTime.Now changes between calls, you should probably store it in a variable and use that variable in the calculation.

Return best fit item from collection in C# 3.5 in just a line or two

Here is some sample code I have basically written thousands of times in my life:
// find bestest thingy
Thing bestThing = null;
float bestGoodness = float.MinValue;
foreach( Thing x in arrayOfThings )
{
float goodness = somefunction( x.property, localvariable );
if( goodness > bestGoodness )
{
bestGoodness = goodness;
bestThing = x;
}
}
return bestThing;
And it seems to me C# should already have something that does this in just a line. Something like:
return arrayOfThings.Max( delegate(x)
{ return somefunction( x.property, localvariable ); });
But that doesn't return the thing (or an index to the thing, which would be fine), that returns the goodness-of-fit value.
So maybe something like:
var sortedByGoodness = from x in arrayOfThings
orderby somefunction( x.property, localvariable ) ascending
select x;
return x.first;
But that's doing a whole sort of the entire array and could be too slow.
Does this exist?
This is what you can do using System.Linq:
var value = arrayOfThings
.OrderByDescending(x => somefunction(x.property, localvariable))
.First();
If the array can be empty, use .FirstOrDefault(); to avoid exceptions.
You really don't know how this is implemented internally, so you can't assure this will sort the whole array to get the first element. For example, if it was linq to sql, the server would receive a query including the sort and the condition. It wouldn't get the array, then sort it, then get the first element.
In fact, the first part of the query isn't evaluated until you call First. This isn't a two-step evaluation but a single one.
var sortedValues = arrayOfThings
.OrderByDescending(x => somefunction(x.property, localvariable));
// sortedValues has not been evaluated yet
var value = sortedValues.First();
// the whole expression is evaluated at this point.
I don't think this is possible in standard LINQ without sorting the enumerable (which is slow in the general case), but you can use the MaxBy() method from the MoreLinq library to achieve this. I always include this library in my projects as it is so useful.
http://code.google.com/p/morelinq/source/browse/trunk/MoreLinq/MaxBy.cs
(The code actually looks very similar to what you have, but generalized.)
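For readers who don't want the dependency, a single-pass helper in the same spirit can be written by hand. The sketch below is not MoreLinq's implementation; the name MaxByKey is chosen here deliberately so that it does not clash with any existing MaxBy method, and throwing on an empty sequence is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;

public static class BestItemExtensions
{
    // Returns the element whose key is largest, scanning the source once
    // instead of sorting it.
    public static TSource MaxByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (selector == null) throw new ArgumentNullException("selector");

        var comparer = Comparer<TKey>.Default;
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                throw new InvalidOperationException("Sequence contains no elements");

            TSource best = e.Current;
            TKey bestKey = selector(best);
            while (e.MoveNext())
            {
                TKey key = selector(e.Current);
                if (comparer.Compare(key, bestKey) > 0) // strictly greater: first maximum wins on ties
                {
                    best = e.Current;
                    bestKey = key;
                }
            }
            return best;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var words = new[] { "pear", "fig", "banana" };
        Console.WriteLine(words.MaxByKey(w => w.Length));
    }
}
```

In the question's terms the call would look like arrayOfThings.MaxByKey(x => somefunction(x.property, localvariable)), and the selector runs exactly once per element.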
I would implement IComparable<Thing> and just use arrayOfThings.Max().
Example here:
http://msdn.microsoft.com/en-us/library/bb347632.aspx
I think this is the cleanest approach and IComparable may be of use in other places.
UPDATE
There is also an overloaded Max method that takes a projection function, so you can provide different logic for obtaining height, age, etc.
http://msdn.microsoft.com/en-us/library/bb534962.aspx
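A minimal sketch of the IComparable<Thing> approach, assuming a Thing with a single int Value that defines its ordering (the question's goodness function would take its place in CompareTo). It also shows that the projection overload of Max returns the projected value rather than the item.

```csharp
using System;
using System.Linq;

public class Thing : IComparable<Thing>
{
    public int Value { get; set; }

    // Defines the natural ordering used by the parameterless Max().
    public int CompareTo(Thing other)
    {
        if (other == null) return 1; // by convention, any instance sorts after null
        return Value.CompareTo(other.Value);
    }
}

public static class Program
{
    public static void Main()
    {
        var things = new[]
        {
            new Thing { Value = 100 },
            new Thing { Value = 303 },
            new Thing { Value = 22 }
        };

        Thing best = things.Max();               // returns the Thing itself
        int bestValue = things.Max(t => t.Value); // returns only the projected int

        Console.WriteLine(best.Value);
        Console.WriteLine(bestValue);
    }
}
```

The drawback is that the ordering is fixed in the class, so it fits less well when the goodness depends on a local variable at the call site.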
I followed the link Porges listed in the comment, How to use LINQ to select object with minimum or maximum property value and ran the following code in LINQPad and verified that both LINQ expressions returned the correct answers.
void Main()
{
var things = new Thing [] {
new Thing { Value = 100 },
new Thing { Value = 22 },
new Thing { Value = 10 },
new Thing { Value = 303 },
new Thing { Value = 223}
};
var query1 = (from t in things
orderby GetGoodness(t) descending
select t).First();
var query2 = things.Aggregate((curMax, x) =>
(curMax == null || (GetGoodness(x) > GetGoodness(curMax)) ? x : curMax));
}
int GetGoodness(Thing thing)
{
return thing.Value * 2;
}
public class Thing
{
public int Value {get; set;}
}

Is this the correct behavior of an ObservableCollection and Linq?

Hi I just ran into a sync problem, and have replicated it in this small example.
class MyClass
{
public int Number { get; set; }
}
static void Main(string[] args)
{
var list = new ObservableCollection<MyClass>
{
new MyClass() {Number = 1},
new MyClass() {Number = 2},
new MyClass() {Number = 3}
};
var count = from i in list where i.Number == 1 select i;
Console.WriteLine("Found {0}", count.Count());
list[2].Number = 1;
Console.WriteLine("Found {0}", count.Count());
}
This will output
Found 1
Found 2
This is not what I expected, would have guessed it would return 1 both times.
Is there any way to avoid this behavior and still use an observable collection?
I'm trying to implement a method to reorder, but this makes it hard to select the correct item.
UPDATE
An easy solution would of course be to modify it like this
int found = count.Count();
Console.WriteLine("Found {0}", found);
list[2].Number = 1;
Console.WriteLine("Found {0}", found);
This is due to the lazy evaluation of your LINQ query and has nothing to do with the ObservableCollection. If you change your LINQ query to the following line:
(from i in list where i.Number == 1 select i).ToList();
you will see the behavior you expect.
The ToList() addition makes sure your LINQ query is evaluated at that moment. Otherwise, it is evaluated only when necessary. Because you call Count() twice, the query is evaluated twice but on different data.
You encountered one of the pitfalls of LINQ. The variable count in your example isn't the result of a query, it is the query. Every time you change something in the underlying collection, the change will be reflected in subsequent calls.
