Here is the situation:
I'm working on a basic search for a somewhat big entity. Right now, the amount of results is manageable but I expect a very large amount of data after a year or two of use, so performance is important here.
The object I'm browsing has a DateTime value and I need to be able to output all objects with the same month, regardless of the year. There are multiple search fields that can be combined, but the other fields do not cause a problem here.
I tried this:
if (model.SelectedMonth != null)
{
    contribs = contribs.Where(x => x.Date.Value.Month == model.SelectedMonth);
}
model.Contribs = contribs
    .Skip(NBRESULTSPERPAGE * (model.CurrentPage - 1))
    .Take(NBRESULTSPERPAGE)
    .ToList();
So far all I get is "Invalid 'where' condition. An entity member is invoking an invalid property or method." I thought of calling ToList() first and filtering in memory, but that doesn't seem very efficient since, again, the entity is quite big. I'm looking for a clean way to make this work.
You said:
The object I'm browsing has a DateTime value and I need to be able to output all objects with the same month, regardless of the year
...
I expect a very large amount of data after a year or two of use, so performance is important here.
Right there, you have a problem. I understand you are using LINQ to CRM, but this problem would actually come up regardless of what technology you're using.
The underlying problem is that date and time are stored in a single field. The year, month, day, hour, minute, seconds, and fractional seconds are all packed into a single value that represents the number of units since some starting point. In the case of a DateTime in .NET, that's the number of ticks (100-nanosecond intervals) since 0001-01-01. If the value is stored in a SQL datetime2 field, it's conceptually the same thing. Other data types have different start dates (epochs) and different precisions. But in all cases, the month is not stored separately; it has to be computed from the value.
If you're searching for a value that is in a month of a particular year, then you could get decent performance from a range query. For example, give all values >= 2014-01-01 and < 2014-02-01. Those two points can be mapped back to their numeric representation in the database. If the field has an index, then a range query can use that index.
But if the value you're looking for is just a month, then any query you provide will require the database to extract the month from each and every value in the table. This is known as a "table scan", and no amount of indexing on the datetime field itself will help.
A query that can effectively use an index is known as a sargable query. However, the query you are attempting is non-sargable because it has to evaluate every value in the table.
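The range-query idea above can be sketched as follows. This is only an illustration in Python, using an in-memory SQLite table as a stand-in for the real store; the table name `contribs`, the index name, and the helper `month_bounds` are all assumptions, and dates are kept as ISO text so that string order matches chronological order.

```python
import sqlite3
from datetime import datetime


def month_bounds(year, month):
    """Return the half-open range [start, end) covering one calendar month."""
    start = datetime(year, month, 1)
    # December rolls over into January of the following year.
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return start, end


# Hypothetical table standing in for the real entity storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contribs (id INTEGER PRIMARY KEY, date TEXT)")
conn.execute("CREATE INDEX ix_contribs_date ON contribs (date)")
conn.executemany(
    "INSERT INTO contribs (date) VALUES (?)",
    [
        ("2013-01-15 10:00:00",),
        ("2014-01-01 00:00:00",),
        ("2014-01-31 23:59:59",),
        ("2014-02-01 00:00:00",),
    ],
)

# The column is compared directly against two constants, so an index
# on it can be used; no function is applied to the stored value.
start, end = month_bounds(2014, 1)
january_2014_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM contribs WHERE date >= ? AND date < ? ORDER BY id",
        (start.isoformat(sep=" "), end.isoformat(sep=" ")),
    )
]
print(january_2014_ids)  # [2, 3]
```

This only covers "a month of a particular year". It does not help with "the same month in any year", which is the non-sargable case.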
I don't know how much control over the object and its storage you have in CRM, but I will tell you what I usually recommend for people querying a SQL Server directly:
Store the month in a separate column as a single integer. Sometimes this can be a computed column based on the original datetime, such that you never have to enter it directly.
Create an index that includes this column.
Use this column when querying by month, so that the query is sargable.
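A minimal sketch of those three steps, again in Python with SQLite standing in for SQL Server. The column name `date_month`, the index name, and both helper functions are hypothetical; here the month is filled in by application code at insert time, whereas a computed column or a CRM plugin would do the same job on the server side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Step 1: a separate integer column that holds only the month.
conn.execute(
    "CREATE TABLE contribs (id INTEGER PRIMARY KEY, date TEXT, date_month INTEGER)"
)
# Step 2: an index that includes this column.
conn.execute("CREATE INDEX ix_contribs_month ON contribs (date_month)")


def insert_contrib(conn, date_text):
    """Store the date and derive its month once, at write time."""
    month = int(date_text[5:7])  # characters 5-6 of 'YYYY-MM-DD hh:mm:ss'
    conn.execute(
        "INSERT INTO contribs (date, date_month) VALUES (?, ?)", (date_text, month)
    )


def contribs_in_month(conn, month, page, page_size):
    """Step 3: filter on the month column directly, then page the results."""
    offset = page_size * (page - 1)
    rows = conn.execute(
        "SELECT date FROM contribs WHERE date_month = ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (month, page_size, offset),
    )
    return [row[0] for row in rows]


for d in (
    "2012-03-05 08:00:00",
    "2013-07-19 12:30:00",
    "2014-03-22 17:45:00",
    "2015-03-01 00:00:00",
):
    insert_contrib(conn, d)

first_page = contribs_in_month(conn, 3, page=1, page_size=2)
second_page = contribs_in_month(conn, 3, page=2, page_size=2)
```

The point of the design is that the WHERE clause compares a stored column with a constant, so nothing has to be computed per row at query time.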
This is a guess and really should be a comment, but it's too much code to format well in a comment. If it's not helpful I'll delete the answer.
Try moving model.SelectedMonth to a variable rather than putting it in the Where clause:
var selectedMonth = model.SelectedMonth;
if (selectedMonth != null)
{
    contribs = contribs.Where(x => x.Date.Value.Month == selectedMonth);
}
You might do the same for CurrentPage as well:
int currentPage = model.CurrentPage;
model.Contribs = contribs
    .Skip(NBRESULTSPERPAGE * (currentPage - 1))
    .Take(NBRESULTSPERPAGE)
    .ToList();
Many query providers work better with variables than properties of non-related objects.
What is the type of model.SelectedMonth?
According to your code logic it is nullable, and it appears that it might be a struct, so does this work?
if (model.SelectedMonth.HasValue)
{
    contribs = contribs.Where(x => x.Date.Value.Month == model.SelectedMonth.Value);
}
You may need to create a Month OptionSet Attribute on your contrib Entity, which is populated via a plugin on the create/update of the entity for the Date Attribute. Then you could search by a particular month, and rather than searching a Date field, it's searching an int field. This would also make it easy to search for a particular month in the advanced find.
The Linq to CRM Provider isn't a full-fledged version of Linq. It generally doesn't support any sort of operation on the attribute in your where statement, since the statement has to be converted to whatever QueryExpressions support.
Related
I have the following datetime range:
22/07/2021 07:00:52 (start date)
22/07/2021 07:01:00 (end date)
I want all the data between this range including the right hand boundary.
My linq to SQL query:
Select * from tableName where CreatedAt >= startDate and CreatedAt <= endDate
The issue with this query is that I am unable to get the record whose CreatedAt = 22/07/2021 07:01:00 (end date). If I do AddMinutes(1) to the endDate, I get the required data but I also get data greater than endDate.
For example data having following Created At dates
22/07/2021 07:01:15
22/07/2021 07:01:20
which is faulty.
How do I include the right boundary in the range comparison for datetime?
I have tried removing the milliseconds from the database record as:
select * from Custom_Devices
where
    DATEADD(ms, -DATEPART(ms, CreatedAt), CreatedAt) >= '2019-07-29 19:00:00'
and
    DATEADD(ms, -DATEPART(ms, CreatedAt), CreatedAt) <= '2019-07-29 19:01:00'
(Screenshots of the actual and expected output were attached here.)
Update: CreatedAt column is a datetime2 column in database.
If I do AddMinutes(1) to the endDate
Not sure why you'd add a whole minute when the problem is caused by additional milliseconds and can be solved by going up a second.
Do you actually want to keep the milliseconds on the dates? Is it any use to you to know that the event occurred at 12:34:56.789 rather than 12:34:56? If not, make the column a datetime2(0) to discard the milliseconds permanently.
Otherwise I recommend the route several other people are also advocating and do your LINQ like
context.Whatever.Where(x => x.CreatedAt >= startDate && x.CreatedAt < endDate.AddSeconds(1))
The "less than" is vital. A "less than 12:34:57" will get 12:34:56.999999... which appears to be what you want with your "less than or equal to 12:34:56"
If you're struggling to understand why milliseconds cause a problem, think of dates like numbers - if you have a number 1.222222 and you ask the db for "less than or equal to 1.2" the DB doesn't auto-round the 1.222222 down to 1 decimal place and then go "oh it's equal to 1.2" and return it. It just goes "1.222222 is not less than or equal to 1.2, don't return it"
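To make the comparison concrete, here is a small sketch in Python (used only for illustration; the function name and the sample rows are made up). It contrasts the original "less than or equal to the end" filter with the "less than the end plus one second" filter recommended above.

```python
from datetime import datetime, timedelta


def in_inclusive_second_range(created_at, start, end):
    """True if created_at falls in [start, end + 1 second)."""
    return start <= created_at < end + timedelta(seconds=1)


start = datetime(2021, 7, 22, 7, 0, 52)
end = datetime(2021, 7, 22, 7, 1, 0)

rows = [
    datetime(2021, 7, 22, 7, 0, 52),
    datetime(2021, 7, 22, 7, 1, 0, 750000),  # 07:01:00.750, has a fractional part
    datetime(2021, 7, 22, 7, 1, 1),
    datetime(2021, 7, 22, 7, 1, 15),
]

# "<= end" silently drops the 07:01:00.750 row.
naive = [r for r in rows if start <= r <= end]

# "< end + 1 second" keeps it, and still excludes 07:01:01 and later.
fixed = [r for r in rows if in_inclusive_second_range(r, start, end)]
```

Both bounds are computed once, outside the data, so the stored column is never modified in the comparison.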
Over time, for questions like this you'll always get someone who says "just cast it to.." or "run this to calculate a new date to remove the milliseconds..." - don't; if you write a query that manipulates the table data, that manipulation has to be performed every time the data is queried. It typically kills the ability of the db to use an index on the column too (the db will probably switch to running the manipulation on every value in the index, every time), which means the query is massively more resource intensive.
Always consider manipulating table data in a where clause as an absolute last resort. If there is no other way and the query will be heavily used, look at adding some kind of calculated column to the table and indexing it so that, conceptually, the manipulation you're carrying out is done once and an index on its result can be used.
I have a SQL Server 2016 table of events where the event date and start time are stored in separate columns. I'm trying to write a query to identify which of a provided list of DateTimes already exist in the table so I need to add or combine the EventDate and StartTime columns before doing the comparison like this:
public List<DateTime> DoEventsExist(List<DateTime> dateBatch)
{
    try
    {
        // Combining the two columns inside the query is what triggers the exception below.
        DBContext.Events.Where(ce => dateBatch.Contains(ce.EventDate.Add(ce.StartTime)));
    }
    catch (Exception ex)
    {
    }
}
But this, of course, would give an exception with the message:
LINQ to Entities does not recognize the method 'System.DateTime Add(System.TimeSpan)' method, and this method cannot be translated into a store expression.
In this case the columns are of Date and Time types respectively (so the .NET types are DateTime and TimeSpan), but I'd also like to know how to accomplish this if they were both DateTime2 types where one contained a date at midnight and the other contained an irrelevant date with the correct time.
SqlFunctions.DateAdd would probably work but it would make my code tightly coupled with MS SQL Server, which I don't want.
Note
There is a similar question, however it does not ask about combining two DateTimes. Additionally it is not specific to EF 6, its answer predates EF 6, and the answer does not work so the whole question is useless to SO.
Given you have a list of datetime values, that style of query really isn't going to get you where you want to be.
As you suggest, change it to a TVP.
You can pass a TVP to a procedure as also suggested, but TVPs can also be passed to adhoc SQL queries as parameters.
I have a column in the MS SQL database with type “TIME”. I need to record a negative value in it. It isn't a range between two certain times or dates. It’s an abstract value of time that is used in calculations. In the entity framework it is treated as a TimeSpan, which MetaData.tt does automatically with all TIME defined columns in the database.
For example, I might have an arbitrary calendar with events at 5AM and 8PM on Monday, one at 4PM on Tuesday, and one at 3AM on the Sunday after that. I could add the value to these times and get a time either before (in case of negative time), or after (in case of positive time) the beginning of the event.
When I try to write down a negative value, the database refuses it.
The entity is saved to the database by directly binding the posted attributes in the controller, so if the value needs to be translated into ticks, is it reliable to do so in JavaScript? How would I go about the input textbox for it? It looks like I cannot separate the value from the content in a text box. And if I have to turn it into an INT, I cannot use @EditorFor on it anymore, creating another point of failure where the code becomes less flexible.
It almost feels like I should create new columns to denote the negativity of these values and use a dropdown list with hidden inputs instead of a textbox.
EDIT:
Why avoid non-time types:
Consider this code:
var typeName = _code.Escape(edmType.MetadataProperties.Where(…).FirstOrDefault());
If the EDM property has the type int, the generated code will be the type int.
The EDM property comes from the database itself, so if it is not a type that translates directly into a time, then there will need to be a new method (somewhere in a helper, perhaps), which translates this into a time. This new method will have to be maintained (by other people on the team), which means a weak point, because if someone changes the column name, now the code will not just get properly generated again.
Errors may also not be available through the error log, since most properties also tend to be referenced in javascript at some point (which is often also generated, and now can't be for this column because it is a special case). I'm also talking about some 20 columns suffering from this, so this has a very good potential to quickly turn into a deeply tangled ball of spaghetti.
It really seems like you are trying to store a duration, not a time. TIME is used for storing a point in time, not an amount of time. I would choose some subdivision of time (second, millisecond, etc.) and store that as an int (or bigint if necessary). Then in SQL Server you could use DATEADD(SECOND, @storedValue, @dateToChange) to calculate the true time, or in C# use DateTime.AddMilliseconds(storedValue), DateTime.AddSeconds(storedValue), and so on, when trying to calculate the time you want.
Let me know if I'm missing something.
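As a rough sketch of that suggestion (in Python purely for illustration; `apply_offset` and the sample values are hypothetical), a signed integer number of seconds can represent both "before" and "after" without needing a negative TIME:

```python
from datetime import datetime, timedelta


def apply_offset(event_start, offset_seconds):
    """Shift an event time by a signed number of seconds stored as an int."""
    return event_start + timedelta(seconds=offset_seconds)


event_at_5am = datetime(2024, 1, 1, 5, 0, 0)

# A negative stored value lands before the event, a positive one after it.
before = apply_offset(event_at_5am, -5400)  # 1.5 hours earlier
after = apply_offset(event_at_5am, 1800)    # 30 minutes later

# Crossing midnight needs no special handling.
previous_day = apply_offset(datetime(2024, 1, 1, 0, 30, 0), -3600)
```

The stored value stays a plain integer in the database, and the sign carries the direction.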
In these cases I think I would store both Begin and End times and have another Computed Column to store the difference (with INT datatype) using DATEDIFF:
CREATE TABLE dbo.MyTable
(
    Time1 time,
    Time2 time,
    timedifference AS DATEDIFF(Second, Time1, Time2)
);
Then you can convert the Seconds into a time of day like this:
SELECT CONVERT(varchar, DATEADD(ms, timedifference * 1000, 0), 114)
Here is a working sample of what you will get:
SELECT CONVERT(varchar, DATEADD(ms, DATEDIFF(Second, CAST('12:24:18.3500000' as time),CAST('11:24:18.3500000' as time)) * 1000, 0), 114)
SELECT CONVERT(varchar, DATEADD(ms, DATEDIFF(Second, CAST('11:24:18.3500000' as time),CAST('12:24:18.3500000' as time)) * 1000, 0), 114)
Database type time does not support negative values. The acceptable range is 00:00:00.0000000 through 23:59:59.9999999
https://msdn.microsoft.com/en-us/library/bb677243.aspx
I am using the repository pattern to query SQL to get a record based on a date passed in via a web api. The query string is date=2014-09-16, for example. When my repository receives this date, the time is automatically set to 12:00:00 AM (midnight) since one wasn't passed. I am only interested in matching the date in the database like this:
public IQueryable<Instance> GetInstances(DateTime date)
{
    return DBContext.Instances
        .Where(x => x.StartDateTime == date)
        .OrderBy(x => x.StartDateTime)
        .AsQueryable();
}
The problem is that no match is found since the records in the db are not a full match because of the time portion even though records exist for this date. What is the best way to get a match in the above example?
Here are all the default .NET methods EF supports. Only these are supported. If you want something else, you have to use these date functions: Entity Framework sqlFunctions, or put your code in a stored procedure or table function (which can be called by the context).
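Another option that avoids date functions altogether is to match on a half-open range covering the whole day. The sketch below shows the idea in Python for illustration only; `day_bounds`, `instances_on`, and the sample data are assumptions, not part of the original repository code.

```python
from datetime import datetime, timedelta


def day_bounds(date):
    """Return [midnight of that day, midnight of the next day)."""
    start = datetime(date.year, date.month, date.day)
    return start, start + timedelta(days=1)


def instances_on(date, start_times):
    """Keep every start time that falls on the given calendar day."""
    start, end = day_bounds(date)
    return sorted(t for t in start_times if start <= t < end)


stored = [
    datetime(2014, 9, 15, 23, 59, 59),
    datetime(2014, 9, 16, 0, 0, 0),
    datetime(2014, 9, 16, 18, 30, 0),
    datetime(2014, 9, 17, 0, 0, 0),
]

# The query string date=2014-09-16 arrives as midnight of that day.
matches = instances_on(datetime(2014, 9, 16), stored)
```

In the LINQ query this would correspond to comparing StartDateTime with two precomputed values (greater than or equal to the start, less than the end), which needs only plain comparison operators.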
I have learned that SQL Server stores DateTime differently than the .NET Framework. This is very unfortunate in the following circumstance: Suppose I have a DataRow filled from my object properties - some of which are DateTime - and a second DataRow filled from data for that object as persisted in SQL Server:
DataRow drFromObj = new DataRow(itemArrayOfObjProps);
DataRow drFromSQL = // blah select from SQL Server
Using the DataRowComparer on these two DataRows will give an unexpected result:
// This gives false about 100% of the time because SQL Server truncated the more precise
// DateTime to the less precise SQL Server DateTime
DataRowComparer.Default.Equals(drFromObj, drFromSQL);
My question was going to be, 'How do other people deal with reality in a safe and sane manner?' I was also going to rule out converting to strings or writing my own DataRowComparer. I was going to offer that, in absence of better advice, I would change the 'set' on all of my DateTime properties to convert to a System.Data.SqlTypes.SqlDateTime and back upon storage thusly:
public Nullable<DateTime> InsertDate
{
    get
    {
        if (_InsDate.HasValue)
            return _InsDate;
        else
            return null;
    }
    set
    {
        // Round-trip through SqlDateTime so the stored value has SQL Server's precision.
        if (!object.ReferenceEquals(null, value) && value.HasValue)
            _InsDate = (DateTime)(new System.Data.SqlTypes.SqlDateTime(value.Value));
    }
}
I know full well that this would probably get screwed up as I used the _InsDate variable directly somewhere rather than going through the property. So my other suggestion was going to be simply using System.Data.SqlTypes.SqlDateTime for all properties where I might want a DateTime type to round trip to SQL Server (and, happily, SqlDateTime is nullable). This post changed my mind, however, and seemed to fix my immediate problem. My new question is, 'What are the caveats or real world experiences using the SQL Server datetime2(7) data type rather than the good, old datetime data type?'
TL;DR: Comparing dates is actually hard, even though it looks easy because you get away with it most of the time.
You have to be aware of this issue and round both values yourself to the desired precision.
This is essentially the same issue as comparing floating point numbers. If two times differ by four nanoseconds, does it make sense for your application to consider them different, or the same?
For example, if two servers have logged the same event, searching for corresponding records, you wouldn't say "no that can't be the correct event because the time is wrong by 200 nanoseconds". Clocks can differ by that amount on two servers no matter how hard they try to keep their time synchronised. You might accept that an event seen on server A and logged with a time a couple of seconds after the time on server B might have been actually seen simultaneously or the other way around.
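Two ways of doing this are sketched below in Python, as an illustration only (the function names and the 5 ms tolerance are arbitrary assumptions). Truncating both values to a shared precision is simple, but it can still split two nearly identical times that straddle a boundary; comparing with an explicit tolerance avoids that.

```python
from datetime import datetime, timedelta


def truncate_to_milliseconds(value):
    """Drop everything below millisecond precision."""
    return value.replace(microsecond=value.microsecond // 1000 * 1000)


def within_tolerance(a, b, tolerance):
    """Treat two timestamps as equal if they differ by at most `tolerance`."""
    return abs(a - b) <= tolerance


in_memory = datetime(2024, 5, 1, 12, 0, 0, 123456)
from_storage = datetime(2024, 5, 1, 12, 0, 0, 123000)

# Exact comparison fails; comparison at a shared precision succeeds.
exact_equal = in_memory == from_storage
truncated_equal = (
    truncate_to_milliseconds(in_memory) == truncate_to_milliseconds(from_storage)
)

# Two values 0.7 ms apart that sit on either side of a second boundary.
just_before = datetime(2024, 5, 1, 12, 0, 0, 999500)
just_after = datetime(2024, 5, 1, 12, 0, 1, 200)
boundary_truncated_equal = (
    truncate_to_milliseconds(just_before) == truncate_to_milliseconds(just_after)
)
boundary_tolerant_equal = within_tolerance(
    just_before, just_after, timedelta(milliseconds=5)
)
```

Which tolerance makes sense depends on the application, as the clock-drift example above suggests.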
Note:
If you are comparing data which is supposed to have made some sort of round-trip out of the database, you may find it has been truncated to the second or minute. (For example if it has been through Excel or an old VB application, or been written to a file and parsed back in.)
Data originating from external sources is generally rounded to the day, the minute, or the second. (except sometimes logfiles, eventlogs or electronic dataloggers, which may have milliseconds or better)
If the data has come from SQL Server, and you are comparing it back to itself (for example to detect changes), you may not encounter any issues as they will be implicitly truncated to the same precision.
Daylight savings and timezones introduce additional problems.
If searching for dates, use a date range. And make sure you write the query in such a way that any index can be used.
Somewhat related:
Why doesn't this sql query return any results comparing floating point numbers?
Identity increments. Sort by Identity and you get insert order. You (can) control insert order.
I seriously doubt output would ever be out of order, but if you don't trust it you can use @SeeMeSorted
DECLARE @SeeMeSort TABLE
    ( [ID] [int] IDENTITY(1,1) NOT NULL,
      [Name] [nvarchar](20) NOT NULL);
DECLARE @SeeMeSorted TABLE
    ( [ID] [int] primary key NOT NULL,
      [Name] [nvarchar](20) NOT NULL);
insert into @SeeMeSort ([Name])
OUTPUT INSERTED.[ID], INSERTED.[name]
values ('fff'), ('hhh'), ('ggg');
insert into @SeeMeSort ([Name])
OUTPUT INSERTED.[ID], INSERTED.[name]
into @SeeMeSorted
values ('xxx'), ('aaa'), ('ddd');
select * from @SeeMeSorted order by [ID];