.NET DataRowComparer DateTime Comparison To T-SQL DateTime - c#

I have learned that SQL Server stores DateTime differently than the .NET Framework. This is very unfortunate in the following circumstance: Suppose I have a DataRow filled from my object properties - some of which are DateTime - and a second DataRow filled from data for that object as persisted in SQL Server:
DataRow drFromObj = new DataRow(itemArrayOfObjProps);
DataRow drFromSQL = // blah select from SQL Server
Using the DataRowComparer on these two DataRows will give an unexpected result:
// This gives false about 100% of the time because SQL Server truncated the more precise
// DateTime to the less precise SQL Server DateTime
DataRowComparer.Default.Equals(drFromObj, drFromSQL);
My question was going to be, 'How do other people deal with reality in a safe and sane manner?' I was also going to rule out converting to strings or writing my own DataRowComparer. I was going to offer that, in absence of better advice, I would change the 'set' on all of my DateTime properties to convert to a System.Data.SqlTypes.SqlDateTime and back upon storage thusly:
public Nullable<DateTime> InsertDate
{
    get { return _InsDate; }
    set
    {
        // Round-trip through SqlDateTime so the stored value has SQL Server datetime precision.
        // SqlDateTime's constructor takes a DateTime, so unwrap the nullable first.
        if (value.HasValue)
            _InsDate = (DateTime)new System.Data.SqlTypes.SqlDateTime(value.Value);
        else
            _InsDate = null;
    }
}
I know full well that this would probably get screwed up as I used the _InsDate variable directly somewhere rather than going through the property. So my other suggestion was going to be simply using System.Data.SqlTypes.SqlDateTime for all properties where I might want a DateTime type to round trip to SQL Server (and, happily, SqlDateTime is nullable). This post changed my mind, however, and seemed to fix my immediate problem. My new question is, 'What are the caveats or real world experiences using the SQL Server datetime2(7) data type rather than the good, old datetime data type?'

TL;DR: Comparing dates is actually hard, even though it looks easy because you get away with it most of the time.
You have to be aware of this issue and round both values yourself to the desired precision.
This is essentially the same issue as comparing floating point numbers. If two times differ by four nanoseconds, does it make sense for your application to consider them different, or the same?
For example, if two servers have logged the same event, searching for corresponding records, you wouldn't say "no that can't be the correct event because the time is wrong by 200 nanoseconds". Clocks can differ by that amount on two servers no matter how hard they try to keep their time synchronised. You might accept that an event seen on server A and logged with a time a couple of seconds after the time on server B might have been actually seen simultaneously or the other way around.
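As a rough illustration of rounding both values to a chosen precision before comparing, here is a minimal C# sketch. The class and method names (DateTimePrecision, Truncate, EqualWithin) are made up for this example and are not part of the framework; which precision or tolerance is appropriate depends on your application.

```csharp
using System;

// Hypothetical helper, not a framework type.
static class DateTimePrecision
{
    // Drops everything below the given precision, e.g. TimeSpan.FromSeconds(1)
    // removes the sub-second part. AddTicks preserves the DateTimeKind.
    public static DateTime Truncate(DateTime value, TimeSpan precision)
    {
        if (precision <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(precision));
        return value.AddTicks(-(value.Ticks % precision.Ticks));
    }

    // Treats two values as equal when they differ by no more than the tolerance,
    // the same idea as comparing floating point numbers with an epsilon.
    public static bool EqualWithin(DateTime a, DateTime b, TimeSpan tolerance)
    {
        return (a - b).Duration() <= tolerance;
    }
}
```

Truncating both sides to whole seconds makes values that differ only in their sub-second part compare equal, while EqualWithin also handles two values that straddle a second boundary.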
Note:
If you are comparing data which is supposed to have made some sort of round-trip out of the database, you may find it has been truncated to the second or minute. (For example if it has been through Excel or an old VB application, or been written to a file and parsed back in.)
Data originating from external sources is generally rounded to the day, the minute, or the second. (except sometimes logfiles, eventlogs or electronic dataloggers, which may have milliseconds or better)
If the data has come from SQL Server, and you are comparing it back to itself (for example to detect changes), you may not encounter any issues as they will be implicitly truncated to the same precision.
Daylight saving time and time zones introduce additional problems.
If searching for dates, use a date range. And make sure you write the query in such a way that any index can be used.
Somewhat related:
Why doesn't this sql query return any results comparing floating point numbers?

Identity increments. Sort by Identity and you get insert order. You (can) control insert order.
I seriously doubt the output would ever be out of order, but if you don't trust it you can use @SeeMeSorted:
DECLARE @SeeMeSort TABLE
( [ID] [int] IDENTITY(1,1) NOT NULL,
  [Name] [nvarchar](20) NOT NULL);
DECLARE @SeeMeSorted TABLE
( [ID] [int] primary key NOT NULL,
  [Name] [nvarchar](20) NOT NULL);
insert into @SeeMeSort ([Name])
OUTPUT INSERTED.[ID], INSERTED.[name]
values ('fff'), ('hhh'), ('ggg');
insert into @SeeMeSort ([Name])
OUTPUT INSERTED.[ID], INSERTED.[name]
into @SeeMeSorted
values ('xxx'), ('aaa'), ('ddd');
select * from @SeeMeSorted order by [ID];

Related

Store values in separate, C# type-specific columns or all in one column?

I'm building a C# project configuration system that will store configuration values in a SQL Server db.
I was originally going to set the table up as such:
KeyId int
FieldName varchar
DataType varchar
StringValue varchar
IntValue int
DecimalValue decimal
...
Values would be stored and retrieved with the value in the DataType column determining which Value column to use, but I really don't like that design. So I thought I'd go this route:
KeyId int
FieldName varchar
DataType varchar
Value varbinary
Here the value in DataType would still determine the type of Value brought back, but it would all be in one column and I wouldn't have to write a ton of overloads to accommodate the different types like I would have with the previous solution. I would just pull the Value in as a byte array and use DataType to perform whatever conversion(s) necessary to get my Value.
Is the varbinary approach going to cause any performance issues or is it just bad practice to drop all these different types of data into a varbinary? I've been searching around for about an hour and I can't get to a definitive answer.
Also, if there is a more preferred method anyone can think of to reach the same conclusion, I'm all ears (or eyes).
You could serialize your settings as JSON and just store that as a string. Then you have all the settings within one row and your clients can deserialize as needed. This is also a safe way to add additional settings at any time without any modifications to your database.
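A minimal sketch of the JSON approach, assuming System.Text.Json is available (.NET Core 3.0 or later, or the NuGet package). The SettingsJson class and its method names are hypothetical; the resulting string would go into a single varchar/nvarchar column.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: all settings live in one JSON document.
static class SettingsJson
{
    // Serializes the settings to a string suitable for a text column.
    public static string Save(Dictionary<string, object> settings)
        => JsonSerializer.Serialize(settings);

    // Values come back as JsonElement, so the caller picks the type it expects
    // (GetInt32, GetString, GetDecimal, ...).
    public static Dictionary<string, JsonElement> Load(string json)
        => JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json)
           ?? new Dictionary<string, JsonElement>();
}
```

As in the varbinary design, the caller still has to know which type each setting should have, but no DataType column or per-type overloads are needed.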
We are using the second solution and it works well. Remember that disk access is orders of magnitude slower than, for example, a casting operation (milliseconds vs. nanoseconds, see ref), so do not look for the bottleneck here.
One solution is to implement a polymorphic association (1, 2), but I don't think there is a need for that, or that you should do it. The second solution is close to a non-SQL database: you can store anything as the value, even the entire HTML markup for a page. It should be the caller's responsibility to know what to do with the data.
Also, see threads on how to store settings in DB: 1, 2 and 3 for critique.

MS SQL how to record negative time

I have a column in the MS SQL database with type “TIME”. I need to record a negative value in it. It isn't a range between two certain times or dates. It’s an abstract value of time that is used in calculations. In the entity framework it is treated as a TimeSpan, which MetaData.tt does automatically with all TIME defined columns in the database.
For example, I might have an arbitrary calendar with events at 5AM and 8PM on Monday, one at 4PM on Tuesday, and one at 3AM on the Sunday after that. I could add the value to these times and get a time either before (in case of negative time), or after (in case of positive time) the beginning of the event.
When I try to write down a negative value, the database refuses it.
The recording of entity to database goes by direct bind of post attributes in the controller, so if it needs to be translated into ticks, is it reliable to do so in Javascript? How would I go about the input textbox for it? It looks like I cannot separate the value from the content in a text box. And if I have to turn it into an INT, I cannot use @EditorFor on it anymore, creating another point of fault where code becomes less flexible.
It almost feels like I should create new columns to denote the negativity of these values and use a dropdown list with hidden inputs instead of a textbox.
EDIT:
Why avoid non-time types:
Consider this code:
var typeName = _code.Escape(edmType.MetadataProperties.Where(…).FirstOrDefault());
If the EDM property has the type int, the generated code will be the type int.
The EDM property comes from the database itself, so if it is not a type that translates directly into a time, then there will need to be a new method (somewhere in a helper, perhaps), which translates this into a time. This new method will have to be maintained (by other people on the team), which means a weak point, because if someone changes the column name, now the code will not just get properly generated again.
Errors may also not be available through the error log, since most properties also tend to be referenced in javascript at some point (which is often also generated, and now can't be for this column because it is a special case). I'm also talking about some 20 columns suffering from this, so this has a very good potential to quickly turn into a deeply tangled ball of spaghetti.
It really seems like you are trying to store a duration, not a time. TIME is used for storing a time of day, not an amount of time. I would choose some subdivision of time (second, millisecond, etc.) and store that as an int (or bigint if necessary). Then in SQL Server you could use DATEADD(SECOND, @storedValue, @dateToChange) to calculate the true time, or in C# DateTime.AddMilliseconds(storedValue), DateTime.AddSeconds(storedValue), and so on, when trying to calculate the time you want.
Let me know if I'm missing something.
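To make the integer-duration idea concrete, here is a small C# sketch. It assumes the column stores whole seconds as a signed integer; the DurationColumn class and its method names are invented for illustration. TimeSpan itself can be negative, so only the database column type needs to change.

```csharp
using System;

// Hypothetical helper for a signed "offset in seconds" column.
static class DurationColumn
{
    // Converts a (possibly negative) TimeSpan to the stored integer value.
    // Fractions of a second are discarded.
    public static long ToStoredSeconds(TimeSpan offset)
        => (long)offset.TotalSeconds;

    // Converts the stored value back to a TimeSpan for use in code.
    public static TimeSpan FromStoredSeconds(long storedSeconds)
        => TimeSpan.FromSeconds(storedSeconds);

    // Shifts an event start by the stored offset: negative moves it earlier,
    // positive moves it later.
    public static DateTime Apply(DateTime eventStart, long storedSeconds)
        => eventStart.AddSeconds(storedSeconds);
}
```

For example, applying a stored value of -1800 to an event at 5:00 AM gives 4:30 AM on the same day.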
In these cases I think I would store both Begin and End times and have another Computed Column to store the difference (with INT datatype) using DATEDIFF:
CREATE TABLE dbo.MyTable
(
    Time1 time,
    Time2 time,
    timedifference AS DATEDIFF(Second, Time1, Time2)
);
Then you can convert the Seconds into a time of day like this:
SELECT CONVERT(varchar, DATEADD(ms, timedifference * 1000, 0), 114)
Here is a working sample of what you will get:
SELECT CONVERT(varchar, DATEADD(ms, DATEDIFF(Second, CAST('12:24:18.3500000' as time),CAST('11:24:18.3500000' as time)) * 1000, 0), 114)
SELECT CONVERT(varchar, DATEADD(ms, DATEDIFF(Second, CAST('11:24:18.3500000' as time),CAST('12:24:18.3500000' as time)) * 1000, 0), 114)
Database type time does not support negative values. The acceptable range is 00:00:00.0000000 through 23:59:59.9999999
https://msdn.microsoft.com/en-us/library/bb677243.aspx

.Where clause on DateTime.Month

Here is the situation:
I'm working on a basic search for a somewhat big entity. Right now, the amount of results is manageable but I expect a very large amount of data after a year or two of use, so performance is important here.
The object I'm browsing has a DateTime value and I need to be able to output all objects with the same month, regardless of the year. There are multiple search fields that can be combined, but the other fields do not cause a problem here.
I tried this :
if (model.SelectedMonth != null)
{
    contribs = contribs.Where(x => x.Date.Value.Month == model.SelectedMonth);
}
model.Contribs = contribs
    .Skip(NBRESULTSPERPAGE * (model.CurrentPage - 1))
    .Take(NBRESULTSPERPAGE)
    .ToList();
So far all I get is "Invalid 'where' condition. An entity member is invoking an invalid property or method." I thought of just invoking ToList() but it doesn't seem to be very efficient, again the entity is quite big. I'm looking for a clean way to make this work.
You said:
The object I'm browsing has a DateTime value and I need to be able to output all objects with the same month, regardless of the year
...
I expect a very large amount of data after a year or two of use, so performance is important here.
Right there, you have a problem. I understand you are using LINQ to CRM, but this problem would actually come up regardless of what technology you're using.
The underlying problem is that date and time is stored in a single field. The year, month, day, hour, minute, seconds, and fractional seconds are all packed into a single integer that represents the number of units since some time. In the case of a DateTime in .NET, that's the number of ticks since 1/1/0001. If the value is stored in a SQL DateTime2 field, it's the same thing. Other data types have different start dates (epochs) and different precisions. But in all cases, there's just a single number internally.
If you're searching for a value that is in a month of a particular year, then you could get decent performance from a range query. For example, give all values >= 2014-01-01 and < 2014-02-01. Those two points can be mapped back to their numeric representation in the database. If the field has an index, then a range query can use that index.
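The range form can be sketched in C# as follows. This is only an illustration over an in-memory sequence; the MonthRange class is hypothetical, and whether a given query provider translates the comparison into an index-friendly range predicate is an assumption you would need to verify.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: half-open range [Start, End) covering one calendar month.
static class MonthRange
{
    public static (DateTime Start, DateTime End) For(int year, int month)
    {
        var start = new DateTime(year, month, 1);
        return (start, start.AddMonths(1));
    }

    // Compares the raw value against two constants instead of extracting the
    // month from every value, which is what keeps the query sargable.
    public static IEnumerable<DateTime> InMonth(IEnumerable<DateTime> dates, int year, int month)
    {
        var (start, end) = For(year, month);
        return dates.Where(d => d >= start && d < end);
    }
}
```

The exclusive upper bound avoids having to work out the last representable instant of the month.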
But if the value you're looking for is just a month, then any query you provide will require the database to extract that month from each and every value in the table. This is also known as a "table scan", and no amount of indexing will help.
A query that can effectively use an index is known as a sargable query. However, the query you are attempting is non-sargable because it has to evaluate every value in the table.
I don't know how much control over the object and its storage you have in CRM, but I will tell you what I usually recommend for people querying a SQL Server directly:
Store the month in a separate column as a single integer. Sometimes this can be a computed column based on the original datetime, such that you never have to enter it directly.
Create an index that includes this column.
Use this column when querying by month, so that the query is sargable.
This is a guess and really should be a comment, but it's too much code to format well in a comment. If it's not helpful I'll delete the answer.
Try moving model.SelectedMonth to a variable rather than putting it in the Where clause
var selectedMonth = model.SelectedMonth;
if(selectedMonth != null)
{
contribs = contribs.Where(x => x.Date.Value.Month == selectedMonth);
}
you might do the same for CurrentPage as well:
int currentPage = model.CurrentPage;
model.Contribs = contribs
.Skip(NBRESULTSPERPAGE*(currentPage - 1))
.Take(NBRESULTSPERPAGE)
.ToList();
Many query providers work better with variables than properties of non-related objects.
What is the type of model.SelectedMonth?
According to your code logic it is nullable, and it appears that it might be a struct, so does this work?
if (model.SelectedMonth.HasValue)
{
contribs = contribs.Where(x => x.Date.Value.Month == model.SelectedMonth.Value);
}
You may need to create a Month OptionSet Attribute on your contrib Entity, which is populated via a plugin on the create/update of the entity for the Date Attribute. Then you could search by a particular month, and rather than searching a Date field, it's searching an int field. This would also make it easy to search for a particular month in the advanced find.
The Linq to CRM Provider isn't a full fledged version of Linq. It generally doesn't support any sort of operation on the attribute in your where statement, since it has to be converted to whatever QueryExpressions support.

Query problem - rows are returned when query is run in sql navigator , but not in my c# program

Update:
This is the query from the debugger, which was retrieved from a string builder:
{SELECT * FROM FCR.V_REPORT WHERE DATE BETWEEN to_date('14/09/2001' , 'dd/mm/yyyy') AND to_date('30/09/2011' , 'dd/mm/yyyy')}
If you remove the curly brackets and post it in Navigator, it works.
Original:
I have a problem when running my program. The query in SQL Navigator returns 192 rows, but when I run the query from C# (Visual Studio 2010) it returns 0 rows.
Below is my c# code:
public static DataTable GetReport(string date1, string date2)
{
    DatabaseAdapter dba = DatabaseAdapter.GetInstance();
    string SqlQuery =
        string.Format(@"SELECT *
            FROM FCR.V_REPORT
            WHERE DATE BETWEEN to_date('{0}' , 'dd/mm/yyyy')
            AND to_date('{1}' , 'dd/mm/yyyy')", date1, date2);
    OracleDataReader reader = dba.QueryDatabase(SqlQuery);
    DataTable dt = new DataTable();
    dt.Load(reader);
    int temp = dt.Rows.Count;
    return dt;
}
This is the query I am using in sql navigator(which returns 192 rows):
SELECT *
FROM FCR.V_REPORT
WHERE DATE BETWEEN to_date('01/01/2001' , 'dd/mm/yyyy')
AND to_date('30/09/2011' , 'dd/mm/yyyy')
I bet you that the dates passed in from your c# program are different because your sql statement is identical. Put a break point and verify that the dates are exactly the same. Also verify that date1 and date2 are passed in in the appropriate order.
Try dropping the view and create again. Make sure you got the aliases correct too.
When addressing this kind of problem it's usually best to not make too many assumptions in regard to where the error is originating from. If you have access to the box I would run a trace and make sure that the statements being run are indeed identical. If they are identical then you know you have a programmatic error on the receiving or processing side of your application.
You can modify your statement to insert the results into a temporary table and verify that the 192 rows you expect are there. If you are executing the exact same statement and the temp table shows that you are getting the results you expect you've further narrowed down the problem and can begin looking for application errors.
Assuming that your C# code is sending the correct query to the database, there are several ways for Oracle to run the same query differently depending on the session. You may need to get a DBA involved to figure this out (e.g. by looking at the actual executed statement in v$sql), or at least to rule out these weird cases.
NLS_DATE_FORMAT
If the DATE column is stored as a string, there would be an implicit conversion to a date. If SQL Navigator uses a different NLS_DATE_FORMAT than C# the conversion could create different dates. Although if they were different there's a good chance you'd get an error, not just 0 rows.
Virtual Private Database
VPD can add a predicate to every query, possibly using different session information. For example if the program that created the session is like '%Navigator%' it could add 'where 1 = 0' to every query. (I know this sounds crazy, but I've seen something very similar.)
Advanced Query Rewrite
This is meant for materialized views. But some people use it for performance fixes, sort of like stored outlines. And some evil people use it to rewrite your query into a completely different query. Different session settings, such as QUERY_REWRITE_INTEGRITY and CURSOR_SHARING, could explain why the query works on one session but not another. Speaking of CURSOR_SHARING, that may lead to some rare problems if it is set to SIMILAR or FORCE.
I have run into the same situation quite often with other development tools and SQL Server. In my case I found that in the query tool I would have two outputs, the records in a datagrid and the 'rows affected' message in a results pane. However, in my development IDE I would see no data (unless I checked for additional datasets). For some reason it would return the rows affected as the 1st result set. When I turned off the rowcount option in the query itself, then the data showed up in both places.
You could also use a protocol analyzer (e.g., Ethereal) and capture the TCP/IP traffic to verify that the requests are identical over wire. That has helped me in a pinch too.
Best of luck.
jl
I'm wondering if this is the same problem as your question here.
Your string.Format call is not specifying the format of your date1 and date2 values in the SQL string itself. Hence it is using the default DateTime.ToString() of date1 and date2, which could be something like '16/09/2011 12:23:34' and so which does not match the format specified in your to_date statement.
Try this:
string SqlQuery = string.Format(@"SELECT * FROM V_REPORT WHERE DATE BETWEEN
    to_date('{0:dd/MM/yyyy}' , 'dd/mm/yyyy')
    AND
    to_date('{1:dd/MM/yyyy}' , 'dd/mm/yyyy')",
    date1,
    date2);
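Note that a format specifier such as {0:dd/MM/yyyy} only takes effect when the argument is a DateTime; for a string argument it is ignored. A minimal sketch, assuming the dates are available as DateTime values (the OracleDateLiteral class is a made-up name for illustration):

```csharp
using System;
using System.Globalization;

// Hypothetical helper that builds the query text with explicitly formatted dates.
static class OracleDateLiteral
{
    public static string Build(DateTime from, DateTime to)
    {
        // InvariantCulture matters: in a custom format string "/" stands for the
        // culture's date separator, which is not "/" in every culture.
        return string.Format(
            CultureInfo.InvariantCulture,
            "SELECT * FROM FCR.V_REPORT WHERE DATE BETWEEN " +
            "to_date('{0:dd/MM/yyyy}', 'dd/mm/yyyy') AND to_date('{1:dd/MM/yyyy}', 'dd/mm/yyyy')",
            from, to);
    }
}
```

Passing the dates as bind parameters, where the data provider supports it, would avoid the text formatting step altogether.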

SQL SUM Multiple Columns in a table - Performance

Using: SQL Server 2008, Entity Framework
I am summing columns in a table across a date/time range. The table is straight-forward:
DataId bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
DateCollected datetime NOT NULL,
Length int NULL,
Width int NULL,
-- etc... (several more measurement variables)
Once I have the date/time range, I use linq-to-EF to get the query back
var query = _context.Data.Where(d =>
(d.DateCollected > df &&
d.DateCollected < dt))
I then construct my data structure using the sum of the data elements I’m interested in
DataRowSum row = new DataRowSum
{
Length_Sum = query.Sum(d => d.Length.HasValue ? (long)d.Length.Value : 0),
Width_Sum = query.Sum(d => d.Width.HasValue ? (long)d.Width.Value : 0),
// etc... (sum up the rest of the measurement variables)
}
While this works, it results in a lot of DB round trips and is quite slow. Is there a better way to do this? If it means doing it in a stored procedure, that’s fine with me. I just need to get the performance better since we’ll only be adding more measurement variables and this performance issue will just get worse.
SQL SERVER is very good at rolling up summary values. Create a proper stored procedure which calculates the sums for you already. This will give you maximum performance, especially if you don't actually need the tabular data in your client program. Just have SQL Server roll up the summary, and send back a whole lot less data. One of the reasons I generally don't like LINQ is because it tempts the programmers to do things like what you are trying to do (pull a set and do 'something' against every row), instead of taking advantage of the database engine and all its capabilities.
Do this with aggregate functions and grouping in the SQL. LINQ will never figure out how to do this fast.
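If you would rather stay in LINQ, one commonly used pattern is to group by a constant and project all the sums in one query. The sketch below runs against an in-memory IQueryable; the Measurement and Sums classes are hypothetical stand-ins for the entity and result types. Whether Entity Framework turns this into a single SQL statement depends on the EF version, so treat that as an assumption and check the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity and result types for illustration.
class Measurement
{
    public DateTime DateCollected { get; set; }
    public int? Length { get; set; }
    public int? Width { get; set; }
}

class Sums
{
    public long LengthSum { get; set; }
    public long WidthSum { get; set; }
}

static class SumOnce
{
    // Grouping by a constant yields one group, so every Sum is computed
    // as part of the same query rather than one query per column.
    public static Sums Compute(IQueryable<Measurement> data, DateTime from, DateTime to)
    {
        return data
            .Where(d => d.DateCollected > from && d.DateCollected < to)
            .GroupBy(d => 1)
            .Select(g => new Sums
            {
                LengthSum = g.Sum(d => (long)(d.Length ?? 0)),
                WidthSum = g.Sum(d => (long)(d.Width ?? 0))
            })
            .FirstOrDefault() ?? new Sums(); // no rows in range: all sums are 0
    }
}
```

A stored procedure with SUM over the date range, as suggested above, remains the more direct option if the LINQ translation turns out to be poor.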
