I'm designing a .NET application where users can copy, edit, update, or delete records in a DataGrid that is bound to a database table.
The design requires maintaining versions, so that all the changes users make to the table can be reviewed. Can someone suggest the best way to implement this requirement?
Thanks a lot.
As mentioned by Sean Lange, this is a complex topic and there is no silver bullet.
Change Data Capture is designed explicitly for this problem.
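For context, enabling CDC looks roughly like this (a sketch; run it in the database that holds Table1, and note that CDC has SQL Server edition and SQL Server Agent requirements):
-- Enable CDC at the database level, then for the table to be audited.
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'Table1',
@role_name = NULL;
SQL Server then records every change in a generated change table, which you can query through generated functions such as cdc.fn_cdc_get_all_changes_dbo_Table1.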
If you don't want to enable CDC, then triggers are usually your best bet.
Here is an example of an auditing trigger utilizing a rowversion/timestamp column:
CREATE TABLE [dbo].[Table1](
[Id] [int] NOT NULL,
[Data] [nvarchar](50) NULL,
[Version] [timestamp] NOT NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Table1_History](
[Id] [int] NOT NULL,
[Data] [nvarchar](50) NULL,
[Version] [binary](8) NOT NULL,
[ModDate] [datetimeoffset](7) NOT NULL,
[ModUser] [nvarchar](50) NOT NULL,
[Operation] CHAR(1) NOT NULL
) ON [PRIMARY]
GO
CREATE TRIGGER [dbo].[trgTable1_History]
ON [dbo].[Table1]
AFTER INSERT, DELETE, UPDATE
AS
BEGIN
SET NOCOUNT ON;
DECLARE @now DATETIMEOFFSET(7) = SYSDATETIMEOFFSET();
-- Rows from "inserted" are logged as 'I', rows from "deleted" as 'D';
-- an UPDATE fires both and therefore logs a D/I pair.
INSERT INTO dbo.Table1_History
(Id, Data, Version, ModDate, ModUser, Operation)
SELECT Id, Data, Version, @now, SYSTEM_USER, 'I' FROM inserted;
INSERT INTO dbo.Table1_History
(Id, Data, Version, ModDate, ModUser, Operation)
SELECT Id, Data, Version, @now, SYSTEM_USER, 'D' FROM deleted;
END
The timestamp column automatically updates upon every change to the row.
This gives you current + history in your audit table (which simplifies reporting). The Version column gives you an easy lookup between Table1 and Table1_History, in the event that you want to know the exact audit details of the current row. Updates show up in the audit as a DELETE ('D') and an INSERT ('I') recorded at the same time.
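For example, a lookup of the audit details behind the current row might look like this (a sketch; @Id is an illustrative parameter):
-- The rowversion in Table1 matches the binary(8) copy captured in the history table.
SELECT h.*
FROM dbo.Table1 AS t
JOIN dbo.Table1_History AS h
ON h.Id = t.Id AND h.Version = t.Version
WHERE t.Id = @Id;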
If you are talking about a database table, then a common best practice is to never physically delete data from it. Instead, add a flag column that denotes the state of each row.
You can set it to 0 for active rows and 1 for deleted rows, so a delete becomes an UPDATE that flips the flag. The same idea can be applied to the other CRUD operations.
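A minimal sketch of the idea (the Orders table and column names are illustrative, not from the question):
-- Add a soft-delete flag; existing rows default to active.
ALTER TABLE dbo.Orders ADD IsDeleted BIT NOT NULL CONSTRAINT DF_Orders_IsDeleted DEFAULT (0);
-- A "delete" flips the flag instead of removing the row.
UPDATE dbo.Orders SET IsDeleted = 1 WHERE Id = @Id;
-- Reads filter to active rows only.
SELECT * FROM dbo.Orders WHERE IsDeleted = 0;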
I am using Microsoft SQL Server Web edition on Amazon RDS. The system is currently generating timeouts when updating one column, and I am trying to resolve the issue, or at least minimize it. The updates occur when a device calls in, and the devices call in a lot - to the point where a device may call back before the web server has finished handling its previous call.
Microsoft SQL Server Web (64-bit)
Version 13.0.4422.0
I see a couple of potential possibilities here. First, a device calls back before the system has finished handling its last call, so the same record is updated multiple times concurrently. Second, I am running into a row lock or table lock.
The table has about 3,000 records in total.
Note that I am only trying to update one column in one row at a time. The other columns are never updated.
The last-updated time does not need to be very accurate. Would there be any benefit to changing the code to only update the column if, say, more than a few minutes have passed, or would that just add more load to the server? Any suggestions on how to optimize this? Maybe move it to a function, a stored procedure, or something else?
Suggested new code:
UPDATE [Devices] SET [LastUpdated] = GETUTCDATE()
WHERE [Id] = @id AND
([LastUpdated] IS NULL OR DATEDIFF(MI, [LastUpdated], GETUTCDATE()) > 2);
Existing update code:
internal static async Task UpdateDeviceTime(ApplicationDbContext db, int deviceId, DateTime dateTime)
{
var parm1 = new System.Data.SqlClient.SqlParameter("@id", deviceId);
var parm2 = new System.Data.SqlClient.SqlParameter("@date", dateTime);
var sql = "UPDATE [Devices] SET [LastUpdated] = @date WHERE [Id] = @id";
// timeout occurs here.
var cnt = await db.Database.ExecuteSqlCommandAsync(sql, new object[] { parm1, parm2 });
}
Table creation script:
CREATE TABLE [dbo].[Devices](
[Id] [int] IDENTITY(1,1) NOT NULL,
[CompanyId] [int] NOT NULL,
[Button_MAC_Address] [nvarchar](17) NOT NULL,
[Password] [nvarchar](max) NOT NULL,
[TimeOffset] [int] NOT NULL,
[CreationTime] [datetime] NULL,
[LastUpdated] [datetime] NULL,
CONSTRAINT [PK_Devices] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[Devices] ADD CONSTRAINT [DF_Devices_CompanyId] DEFAULT ((1)) FOR [CompanyId]
GO
ALTER TABLE [dbo].[Devices] ADD CONSTRAINT [DF_Devices_TimeOffset] DEFAULT ((-5)) FOR [TimeOffset]
GO
ALTER TABLE [dbo].[Devices] ADD CONSTRAINT [DF_Devices_CreationTime] DEFAULT (getdate()) FOR [CreationTime]
GO
You should look into the cause using a tool such as Profiler, or other techniques for detecting blocking. I don't see why you would have a problem updating one column in a table with only 3,000 records. It might have something to do with your constraints.
If it really is a timing issue, then you can consider In-Memory OLTP, which is designed to handle this type of scenario.
The last-updated time could also be stored in a transaction-style table with a link back to this table, reading it back with a join using MAX(UpdatedTime). In that case you would never update; you would only add new records.
You can then use either partitioning or a cleanup routine to keep the size of this transaction table down.
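A rough sketch of that append-only approach (the DeviceUpdates table and names are illustrative):
-- One row per call-in; nothing is ever updated, so concurrent writers never contend for the same row.
CREATE TABLE dbo.DeviceUpdates (
DeviceId INT NOT NULL,
UpdatedTime DATETIME NOT NULL CONSTRAINT DF_DeviceUpdates_UpdatedTime DEFAULT (GETUTCDATE())
);
-- Each device call-in becomes an insert.
INSERT INTO dbo.DeviceUpdates (DeviceId) VALUES (@id);
-- The latest update time per device is derived on read.
SELECT d.Id, MAX(u.UpdatedTime) AS LastUpdated
FROM dbo.Devices AS d
LEFT JOIN dbo.DeviceUpdates AS u ON u.DeviceId = d.Id
GROUP BY d.Id;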
Programming patterns that In-Memory OLTP will improve include concurrency scenarios, point lookups, workloads where there are many inserts and updates, and business logic in stored procedures.
https://msdn.microsoft.com/library/dn133186(v=sql.120).aspx
I'm trying to use SSIS to move some data from one SQL Server to my destination SQL Server. The source has a table "Parent" whose identity field ID is used as a foreign key by the "Child" table.
1 - N relation
The question is simple: what is the best way to transfer the data to a different SQL Server while keeping the parent-child relation?
Note: both IDs (Parent and Child) are identity fields that we do not want to migrate, since the destination won't necessarily need to have them.
Please share your comments and ideas.
FYI: We wrote .NET code (C#) that does this: we have a query that gets the parent data and a query that gets the child data; using LINQ we join the two, then loop over the parents, getting each newly generated ID and inserting it as the reference in the child table. This works, but we want to build the same thing in SSIS to be able to scale later.
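As a point of comparison, the same old-to-new ID remapping can be done set-based in T-SQL with MERGE ... OUTPUT. This is only a sketch: it assumes the source tables are reachable from the destination (shown here as a database named SourceDb on the same instance; a four-part linked-server name works the same way) and borrows the Tbl_Parent/Tbl_Child schema from the answer below.
DECLARE @IdMap TABLE (OldId INT, NewId INT);
-- The never-true ON condition makes MERGE insert every source row; unlike a plain
-- INSERT...SELECT, its OUTPUT clause can see both the old (source) and new (inserted) IDs.
MERGE INTO dbo.Tbl_Parent AS tgt
USING SourceDb.dbo.Tbl_Parent AS src ON 1 = 0
WHEN NOT MATCHED THEN INSERT (Name) VALUES (src.Name)
OUTPUT src.ID, inserted.ID INTO @IdMap (OldId, NewId);
-- Children are re-pointed at the newly generated parent IDs.
INSERT INTO dbo.Tbl_Child (Parent_ID, Name)
SELECT m.NewId, c.Name
FROM SourceDb.dbo.Tbl_Child AS c
JOIN @IdMap AS m ON m.OldId = c.Parent_ID;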
You have to import the Parent table before the Child table:
First you have to create the tables on the destination server; you can achieve this using a query like the following:
CREATE TABLE [dbo].[Tbl_Child](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Parent_ID] [int] NULL,
[Name] [varchar](50) NULL,
CONSTRAINT [PK_Tbl_Child] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Tbl_Parent](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](50) NULL,
CONSTRAINT [PK_Tbl_Parent] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Tbl_Child] WITH CHECK ADD CONSTRAINT [FK_Tbl_Child_Tbl_Parent] FOREIGN KEY([Parent_ID])
REFERENCES [dbo].[Tbl_Parent] ([ID])
GO
ALTER TABLE [dbo].[Tbl_Child] CHECK CONSTRAINT [FK_Tbl_Child_Tbl_Parent]
GO
Add two OLE DB connection managers (source and destination).
Next you have to add a Data Flow Task to import the Parent table data from the source. You have to check the Keep Identity option.
Then add a second Data Flow Task to import the Child table data from the source, again with the Keep Identity option checked.
Workaround: you can disable the constraint, import the data, and then re-enable it, by adding an Execute SQL Task before and after the import.
Disable Constraint:
ALTER TABLE Tbl_Child NOCHECK CONSTRAINT FK_Tbl_Child_Tbl_Parent
Enable Constraint:
ALTER TABLE Tbl_Child CHECK CONSTRAINT FK_Tbl_Child_Tbl_Parent
If you use this workaround, it is not necessary to follow a particular order when importing.
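One caveat worth noting: re-enabling with CHECK CONSTRAINT alone does not re-validate the rows imported while the constraint was disabled, and it leaves the constraint marked untrusted. To have SQL Server verify the imported data and trust the constraint again, re-enable it with WITH CHECK:
ALTER TABLE Tbl_Child WITH CHECK CHECK CONSTRAINT FK_Tbl_Child_Tbl_Parent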
I've just taken over a project at work, and my boss has asked me to make it run faster. Great.
So I've identified one of the major bottlenecks: searching one particular table on our SQL Server, which can take up to a minute, sometimes longer, for a SELECT query with some filters on it to run. Below is the SQL generated by C# Entity Framework (minus all the GO statements):
CREATE TABLE [dbo].[MachineryReading](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Location] [geometry] NULL,
[Latitude] [float] NOT NULL,
[Longitude] [float] NOT NULL,
[Altitude] [float] NULL,
[Odometer] [int] NULL,
[Speed] [float] NULL,
[BatteryLevel] [int] NULL,
[PinFlags] [bigint] NOT NULL, -- Deprecated field, this is now stored in a separate table
[DateRecorded] [datetime] NOT NULL,
[DateReceived] [datetime] NOT NULL,
[Satellites] [int] NOT NULL,
[HDOP] [float] NOT NULL,
[MachineryId] [int] NOT NULL,
[TrackerId] [int] NOT NULL,
[ReportType] [nvarchar](1) NULL,
[FixStatus] [int] NOT NULL,
[AlarmStatus] [int] NOT NULL,
[OperationalSeconds] [int] NOT NULL,
CONSTRAINT [PK_dbo.MachineryReading] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
ALTER TABLE [dbo].[MachineryReading] ADD DEFAULT ((0)) FOR [FixStatus]
ALTER TABLE [dbo].[MachineryReading] ADD DEFAULT ((0)) FOR [AlarmStatus]
ALTER TABLE [dbo].[MachineryReading] ADD DEFAULT ((0)) FOR [OperationalSeconds]
ALTER TABLE [dbo].[MachineryReading] WITH CHECK ADD CONSTRAINT [FK_dbo.MachineryReading_dbo.Machinery_MachineryId] FOREIGN KEY([MachineryId])
REFERENCES [dbo].[Machinery] ([Id])
ON DELETE CASCADE
ALTER TABLE [dbo].[MachineryReading] CHECK CONSTRAINT [FK_dbo.MachineryReading_dbo.Machinery_MachineryId]
ALTER TABLE [dbo].[MachineryReading] WITH CHECK ADD CONSTRAINT [FK_dbo.MachineryReading_dbo.Tracker_TrackerId] FOREIGN KEY([TrackerId])
REFERENCES [dbo].[Tracker] ([Id])
ON DELETE CASCADE
ALTER TABLE [dbo].[MachineryReading] CHECK CONSTRAINT [FK_dbo.MachineryReading_dbo.Tracker_TrackerId]
The table has indexes on MachineryId, TrackerId, and DateRecorded:
CREATE NONCLUSTERED INDEX [IX_MachineryId] ON [dbo].[MachineryReading]
(
[MachineryId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
CREATE NONCLUSTERED INDEX [IX_MachineryId_DateRecorded] ON [dbo].[MachineryReading]
(
[MachineryId] ASC,
[DateRecorded] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
CREATE NONCLUSTERED INDEX [IX_TrackerId] ON [dbo].[MachineryReading]
(
[TrackerId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
When we select from this table, we are almost always interested in one machinery or tracker, over a given date range:
SELECT *
FROM MachineryReading
WHERE MachineryId = 2127 AND
DateRecorded > '2016-12-08 00:00:10.009' AND DateRecorded < '2016-12-11 18:32:41.734'
As you can see, it's quite a basic setup. The main problem is the sheer amount of data we put into it - about one row every ten seconds per tracker, and we have over a hundred trackers at the moment. We're currently sitting somewhere around 10-15 million rows. So this leaves me with two questions.
Am I thrashing the database if I insert 10 rows per second (without batching them)?
Given that this is historical data, so once it is inserted it will never change, is there anything I can do to speed up read access?
You have too many non-clustered indexes on the table, which increases the size of the database.
If you have an index on (MachineryId, DateRecorded), you don't really need a separate one on MachineryId alone.
With your 3 non-clustered indexes, there are 3 more copies of the indexed data to maintain.
Clustered vs. non-clustered
No INCLUDE columns on the non-clustered index
When SQL Server executes your SQL, it first searches the non-clustered index for the required data, then goes back to the original table (a bookmark lookup) and fetches the rest of the columns, because you are doing SELECT * and the non-clustered index doesn't contain all the columns. (That is what I think is happening - I can't really tell without the query plan.)
Include columns in non-clustered index: https://stackoverflow.com/a/1308325/1910735
You should maintain your indexes by creating a maintenance plan that checks for fragmentation and rebuilds or reorganizes them on a weekly basis.
I really think you should have a clustered index on (MachineryId, DateRecorded) instead of a non-clustered one. A table can only have one clustered index (it defines the order in which rows are stored on disk), and since most of your queries filter on MachineryId and DateRecorded, it is better to store the rows in that order.
Also, if you really do search by TrackerId in any query, try adding it to the same clustered index.
IMPORTANT NOTE: drop the non-clustered index in a TEST environment before going live.
Create a clustered index in place of your non-clustered one, run different queries, and check the performance by comparing the query plans and STATISTICS IO.
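A sketch of that swap (test environment only; the new index name is illustrative):
-- The table's primary key is already NONCLUSTERED, so the clustered index slot is free.
DROP INDEX [IX_MachineryId_DateRecorded] ON [dbo].[MachineryReading];
DROP INDEX [IX_MachineryId] ON [dbo].[MachineryReading]; -- redundant once the composite is clustered
CREATE CLUSTERED INDEX [CIX_MachineryReading_MachineryId_DateRecorded]
ON [dbo].[MachineryReading] ([MachineryId] ASC, [DateRecorded] ASC);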
Some resources for Index and SQL Query help:
Subscribe to the newsletter here and download the first responder kit:
https://www.brentozar.com/?s=first+responder
It is now open source, but I don't know whether it still includes the getting-started PDF and help files. (Subscribe via the above link anyway, for weekly articles/tutorials.)
https://github.com/BrentOzarULTD/SQL-Server-First-Responder-Kit
Tuning is per query, but in any case -
I see no partitioning and no index that serves this query's predicate, which means that, no matter what you do, it always results in a full table scan.
For your specific query -
create index ix_MachineryReading_MachineryId_DateRecorded
on MachineryReading (MachineryId, DateRecorded);
First, 10 inserts per second is very feasible under almost any reasonable circumstances.
Second, you need an index. For this query:
SELECT *
FROM MachineryReading
WHERE MachineryId = 2127 AND
DateRecorded > '2016-12-08 00:00:10.009' AND DateRecorded < '2016-12-11 18:32:41.734';
You need an index on MachineryReading(MachineryId, DateRecorded). That will probably solve your performance problem.
If you have similar queries for tracker, then you want an index on MachineryReading(TrackerId, DateRecorded).
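For instance, the tracker-side index might look like this (a sketch; note that the question's schema already defines the machinery-side equivalent as IX_MachineryId_DateRecorded):
CREATE NONCLUSTERED INDEX IX_MachineryReading_TrackerId_DateRecorded
ON dbo.MachineryReading (TrackerId, DateRecorded);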
These will slightly slow down the inserts, but the overall improvement should be so great that the net effect is a big win.
I have an existing app/database. I have been tasked with adding Entity Framework as part of an upgrade.
I've hit a problem where, when I generate (or regenerate) the edmx, the code no longer recognises the foreign keys in the database tables, and when the code runs it complains about missing IDs because, I assume, it is 'guessing' what the foreign keys should be.
I can get round this by adding the following attribute to the Auto generated model definitions.
[ForeignKey("NavigationProperty")]
But then, if / when the edmx is regenerated, all this gets blown away, and has to be re-added.
Although the generated class is partial, these attributes are added to existing members, so I cannot move them to a separate file.
So, how do I get round this? Ideally I'd like to ensure that when the edmx is generated it picks up the foreign keys, so that the issue is fixed permanently. If that can't be done, the next step is to ask whether there is some way of programmatically generating these associations, so it only has to be done once.
Thanks
edit - added a sample table definition
Here is the script auto-generated by SSMS. Is there anything wrong with the foreign key definition?
CREATE TABLE [dbo].[ShopProductTypes](
[id] [int] IDENTITY(1,1) NOT NULL,
[Shop_Id] [int] NOT NULL,
[Product_Id] [int] NOT NULL,
[CreatedDate] [datetime] NOT NULL,
[CancelledDate] [datetime] NULL,
[Archived] [bit] NOT NULL,
CONSTRAINT [PK_ShopProductTypes] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[ShopProductTypes] WITH CHECK ADD CONSTRAINT [FK_ShopProductTypes_Shop] FOREIGN KEY([Shop_Id])
REFERENCES [dbo].[Shops] ([Id])
GO
I found this:
http://blogs.msdn.com/b/dsimmons/archive/2007/09/01/ef-codegen-events-for-fun-and-profit-aka-how-to-add-custom-attributes-to-my-generated-classes.aspx
It's a bit more involved.
I'm trying to pull some data from a SQL table into my dataset using C#.
In this case I do not need all the columns, just a few specific ones. However, because I am not pulling back a column that is mandatory (NOT NULL), the copy of the table throws the exception
"Failed to enable constraints. One or more rows contain values violating non-null, unique, or foreign-key constraints."
I'm sure I could work around this by returning the non-nullable column to my table, but I want to avoid returning unnecessary data.
The query I am using, which throws this exception, is:
SELECT DeviceSerialNumber, BuildID, LEMSCredentialsID, LEMSSoftwareID, OwnerID, RestagedDate
FROM tblDevice
WHERE (DeviceSerialNumber = @SerialNumber)
This excludes the mandatory column "tblLocationID". In reality, though, this column is only mandatory when considering the database as a whole, not when I just need build and software details for use in my form.
I am trying to use this query in the following manner.
private DataTable dtDevice;
dtDevice = taDevice.GetDataByDeviceSN_ForRestage(txtDeviceSerial.Text);
I notice when browsing the preview data that Visual Studio draws columns that are not specified in my SQL, including tblLocationID; it does not, however, populate these columns with data.
Is there any way I can use this data in a temporary table without importing the non-nullable aspect of the column - preferably by not pulling through the non-selected columns at all?
For completeness, here's the definition (minus foreign key definitions) of the source table:
CREATE TABLE [dbo].[tblDevice](
[DeviceSerialNumber] [nvarchar](50) NOT NULL,
[Model] [nvarchar](50) NULL,
[ManufactureDate] [smalldatetime] NULL,
[CleanBootDate] [smalldatetime] NULL,
[BuildID] [int] NULL,
[Notes] [nvarchar](3000) NULL,
[AuditID] [int] NULL,
[LocationID] [int] NOT NULL,
[SimID] [int] NULL,
[LEMSCredentialsID] [int] NULL,
[LEMSSoftwareID] [int] NULL,
[OwnerID] [int] NULL,
[RestagedDate] [smalldatetime] NULL,
[Boxed] [bit] NULL,
CONSTRAINT [PK_tblDevice_1] PRIMARY KEY CLUSTERED
([DeviceSerialNumber] ASC) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]
I assume taDevice is a TableAdapter, generated with the typed dataset?
You could also generate a "FillDataByDeviceSN" method (you can generate both a Get and a Fill); then you get (pseudo-ish code):
tblDeviceDataTable dtService = new tblDeviceDataTable();
// Typed datasets expose each column as a <ColumnName>Column property; relaxing
// AllowDBNull stops the constraint exception for the unselected column.
dtService.tblLocationIDColumn.AllowDBNull = true;
taDevice.FillDataByDeviceSN(dtService, txtDeviceSerial.Text);
The column is mandatory because the metadata tells the client that to add any new rows (or alter rows), values for this column have to be provided. If you aren't altering the data and don't need two-way mapping, something more lightweight like a DataReader might be more appropriate.
Here are a few options:
Create an "untyped dataset" for just the fields you want to use.
OR
Change the NullValue property of the NOT NULL field in your typed dataset from the default of "throw exception" to "empty".
OR
Try what Patrick suggested: setting EnforceConstraints to false.