Linq to Sql check if multiple records exist - c#

When using Linq to Sql and updating something like a cross-reference table, there could already be records in the table that just need to stay there, records that will change, and records that should be removed. What is the best practice for handling this? I am thinking delete all and recreate. Is that bad?
Should I delete the reference records and repopulate all of them, naturally removing what is no longer needed and creating what is needed?
or
Should I attempt to perform some type of check, removing what is old and adding what is new?
or
what is a better way?

Linq to SQL is not the best tool for scenarios like this.
Basically you will have to write the updates/inserts/deletes all by yourself. You can use Exists(), Any(), etc. to create the sets, but the resulting SQL will still be individual inserts and updates.
It is a query language, after all.
In your case, I would do a merge through a stored procedure and call that.

A MERGE statement (StackOverFlow Merge Example) might work best in this situation. It allows multiple rows to be manipulated at the same time, at your discretion.
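For the cross-reference scenario in the question, a MERGE sent from C# might look like the following sketch. The table names (dbo.OrderTag, dbo.OrderTagStaging), columns, and connection string handling are all assumptions for illustration:

```csharp
using System.Data.SqlClient;

static void SyncCrossReference(string connectionString)
{
    // Hypothetical sketch: one MERGE keeps unchanged rows, inserts new
    // pairs, and deletes pairs no longer present in the staging table.
    const string mergeSql = @"
MERGE dbo.OrderTag AS target
USING (SELECT OrderId, TagId FROM dbo.OrderTagStaging) AS source
    ON target.OrderId = source.OrderId AND target.TagId = source.TagId
WHEN NOT MATCHED BY TARGET THEN
    INSERT (OrderId, TagId) VALUES (source.OrderId, source.TagId)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(mergeSql, connection))
    {
        connection.Open();
        command.ExecuteNonQuery(); // one round trip instead of row-by-row DML
    }
}
```

This avoids the delete-all-and-recreate approach entirely: existing pairs are simply left alone because neither `WHEN NOT MATCHED` branch fires for them.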

Related

C# Entity Framework: making sure each entry is unique

I'm making an API call with C#, getting back a JSON, breaking it down into nested objects, breaking each object into fields and putting the fields into an SQL Server table.
There is one field (OnlineURL) which should be unique.
What is an efficient way of achieving this goal? I currently make a database call for every nested object I pull out of the JSON and then use an if statement. But this is not efficient.
Database Layer
Creating a unique index/constraint for the OnlineURL field in the database will enforce the field being unique no matter what system/codebase references it. This will result in applications erroring on inserts of new records where the OnlineURL already exists or updating record X to an OnlineURL that is already being used by record Y.
Application Layer
What is the rule when OnlineURL already exists? Do you reject the data? Do you update the matching row? Maybe you want to leverage a stored procedure that will insert a new row based on OnlineURL or update the existing one. This will turn a 2 query process into a single query, which will have an impact on large scale inserts.
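The single-query upsert described above might be sketched like this, assuming an update-the-matching-row rule. The dbo.Pages table, its columns, and the one-time index are hypothetical names for illustration:

```csharp
using System.Data.SqlClient;

// One-time setup (run once, enforces the database-layer guarantee):
//   CREATE UNIQUE INDEX UX_Pages_OnlineURL ON dbo.Pages (OnlineURL);
static void UpsertPage(string connectionString, string onlineUrl, string title)
{
    // Try the UPDATE first; only INSERT if no row matched.
    const string upsertSql = @"
UPDATE dbo.Pages SET Title = @Title WHERE OnlineURL = @OnlineURL;
IF @@ROWCOUNT = 0
    INSERT INTO dbo.Pages (OnlineURL, Title) VALUES (@OnlineURL, @Title);";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(upsertSql, connection))
    {
        command.Parameters.AddWithValue("@OnlineURL", onlineUrl);
        command.Parameters.AddWithValue("@Title", title);
        connection.Open();
        command.ExecuteNonQuery();
    }
}
```

Note that under concurrent writers this pattern can still race between the UPDATE and INSERT; the unique index is what ultimately guarantees correctness by making the losing insert fail.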
Assuming your application is serial and is the only one working against the database, you could also keep a local cache of OnlineURLs for use during your loop: read the list in once from the database, check each incoming record against it, and add each new OnlineURL you insert to the list. Reading in the initial list is only a single query, and each comparison is done in memory.
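The in-memory check above can be sketched with a HashSet. Here the database read and the incoming JSON records are replaced with plain lists so the logic is visible; in real code the first list would come from a single SELECT:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var existing = new List<string> { "http://a.example", "http://b.example" };
var incoming = new List<string> { "http://b.example", "http://c.example" };

// Seed the cache with the URLs already in the database.
var seen = new HashSet<string>(existing, StringComparer.OrdinalIgnoreCase);

// HashSet.Add returns false for duplicates, so this filter keeps only
// genuinely new URLs and updates the cache in one pass.
var toInsert = incoming.Where(url => seen.Add(url)).ToList();

Console.WriteLine(string.Join(", ", toInsert)); // http://c.example
```

Each membership check is an O(1) hash lookup instead of a database round trip.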
Create an index for that field and it will be.
It is necessary to check the uniqueness, and that can't be fulfilled if you don't query the data. That means you will have to check the entire data set in that column. Your first option is to improve the query with an index with a fill factor of 80, so you can avoid unnecessary page splits caused by the inserts.
Another option is to use caching, which depends on your setup.
You could load the entire column into memory and check for uniqueness there, or you could use a distributed cache like Redis. Either way, analyze the complexity costs, and you'll probably find that the index is the most ergonomic option.

Inserting/updating huge amount of rows into SQL Server with C#

I have to parse a big XML file and import (insert/update) its data into various tables with foreign key constraints.
So my first thought was: I create a list of SQL insert/update statements and execute them all at once by using SqlCommand.ExecuteNonQuery().
Another method I found was shown by AMissico: Method
where I would execute the SQL commands one by one. No one complained, so I think it's also a viable practice.
Then I found out about SqlBulkCopy, but it seems that I would have to create a DataTable with the data I want to upload; so, one SqlBulkCopy per table. For this I could create a DataSet.
I think every option supports SqlTransaction. It's approximately 100 - 20000 records per table.
Which option would you prefer and why?
You say that the XML is already in the database. First, decide whether you want to process it in C# or in T-SQL.
C#: You'll have to send all data back and forth once, but C# is a far better language for complex logic. Depending on what you do it can be orders of magnitude faster.
T-SQL: No need to copy data to the client but you have to live with the capabilities and perf profile of T-SQL.
Depending on your case one might be far faster than the other (not clear which one).
If you want to compute in C#, use a single streaming SELECT to read the data and a single SqlBulkCopy to write it. If your writes are not insert-only, write to a temp table and execute as few DML statements as possible to update the target table(s) (maybe a single MERGE).
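The C# path described above might be sketched as follows. The staging table layout, dbo.Target, and its columns are assumptions; `transformedRows` stands in for whatever IDataReader or DataTable your parsing produces:

```csharp
using System.Data;
using System.Data.SqlClient;

static void WriteBack(SqlConnection connection, IDataReader transformedRows)
{
    // Temp table lives on this connection, so all three steps must share it.
    new SqlCommand(
        "CREATE TABLE #Staging (Id INT PRIMARY KEY, Value NVARCHAR(100));",
        connection).ExecuteNonQuery();

    // One streaming bulk insert instead of thousands of INSERT statements.
    using (var bulk = new SqlBulkCopy(connection))
    {
        bulk.DestinationTableName = "#Staging";
        bulk.WriteToServer(transformedRows);
    }

    // A single set-based MERGE applies all inserts and updates at once.
    new SqlCommand(@"
MERGE dbo.Target AS t
USING #Staging AS s ON t.Id = s.Id
WHEN MATCHED THEN UPDATE SET t.Value = s.Value
WHEN NOT MATCHED THEN INSERT (Id, Value) VALUES (s.Id, s.Value);",
        connection).ExecuteNonQuery();
}
```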
If you want to stay in T-SQL minimize the number of statements executed. Use set-based logic.
All of this is simplified/shortened. I left out many considerations because they would be too long for a Stack Overflow answer. Be aware that the best strategy depends on many factors. You can ask follow-up questions in the comments.
Don't do it from C# unless you have to; it's a huge overhead, and SQL can do it much faster and better by itself.
Insert to table from XML file using INSERT INTO SELECT
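That INSERT INTO SELECT approach could look like the sketch below, letting SQL Server shred the XML itself. The element names and the dbo.Items table are hypothetical:

```csharp
using System.Data;
using System.Data.SqlClient;

static void InsertFromXml(string connectionString, string xmlPayload)
{
    // nodes() turns each /Items/Item element into a row; value() extracts
    // typed columns from it. One statement inserts everything.
    const string sql = @"
DECLARE @xml XML = @payload;
INSERT INTO dbo.Items (Id, Name)
SELECT x.value('(Id)[1]', 'INT'),
       x.value('(Name)[1]', 'NVARCHAR(100)')
FROM @xml.nodes('/Items/Item') AS n(x);";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        command.Parameters.Add("@payload", SqlDbType.Xml).Value = xmlPayload;
        connection.Open();
        command.ExecuteNonQuery();
    }
}
```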

Entity Framework and loading all database entries vs loading each as required

I have a scenario where I need to synchronize a database table with a list (XML) from an external system.
I am using EF but am not sure which would be the best way to achieve this in terms of performance.
There are 2 ways to do this as I see it, but neither seems efficient to me.
Call Db each time
- Read each entry from the XML
- Try to retrieve the entry from the list
- If no entry found, add the entry
- If found, update timestamp
- At end of loop, delete all entries with older timestamp
Load All Objects and work in memory
- Read all EF objects into a list
- Delete all EF objects
- Add an item for each item in the XML
- Save changes to Db
The lists are not that long, estimating around 70k rows. I don't want to clear the db table before inserting the new rows, as this table is a source of data for a webservice, and I don't want to lock the table while it's possible to query it.
If I was doing this in T-SQL I would most likely insert the rows into a temp table and join to find missing and deleted entries, but I have no idea what the best way to handle this in Entity Framework would be.
Any suggestions / ideas ?
The general problem with Entity Framework is that, when changing data, it will fire a query for each changed record anyway, regardless of lazy or eager loading. So by nature, it will be extremely slow (think of factor 1000+).
My suggestion is to use a stored procedure with a table valued parameter and ignore Entity Framework all together. You could use a merge statement.
70k rows is not much, but 70k insert/update/delete statements is always going to be very slow.
You could test it and see if the performance is manageable, but intuition says Entity Framework is not the way to go.
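The table-valued parameter approach suggested above might look like this sketch. The type name, procedure, table, and columns are all assumptions for illustration:

```csharp
using System.Data;
using System.Data.SqlClient;

// One-time setup on the server (hypothetical names):
//   CREATE TYPE dbo.SyncRow AS TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));
//   CREATE PROCEDURE dbo.SyncItems @rows dbo.SyncRow READONLY AS
//     MERGE dbo.Items AS t USING @rows AS s ON t.Id = s.Id
//     WHEN MATCHED THEN UPDATE SET t.Name = s.Name
//     WHEN NOT MATCHED BY TARGET THEN INSERT (Id, Name) VALUES (s.Id, s.Name)
//     WHEN NOT MATCHED BY SOURCE THEN DELETE;
static void SyncItems(string connectionString, DataTable rows)
{
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("dbo.SyncItems", connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        var p = command.Parameters.AddWithValue("@rows", rows);
        p.SqlDbType = SqlDbType.Structured;
        p.TypeName = "dbo.SyncRow";
        connection.Open();
        command.ExecuteNonQuery(); // one round trip for all 70k rows
    }
}
```

The DataTable passed in would have matching Id and Name columns filled from the parsed XML, and the MERGE performs the insert/update/delete reconciliation set-based on the server.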
I would iterate over the elements in the XML and update the corresponding row in the DB one at a time. I guess that's what you meant with your first option? As long as you have a good query plan to select each row, that should be pretty efficient. Like you said, 70k rows isn't that much so you are better off keeping the code straightforward rather than doing something less readable for a little more speed.
It depends. It's OK to use EF if there won't be many changes (say, fewer than a few hundred). Otherwise, you need to bulk insert into the database and merge the rows inside the database.

update same datatable using Linq without loop

I have one DataTable with 26 columns in it.
I need to update a specific column based on a filter,
but I don't want to do it using iteration because it has thousands of records. That will affect performance.
Is there any way to do that?
I am new to LINQ, so I searched for this but am not getting a proper solution.
There are some solutions, but I cannot understand them.
Does anyone have a solution?
This is where you have to either drop into ADO or seriously customize LINQ or EF.
Bulk inserts and updates are not something it does nicely.
Is batch or bulk insert possible in Linq 2 Sql ?
the same goes for EF.
Multiple-row updates are not supported by EF. For this you are better off using a stored procedure; this is the reason EF provides support for executing stored procedures. Use it and enjoy :)

Fastest way to insert many records in the database

I am inserting records into the database (100, 1,000, 10,000 and 100,000 rows) using 2 methods
(it is a table with no primary key and no index):
- using a for loop and inserting one by one
- using a stored procedure
The times are, of course, better using the stored procedure.
My questions are: 1) if I use an index, will the operation go faster, and 2) is there any other way to do the insertion?
PS: I am using iBATIS as the ORM, if that makes any difference.
Check out SqlBulkCopy.
It's designed for fast insertion of bulk data. I've found it to be fastest when using the TableLock option and setting a BatchSize of around 10,000, but it's best to test the different scenarios with your own data.
You may also find the following useful.
SQLBulkCopy Performance Analysis
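The options mentioned above can be sketched as follows. The destination table name is an assumption, and 10,000 is only a starting batch size to tune against your own data:

```csharp
using System.Data;
using System.Data.SqlClient;

static void BulkInsert(string connectionString, DataTable rows)
{
    // TableLock takes a bulk-update lock on the whole table, which is
    // usually the fastest option when no one else needs concurrent access.
    using (var bulk = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock))
    {
        bulk.DestinationTableName = "dbo.TargetTable";
        bulk.BatchSize = 10000; // rows sent per batch; benchmark to tune
        bulk.WriteToServer(rows);
    }
}
```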
No, I suspect that if you use an index, it will actually go slower, because it has to update the index as well as insert the data.
If you're reasonably certain that the data won't have duplicate keys, add the index after you've inserted all the rows. That way, it is built once rather than being added to and re-balanced on every insert.
That's a function of the DBMS. I know it's true for the one I use frequently (which is not SQLServer).
I know this is slightly off-topic, but it's a shame you're not using SQL Server 2008, as there's been a massive improvement in this area with the advent of the MERGE statement and user-defined table types (which allow you to pass a 'table' of data in to the stored procedure or statement so you can insert/update many records in one go).
For some more information, have a look at http://www.sql-server-helper.com/sql-server-2008/merge-statement-with-table-valued-parameters.aspx
It was already discussed : Insert data into SQL server with best performance.
