I have two tables in two different SQL Server databases on different servers. Each table has the same number of rows (~65,000 each), and they are related by a common ID column. I created an object whose properties are read and populated from these two tables.
I read all items from the first table, create an instance of my object for each one, and set its properties, then add it to a List. Once that is done, I read from the second table to populate the remaining properties of each object.
What would be the best way to do this? Do you have suggestions? At the moment I loop through my List and, for each object in the list, fetch data from the second server. Naturally this is time-consuming (~15 minutes).
For example, would it be possible to create a temporary table on one of the servers? The time is consumed in my loop; the time spent retrieving data from the two servers is fine for me.
A solution would be to create a linked server:
http://msdn.microsoft.com/en-us/library/ms188279.aspx
Using this method, you can reference a database from a different server as if it were on the same server. This should allow you to do a join to complete your dataset.
The best approach is to add one of the servers as a linked server using sp_addlinkedserver. Then you can run your query against one server, joining both tables on the common field, and get everything in one shot.
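As a rough sketch of that approach (the server, database, table, and column names below are hypothetical placeholders, as is the connection string), the whole dataset could be fetched in a single joined query from C#:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

// Placeholder connection string for the server that hosts the linked server.
const string connectionString = "Server=.;Database=FirstDb;Integrated Security=true";

// Join the local table to the remote one through the linked server in one
// round trip, instead of querying the second server once per object.
const string query = @"
    SELECT a.Id, a.ColumnFromFirst, b.ColumnFromSecond
    FROM dbo.FirstTable AS a
    INNER JOIN LinkedServerName.SecondDb.dbo.SecondTable AS b
        ON b.Id = a.Id;";

var items = new List<(int Id, string First, string Second)>();

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(query, connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            // Each object is fully populated in this single pass.
            items.Add((reader.GetInt32(0), reader.GetString(1), reader.GetString(2)));
        }
    }
}

With ~65,000 rows on each side, this turns 65,000 per-object remote lookups into one set-based join done on the server.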
I've hit a wall when it comes to adding a new entity object (a regular SQL table) to the Data Context using LINQ-to-SQL. This isn't about the drag-and-drop method that is cited regularly across many other threads; that method has worked repeatedly without issue.
The end goal is relatively simple. I need to find a way to add a table that gets created during runtime via stored procedure to the current Data Context of the LINQ-to-SQL dbml file. I'll then need to be able to use the regular LINQ query methods/extension methods (InsertOnSubmit(), DeleteOnSubmit(), Where(), Contains(), FirstOrDefault(), etc...) on this new table object through the existing Data Context. Essentially, I need to find a way to procedurally create the code that would otherwise be automatically generated when you use the drag-and-drop method during development (when the application isn't running), but generate this same code while the application is running, via a command and/or event trigger.
More Detail
There's one table that gets used a lot and, over the course of an entire year, collects many thousands of rows. Each row contains a timestamp and this table needs to be divided into multiple tables based on the year that the row was added.
Current Solution (using one table)
Single table with tens of thousands of rows which are constantly queried against.
Table is added to Data Context during development using drag-and-drop, so there are no additional coding issues
Significant performance decrease over time
Goals (using multiple tables)
(Complete) While the application is running, use C# code to check if a table for the current year already exists. If it does, no action is taken. If not, a new table gets created using a stored procedure with the current year as a prefix on the table name (2017_TableName, 2018_TableName, 2019_TableName, and so on...).
(Incomplete) While the application is still running, add the newly created table to the active LINQ-to-SQL Data Context (the same code that would otherwise be added using drag-and-drop during development).
(Incomplete) Run regular LINQ queries against the newly added table.
Final Thoughts
Other than the above, my only other concern is how to write the C# code that references a table that may or may not already exist. Is it possible to use a variable in place of the standard 'DB_DataContext.2019_TableName' methodology in order to actually get the table's data into a UI control? Is there a way to simply create an Enumerable of all the tables where the name is prefixed with a year and then select the most current table?
From what I've read so far, the most likely solution seems to involve the use of a SQL add-on like SQLMetal or Huagati which (based solely from what I've read) will generate the code I need during runtime and update the corresponding dbml file. I have no experience using these types of add-ons, so any additional insight into these would be appreciated.
Lastly, I've seen some references to LINQ-to-Entities and/or LINQ-to-Objects. Would these be the components I'm looking for?
Thanks for reading through a rather lengthy first post. Any comments/criticisms are welcome.
The simplest way to achieve what you want is to redirect in SQL Server and leave your client code alone. At design time, create your L2S DataContext or EF DbContext referencing a database with only a single table. Then at run time, substitute a view or synonym for that table that points to the "current year" table.
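A minimal sketch of that run-time redirect using a synonym (every object name and the connection string here are hypothetical; the idea is that the Data Context was generated against dbo.TableName, which is the synonym rather than a real table):

using System;
using System.Data.SqlClient;

const string connectionString = "Server=.;Database=AppDb;Integrated Security=true";

// Repoint the synonym at the current year's table, e.g. dbo.[2019_TableName].
string sql = $@"
    IF OBJECT_ID('dbo.TableName', 'SN') IS NOT NULL
        DROP SYNONYM dbo.TableName;
    CREATE SYNONYM dbo.TableName FOR dbo.[{DateTime.Now.Year}_TableName];";

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(sql, connection))
{
    connection.Open();
    command.ExecuteNonQuery();
    // From here on, existing LINQ queries against dbo.TableName
    // transparently hit the current year's table.
}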
HOWEVER, this should not be necessary in the first place. SQL Server supports partitioning, so you can store all the data in physically separate structures but have a single logical table. And SQL Server supports columnstore tables, which can compress and store many millions of rows with excellent performance.
I have two tables in different databases. The tables are exactly alike (same name, same columns, etc.). My question is: how can I retrieve new rows from the parent table and store them in the child table? I need to do this in a button's click event.
Thanks in advance.
There are several technologies specifically for this type of scenario:
SQL Replication
Supports unidirectional or bidirectional synchronization
SSIS
Lets you define the mappings of the data, as well as transformations, and attach other code to the process easily
Linked servers
Allows you to query databases and tables on remote servers as though they were part of the local database. Very easy to set up (just call exec sp_addlinkedserver), and once defined it uses nothing but plain old SQL
Since you mention this needs to occur on a button click, I'd suggest you use linked servers within a stored procedure; they're the simplest option. SSIS would also be suitable, but you'd need to execute the package on the button click.
I resolved it myself using a linked server. Here is a simple tutorial on how to create a linked server.
After creating the linked server, we can query it as follows:
SELECT * FROM LinkedServerName.DatabaseName.dbo.TableName
Works just perfect!!
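For pulling only the new rows from the parent into the child table on the button click, something like this could work over the linked server (all names and the connection string are placeholders, and the guard assumes Id is the key):

using System;
using System.Data.SqlClient;
using System.Windows.Forms;

public partial class MainForm : Form
{
    // Placeholder connection string for the child database.
    private const string connectionString = "Server=.;Database=ChildDb;Integrated Security=true";

    private void copyButton_Click(object sender, EventArgs e)
    {
        // Copy only the parent rows that are not yet in the child table.
        const string copyNewRows = @"
            INSERT INTO dbo.TableName (Id, Col1, Col2)
            SELECT p.Id, p.Col1, p.Col2
            FROM LinkedServerName.DatabaseName.dbo.TableName AS p
            WHERE NOT EXISTS (SELECT 1 FROM dbo.TableName AS c WHERE c.Id = p.Id);";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(copyNewRows, connection))
        {
            connection.Open();
            int copied = command.ExecuteNonQuery();
        }
    }
}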
Accepting STW's answer as he explains different approaches.
(A long and non-optimal solution; see the sketch after this list.)
Get all IDs from the first table.
Get all IDs from the second table.
Loop through the first collection and remove every item that is also found in the second; whatever remains is new.
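A quick sketch of that set difference in C#, assuming the IDs fit comfortably in memory (the two lists stand in for whatever you read from the tables):

using System.Collections.Generic;
using System.Linq;

// IDs read from the two tables (placeholder values here).
List<int> parentIds = new List<int> { 1, 2, 3, 4 };
List<int> childIds = new List<int> { 1, 2 };

// A HashSet gives O(1) lookups, so the whole difference is O(n)
// rather than O(n^2) for a nested loop over two lists.
var existing = new HashSet<int>(childIds);
List<int> newIds = parentIds.Where(id => !existing.Contains(id)).ToList(); // { 3, 4 }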
I have one database server, acting as the main SQL Server, containing a table to hold all the data. Other database servers (different SQL Server instances) come and go. When they come online, they need to download data from the main table (for a given time period); they then generate their own additional data into the same local SQL Server database table, and then want to update the main server with only the new data, using a C# program run by a scheduled service every so often. Multiple additional servers could be generating data at the same time, although there won't be that many.
The main table will always be online. The additional, non-main database table is not always online and should not be an identical copy of the main one: first it will contain a subset of the main data, then it generates its own additional data into the local table, and it updates the main table with those additions every so often. A fair number of rows could be generated and/or downloaded, so an efficient algorithm is needed to copy from the extra database to the main table.
What is the most efficient way to transfer this in C#? SqlBulkCopy doesn't look like it will work, because I can't have duplicate entries on the main server and it would fail the constraint checks since some entries already exist.
You could do it in the DB or in C#. In either case, you must do something like Using FULL JOINs to Compare Datasets. You know that already.
The most important thing is to do it in a transaction. If you have 100k rows, split them into 1,000 rows per transaction, or experiment to determine what number of rows per transaction is best for you.
Use Dapper. It's really fast.
If you have all your data in C#, use a TVP (table-valued parameter) to pass it to a DB stored procedure. In the stored procedure, use MERGE to UPDATE/DELETE/INSERT the data.
And lastly: in C#, use a Dictionary<TKey, TValue> or something similar with O(1) access time.
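A rough sketch of the TVP route (the table type, procedure name, column names, and connection string are all hypothetical; the MERGE itself lives in the stored procedure on the server):

using System.Data;
using System.Data.SqlClient;

const string connectionString = "Server=MainServer;Database=MainDb;Integrated Security=true";

// Build a DataTable matching a user-defined table type, e.g.:
//   CREATE TYPE dbo.RowTvp AS TABLE (Id INT PRIMARY KEY, Payload NVARCHAR(100));
var rows = new DataTable();
rows.Columns.Add("Id", typeof(int));
rows.Columns.Add("Payload", typeof(string));
rows.Rows.Add(1, "example");

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("dbo.UpsertRows", connection))
{
    command.CommandType = CommandType.StoredProcedure;

    // Pass the whole batch as one table-valued parameter; dbo.UpsertRows
    // then runs a MERGE against the main table.
    var parameter = command.Parameters.AddWithValue("@Rows", rows);
    parameter.SqlDbType = SqlDbType.Structured;
    parameter.TypeName = "dbo.RowTvp";

    connection.Open();
    command.ExecuteNonQuery();
}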
SqlBulkCopy is the fastest way to insert data into a table from a C# program. I have used it to copy data between databases, and so far nothing beats it speed-wise. Here is a nice generic example: Generic bulk copy.
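For reference, a minimal SqlBulkCopy sketch (table, column names, and connection string are placeholders; as the question notes, it inserts blindly and won't skip duplicates on its own):

using System.Data;
using System.Data.SqlClient;

const string connectionString = "Server=MainServer;Database=MainDb;Integrated Security=true";

// The locally generated rows; built inline here just for brevity.
var sourceTable = new DataTable();
sourceTable.Columns.Add("Id", typeof(int));
sourceTable.Columns.Add("Payload", typeof(string));
sourceTable.Rows.Add(1, "example");

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.MainTable";
        bulkCopy.BatchSize = 1000; // commit server-side in batches
        bulkCopy.WriteToServer(sourceTable);
    }
}

A common workaround for the duplicate problem is to bulk-copy into an empty staging table and then MERGE from there into the real table.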
I would use an IsProcessed flag in the table on the main server and keep track of the main table's primary keys when you download data to the local DB server. Then you should be able to do a delete and an update against the main server again.
Here's how I would do it:
Create a stored procedure on the main table's database which receives a user-defined table type parameter with the same structure as the main table.
It should do something like:
INSERT INTO YourTable SELECT * FROM @TableVar
Or you could use the MERGE statement for insert-or-update functionality.
In code (a Windows service), load all (or part) of the data from the secondary table and send it to the stored procedure as a table-valued parameter.
You could do it in batches of 1,000, and each time a batch is uploaded, mark it in the source table / source-updater code (see the sketch below).
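That loop could look something like this (Row, LoadUnprocessedRows, UpsertBatch, and MarkBatchProcessed are all hypothetical; UpsertBatch would send one table-valued parameter to the stored procedure described above):

using System.Collections.Generic;
using System.Linq;

const int batchSize = 1000;

// Hypothetical: the local rows not yet pushed to the main server.
List<Row> pendingRows = LoadUnprocessedRows();

for (int offset = 0; offset < pendingRows.Count; offset += batchSize)
{
    var batch = pendingRows.Skip(offset).Take(batchSize).ToList();
    UpsertBatch(batch);        // hypothetical: one TVP call per batch
    MarkBatchProcessed(batch); // hypothetical: flag rows so a restart doesn't resend them
}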
Can you use linked servers for this? If so, they will make copying data to and from the main server much easier.
When copying data back to the main server, I'd use an IF NOT EXISTS check before each INSERT statement to make sure there are no duplicates, and wrap all the INSERT statements in a transaction so that if an error occurs the transaction is rolled back.
I also agree with the others on doing this in batches of 1,000 or so records, so that if something goes wrong you can limit the damage.
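A sketch of that pattern (the table, columns, connection string, and the batch collection are placeholders):

using System.Data.SqlClient;

// Insert the row only when it is not already on the main server.
const string guardedInsert = @"
    IF NOT EXISTS (SELECT 1 FROM dbo.MainTable WHERE Id = @Id)
        INSERT INTO dbo.MainTable (Id, Payload) VALUES (@Id, @Payload);";

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    {
        try
        {
            foreach (var row in batch) // one batch of ~1,000 rows (placeholder)
            {
                using (var command = new SqlCommand(guardedInsert, connection, transaction))
                {
                    command.Parameters.AddWithValue("@Id", row.Id);
                    command.Parameters.AddWithValue("@Payload", row.Payload);
                    command.ExecuteNonQuery();
                }
            }
            transaction.Commit();
        }
        catch
        {
            transaction.Rollback(); // any failure undoes the whole batch
            throw;
        }
    }
}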
I'm writing a program in C# that will grab data from a staging table and then insert that same data back into its new respective locations in a SQL Server database. The program will do the following steps sequentially:
Select columns from first row of staging table
Store each column as unique variable
Insert data into new respective locations in the database (each value is going to multiple different tables in the DB, and the values are duplicated between many of the tables)
Move to the next Record
Repeat from step 1 until all records have been processed
So is there a way to iterate through the entire record set, storing each result from a column as a unique variable without having to write separate queries for each value that you want to store? There are 51 columns that all have to go somewhere, and I didn't think it would be very efficient to hardcode 51 variables each with a custom query to the database.
I thought about doing this with a multidimensional array, but then that would just be one string with a ton of values. Any advice would be greatly appreciated.
Although you can do this through a .NET application, really this would be much easier to achieve with a SQL statement. SQL has good syntax for moving data between tables:
INSERT INTO [Destination] ([Columns,])
SELECT [Columns,]
FROM [Source]
If you're moving data between databases, you just need to link one of the databases to the other and then run the query. If you're using SQL Server Management Studio, you can follow this article to set up linked servers. Otherwise, you can use the sp_addlinkedserver procedure to register the linked server.
You can create a class that contains a property for each column in your table and use a micro ORM like Dapper to populate a list of instances of those classes from your database. You can then iterate over the list and do your inserts to other tables.
You could even create other classes for your individual inserts and use AutoMapper to create instances of those from your source class.
But... this might all be overkill for what you are trying to achieve.
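To make the Dapper idea concrete, a small sketch (the StagingRow class, column names, and connection string are hypothetical; the real class would carry all 51 properties):

using System.Collections.Generic;
using System.Data.SqlClient;
using Dapper;

const string connectionString = "Server=.;Database=StagingDb;Integrated Security=true";

using (var connection = new SqlConnection(connectionString))
{
    // Dapper maps columns to properties by name: no per-column queries needed.
    List<StagingRow> rows = connection
        .Query<StagingRow>("SELECT * FROM dbo.StagingTable")
        .AsList();

    foreach (var row in rows)
    {
        // One parameterized insert per destination table; Dapper binds
        // @Id and @CustomerName from the row object's properties.
        connection.Execute(
            "INSERT INTO dbo.Customers (Id, Name) VALUES (@Id, @CustomerName)",
            row);
    }
}

public class StagingRow
{
    // One property per staging-table column (only a few shown here).
    public int Id { get; set; }
    public string CustomerName { get; set; }
    public decimal Amount { get; set; }
}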
I have built a backend system that allows a user to add multiple content sections, widgets, etc.
I want to keep the queries to the SQL server to a minimum for performance reasons. This is my current flow:
I check my main table to see which widgets have been added.
I run through each row and build the 'batch' SQL query that gets content from multiple tables.
I call the completed list of queries.
I populate the results into a DataSet.
Now for the problem:
The tables will never be in the same order, and I can't find a way to name the returned tables.
Is it best to just dedicate a column in each returned DataTable to specify what it actually is, and loop through the DataSet?
Or is there actually a way of naming the returned tables?
There is no way to do it automatically, as far as I know. You can give table mappings a try: http://geekswithblogs.net/dotNETvinz/archive/2009/08/03/why-dataset-creates-tablen-as-the-default-table-name.aspx
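A quick sketch of table mappings with a SqlDataAdapter (the batch SQL and mapped names are placeholders): by default the adapter names the result tables Table, Table1, Table2, and so on, and the mappings rename them as the DataSet is filled. Note this still assumes you know which query sits at which position in the batch.

using System.Data;
using System.Data.SqlClient;

const string connectionString = "Server=.;Database=CmsDb;Integrated Security=true";
const string batchSql = @"
    SELECT * FROM dbo.Widgets;
    SELECT * FROM dbo.ContentSections;";

using (var connection = new SqlConnection(connectionString))
using (var adapter = new SqlDataAdapter(batchSql, connection))
{
    // Map the default positional names to meaningful ones.
    adapter.TableMappings.Add("Table", "Widgets");
    adapter.TableMappings.Add("Table1", "ContentSections");

    var dataSet = new DataSet();
    adapter.Fill(dataSet);

    DataTable widgets = dataSet.Tables["Widgets"];
}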
Nobody replied with an answer for specifying DataTables in a DataSet, so I ended up adding a column to each DataTable, making it unique and "searchable", which solved my problem.
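That approach could look something like this (the SourceTable marker column is hypothetical; each query in the batch would select it as a literal, e.g. SELECT 'Widgets' AS SourceTable, * FROM dbo.Widgets):

using System.Data;

// dataSet was filled from the batch query as before.
foreach (DataTable table in dataSet.Tables)
{
    // Use the marker column to give each table a searchable name.
    if (table.Rows.Count > 0 && table.Columns.Contains("SourceTable"))
        table.TableName = table.Rows[0].Field<string>("SourceTable");
}

DataTable widgets = dataSet.Tables["Widgets"];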