I am currently working on an ASP.NET application which allows a user to upload a CSV file and then read and update values in a SQL table. Typically there could be 500 records and 8 parameters.
Using SqlBulkCopy, the CSV file can be uploaded directly into a table. But this option requires writing all the logic in a separate stored procedure and calling it after the bulk copy.
If there are any other approaches I could follow to achieve this, please let me know.
Note: A table-valued parameter is not an option, since this is SQL Server 2005.
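For reference, a minimal sketch of the SqlBulkCopy-plus-stored-procedure approach described above. The staging table CsvStaging, the stored procedure usp_ProcessCsvStaging, and the connection string are assumptions for illustration, not part of the original question:

    // Sketch: bulk copy parsed CSV rows into a staging table, then run the
    // processing logic that lives in a stored procedure (names are hypothetical).
    using System.Data;
    using System.Data.SqlClient;

    public static class CsvUploader
    {
        public static void Upload(DataTable csvRows, string connectionString)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();

                // Push all rows to the staging table in one round trip.
                using (var bulk = new SqlBulkCopy(connection))
                {
                    bulk.DestinationTableName = "dbo.CsvStaging";
                    bulk.WriteToServer(csvRows);
                }

                // Apply the update/insert logic that was written as a stored procedure.
                using (var cmd = new SqlCommand("dbo.usp_ProcessCsvStaging", connection))
                {
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }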
Summary
I have a requirement to modify the contents of a database based on an input .txt file. Each file modifies thousands of records in the database, and each one is a business transaction (around 50 transactions will be performed daily).
My application will read that .txt file and perform the modifications to the data in a SQL Server database.
The current application imports the data from the database, performs the modifications in memory (in a DataTable), and then pushes the results back using SqlBulkCopy into a SQL Server 2008 database table.
Does anyone know of a way to use SqlBulkCopy while preventing duplicate rows, without a primary key? Or any suggestion for a different way to do this?
Already implemented and dropped because of performance issues:
Before this I was using automatically generated SQL statements for the data modifications, but that was really slow, so I thought of loading the complete database table into a DataTable (C#) in memory, performing the lookups and modifications there, and accepting the changes in memory ...
Here is one more approach I could implement; please give me some feedback on it and correct me if I am wrong ...
Steps
Load the database table into a C# DataTable (fill the DataTable using a SqlDataAdapter)
Once the DataTable is in memory, perform the data modifications on it
Load the base table from the database again and compare it in memory to prepare the records that do not yet exist
Finally, push the new records to the database using a bulk insert (SqlBulkCopy)
I can't have a primary key!
Please give me any suggestions about my workflow, and tell me whether I am taking the right approach to my problem.
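A minimal sketch of the compare-then-bulk-insert steps above, assuming (purely for illustration) a target table dbo.Items with columns Code and Amount and no primary key; duplicates are filtered in memory by comparing a composite key built from the row values:

    // Sketch: filter out rows that already exist (no primary key available),
    // then bulk insert only the new rows. Table and column names are hypothetical.
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    public static class NoPkImporter
    {
        public static void InsertNewRows(DataTable modifiedRows, string connectionString)
        {
            var existingKeys = new HashSet<string>();

            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();

                // Build an in-memory set of composite keys for the rows already in the table.
                using (var cmd = new SqlCommand("SELECT Code, Amount FROM dbo.Items", connection))
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        existingKeys.Add(reader["Code"] + "|" + reader["Amount"]);
                    }
                }

                // Keep only rows whose composite key is not already present.
                DataTable newRows = modifiedRows.Clone();
                foreach (DataRow row in modifiedRows.Rows)
                {
                    string key = row["Code"] + "|" + row["Amount"];
                    if (!existingKeys.Contains(key))
                    {
                        newRows.ImportRow(row);
                    }
                }

                using (var bulk = new SqlBulkCopy(connection))
                {
                    bulk.DestinationTableName = "dbo.Items";
                    bulk.WriteToServer(newRows);
                }
            }
        }
    }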
I created a C# program using SQL Server 2008 Express.
There is some data that must exist in the database table initially, for the C# application to run correctly.
I would like to transfer this data into the table the first time the C# application executes.
But I don't want this record data to be hard-coded in my C# code (I don't want to load this data from hard-coded values in the application).
How does SQL Server store its record data, and how can I transfer the initial data into the database?
I'd try not to rely on the database already existing for your application to work. You're coupling yourself to the database, which probably isn't a good idea. Do you think you could execute a stored procedure from your application which would fill out the tables you're relying on? That way your DB creation code wouldn't exist in your application. This would, however, still mean your application is dependent on the database to function correctly.
I would recommend one of two plans, the latter being the cleaner.
Plan 1
Create a SQL script which will create and insert all the required data for the application to work and place this in a stored procedure
When your application starts for the first time, it will check a configuration file to see whether the program has run before; if not, execute the stored procedure to create the required data
Alter an external configuration file (which could be password protected) to indicate whether the stored procedure has already been run; this could just be a simple bool
Each subsequent time the application runs, it will check whether it has run before and won't execute the stored proc if it has (a rough sketch of this check is below)
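A minimal sketch of the first-run check in Plan 1, assuming a hypothetical appSettings flag named SeedDataInstalled and a hypothetical seed stored procedure named usp_SeedInitialData:

    // Sketch: run the seeding stored procedure only once, tracked by a config flag.
    // The setting name, stored procedure name, and connection string are hypothetical.
    // Note: writing back to the app config requires write access to its location.
    using System.Configuration;
    using System.Data;
    using System.Data.SqlClient;

    public static class FirstRunSeeder
    {
        public static void EnsureSeedData(string connectionString)
        {
            Configuration config =
                ConfigurationManager.OpenExeConfiguration(ConfigurationUserLevel.None);

            KeyValueConfigurationElement flag = config.AppSettings.Settings["SeedDataInstalled"];
            if (flag != null && flag.Value == "true")
            {
                return; // already seeded on a previous run
            }

            using (var connection = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand("dbo.usp_SeedInitialData", connection))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                connection.Open();
                cmd.ExecuteNonQuery();
            }

            // Remember that the seed data has been applied.
            if (flag == null)
            {
                config.AppSettings.Settings.Add("SeedDataInstalled", "true");
            }
            else
            {
                flag.Value = "true";
            }
            config.Save(ConfigurationSaveMode.Modified);
        }
    }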
Plan 2
Create a SQL script which will create and insert all the required data for the application to work and place this in a stored procedure
Use an installer to deploy your application with a custom action to create the database, explained here
You definitely need to have the SQL insert/create script at least partially hard-coded somewhere. Either in a SQL script sitting in your app's directory (not recommended, because the user might tamper with it) or in your application's resources (a bit more secure, as the user would need to hex-edit the EXE or DLL).
Have you tried this? http://social.msdn.microsoft.com/Forums/en/adodotnetdataproviders/thread/43e8bc3a-1132-453b-b950-09427e970f31
I think your question is quite simple: you want to fill the tables before you run the program.
You can simply open SQL Server Management Studio and connect with your instance name, then expand the Databases list and find the database you created. Expand it, find the table you want to add data to in the Tables tree, right-click it, and choose Edit.
Redgate also has a product for filling tables, called Data Generator, which you can use to fill your tables with a lot of data.
I realize you can normally use the Upsizing Wizard in Access to convert this, but since this is a server-side process where we receive the .mdb files from a third party on a daily basis, I have to be able to ingest them with a no-touch architecture.
Currently, I'm about to set out to write it all by hand (ugh), reading the Access database through a data source and pushing it up into SQL Server through bulk inserts or Entity Framework. I really wish there were a better way to do this, though. I'm willing to entertain lots of creative methods, as there are a LOT of tables and a TON of data.
There are a number of methods that come to mind, which do all indeed involve custom programming, but should be relatively simple and straightforward to implement.
From another Access DB, open the source DB programmatically (i.e., with VBA). Create linked tables to the SQL Server backend in the source DB. Copy the data from the source DB to the linked tables (using INSERT INTO dest SELECT * FROM source).
Use OPENDATASOURCE or OPENROWSET from SQL Server to connect directly to the Access DB and copy the data. Again you can use INSERT INTO dest SELECT * FROM source to copy the data, or SELECT * INTO dest FROM source to create a new table from the source data. This involves tweaking some system settings on SQL Server, since ad hoc distributed queries are not enabled by default, but a few Google searches should get you started.
From a .NET program, use SqlBulkCopy (the .NET class for automating bcp) to upload data from the Access database. Just work with the data directly through ADO.NET; there's no reason to build an entire EF layer just for migrating data from one source to another. (A sketch of this approach is below.)
I have used variations of all three methods above in various projects, but for moving a large number of tables, I have found option #2 to be relatively efficient. It will involve some dynamic SQL code if your table names are dynamic on a daily basis, but if they are static, you should only have to write the logic once and use a parameter for the filename to read from.
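A minimal sketch of option #3, reading a table from the Access file with OleDb and streaming it into SQL Server with SqlBulkCopy. It assumes the Microsoft ACE (or Jet) OLE DB provider is installed on the machine, and the table name and connection strings are illustrative only:

    // Sketch: copy one Access table into a SQL Server table of the same name.
    // Requires the Microsoft ACE (or Jet) OLE DB provider on the machine running this.
    using System.Data.OleDb;
    using System.Data.SqlClient;

    public static class AccessToSqlCopier
    {
        public static void CopyTable(string mdbPath, string tableName, string sqlConnectionString)
        {
            string accessConnectionString =
                "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + mdbPath;

            using (var source = new OleDbConnection(accessConnectionString))
            using (var destination = new SqlConnection(sqlConnectionString))
            {
                source.Open();
                destination.Open();

                using (var cmd = new OleDbCommand("SELECT * FROM [" + tableName + "]", source))
                using (var reader = cmd.ExecuteReader())
                using (var bulk = new SqlBulkCopy(destination))
                {
                    bulk.DestinationTableName = "[" + tableName + "]";
                    bulk.BulkCopyTimeout = 0;        // no timeout for large tables
                    bulk.WriteToServer(reader);      // streams rows instead of loading them all in memory
                }
            }
        }
    }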
I have about 20 .csv files which are around 100-200mb each.
They each have about 100 columns.
90% of the columns of each file are the same; however, some files have more columns and some files have less columns.
I need to import all of these files into one table in a SQL Server 2008 database.
If the field does not exist, I need it to be created.
Question: What should be the process for this import? How do I import all of these files into one table in a database as efficiently and quickly as possible, and make sure that if a field does not exist, it is created? Please also keep in mind that the same field might be in a different position. For example, CAR can be in column AB in one CSV, whereas the same field name (CAR) can be in column AC in another CSV file. The solution can be SQL or C# or both.
You may choose from a number of options:
1. Use the DTS package.
2. Try to produce one uniform CSV file, get the DB table in sync with its columns, and bulk insert it (a rough sketch of this is below).
3. Bulk insert every file into its own table, and after that merge those tables into the target table.
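A minimal sketch of option 2, merging several CSV files into one uniform file whose header is the union of all columns. It assumes simple comma-separated files without quoted commas; file paths and the output name are illustrative only:

    // Sketch: build one uniform CSV whose header is the union of all input headers.
    // Assumes naive CSV (no quoted fields containing commas).
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    public static class CsvMerger
    {
        public static void Merge(IList<string> inputFiles, string outputFile)
        {
            // First pass: collect the union of all column names.
            var allColumns = new List<string>();
            foreach (string file in inputFiles)
            {
                foreach (string column in File.ReadLines(file).First().Split(','))
                {
                    if (!allColumns.Contains(column))
                    {
                        allColumns.Add(column);
                    }
                }
            }

            using (var writer = new StreamWriter(outputFile))
            {
                writer.WriteLine(string.Join(",", allColumns.ToArray()));

                // Second pass: rewrite every row with its values placed under the unified header.
                foreach (string file in inputFiles)
                {
                    string[] header = File.ReadLines(file).First().Split(',');
                    foreach (string line in File.ReadLines(file).Skip(1))
                    {
                        string[] values = line.Split(',');
                        var row = new string[allColumns.Count];
                        for (int i = 0; i < header.Length && i < values.Length; i++)
                        {
                            row[allColumns.IndexOf(header[i])] = values[i];
                        }
                        writer.WriteLine(string.Join(",", row));
                    }
                }
            }
        }
    }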
I would recommend looking at the BCP program which comes with SQL Server and is intended to help with jobs just like this:
http://msdn.microsoft.com/en-us/library/aa337544.aspx
There are "format files" which allow you to specify which CSV columns go to which SQL columns.
If you are more inclined to use C#, have a look at the SqlBulkCopy class:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx
Take a look at this SO thread as well, which is also about importing from CSV files into SQL Server:
SQL Bulk import from CSV
I recommend writing a small C# application that reads each of the CSV file headers, stores a dictionary of the columns needed, and either outputs a 'create table' statement or directly runs a create table operation on the database. Then you can use SQL Server Management Studio to load the 20 files individually using the import routine.
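A minimal sketch of the header-scanning step described above: it collects the union of column names from every CSV header and emits a CREATE TABLE statement with one NVARCHAR column per field. The table name and the blanket NVARCHAR(255) typing are assumptions for illustration:

    // Sketch: scan every CSV header and emit a CREATE TABLE covering all columns.
    // The target table name and NVARCHAR(255) typing are assumptions for illustration.
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text;

    public static class SchemaGenerator
    {
        public static string BuildCreateTable(IList<string> csvFiles, string tableName)
        {
            var columns = new List<string>();
            foreach (string file in csvFiles)
            {
                foreach (string column in File.ReadLines(file).First().Split(','))
                {
                    if (!columns.Contains(column))
                    {
                        columns.Add(column);
                    }
                }
            }

            var sql = new StringBuilder();
            sql.AppendLine("CREATE TABLE [" + tableName + "] (");
            sql.AppendLine(string.Join(",\r\n",
                columns.Select(c => "    [" + c + "] NVARCHAR(255) NULL").ToArray()));
            sql.AppendLine(");");
            return sql.ToString();
        }
    }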
Use the SqlBulkCopy class in System.Data.SqlClient.
It facilitates bulk data transfer. The only catch is that it won't work with a DateTime DB column.
Less of an answer and more of a direction, but here it goes. The way I would do it is to first enumerate the column names from both the CSV files and the DB, then make sure the ones from your CSVs all exist in the destination.
Once you have validated and/or created all the columns, you can do your bulk insert. Assuming you don't have multiple imports happening at the same time, you could cache the column names from the DB when you start the import, as they shouldn't be changing.
If you will have multiple imports running at the same time, then you will need to make sure you have a full table lock during the import, as race conditions could show up.
I do a lot of automated imports for SQL DBs, and I haven't ever seen what you're asking about, as it's an assumed requirement that one knows the data coming into the DB. Not knowing the columns ahead of time is typically a very bad thing, but it sounds like you have an exception to the rule.
Roll your own.
Keep (or create) a runtime representation of the target table's columns in the database. Before importing each file, check to see if the column exists already. If it doesn't, run the appropriate ALTER statement. Then import the file.
The actual import process can and probably should be done by BCP or whatever Bulk protocol you have available. You will have to do some fancy kajiggering since the source data and destination align only logically, not physically. So you will need BCP format files.
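A minimal sketch of the "check, ALTER, then import" idea above, using SqlBulkCopy column mappings by name (rather than BCP format files) so that the physical column order in each file doesn't matter. The target table dbo.Combined and the NVARCHAR(255) typing are assumptions for illustration:

    // Sketch: add any missing columns to the target table, then bulk insert,
    // mapping columns by name so source position is irrelevant.
    // Table name and column type are hypothetical; validate column names in real code.
    using System.Data;
    using System.Data.SqlClient;

    public static class FlexibleImporter
    {
        public static void Import(DataTable csvData, string connectionString)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();

                foreach (DataColumn column in csvData.Columns)
                {
                    // Create the column if the target table doesn't have it yet.
                    const string checkSql =
                        "IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS " +
                        "WHERE TABLE_NAME = 'Combined' AND COLUMN_NAME = @name) " +
                        "EXEC('ALTER TABLE dbo.Combined ADD [' + @name + '] NVARCHAR(255) NULL')";

                    using (var cmd = new SqlCommand(checkSql, connection))
                    {
                        cmd.Parameters.AddWithValue("@name", column.ColumnName);
                        cmd.ExecuteNonQuery();
                    }
                }

                using (var bulk = new SqlBulkCopy(connection))
                {
                    bulk.DestinationTableName = "dbo.Combined";
                    foreach (DataColumn column in csvData.Columns)
                    {
                        // Map by name, not ordinal, because the same field may sit at a
                        // different position in each file.
                        bulk.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                    }
                    bulk.WriteToServer(csvData);
                }
            }
        }
    }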
There are several possibilities that you have here.
You can use SSIS if it is available to you.
In SQL Server you can use SqlBulkCopy to bulk insert into a staging table, into which you load the whole .csv file, and then use a stored procedure (possibly with a MERGE statement in it) to place each row where it belongs or create a new one if it doesn't exist. (A sketch of this approach is below.)
You can use C# code to read the files and write them using SqlBulkCopy or EntityDataReader.
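A minimal sketch of the staging-plus-MERGE option, assuming hypothetical tables dbo.Staging and dbo.Target that share a natural key column Code and a Value column:

    // Sketch: bulk copy into a staging table, then MERGE into the target.
    // Table names, the key column (Code), and the value column are hypothetical.
    using System.Data;
    using System.Data.SqlClient;

    public static class StagingMergeImporter
    {
        private const string MergeSql = @"
            MERGE dbo.Target AS t
            USING dbo.Staging AS s
                ON t.Code = s.Code
            WHEN MATCHED THEN
                UPDATE SET t.Value = s.Value
            WHEN NOT MATCHED BY TARGET THEN
                INSERT (Code, Value) VALUES (s.Code, s.Value);";

        public static void Import(DataTable csvRows, string connectionString)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();

                // Load the raw rows into staging first.
                using (var bulk = new SqlBulkCopy(connection))
                {
                    bulk.DestinationTableName = "dbo.Staging";
                    bulk.WriteToServer(csvRows);
                }

                // Update existing rows and insert new ones in a single statement.
                using (var merge = new SqlCommand(MergeSql, connection))
                {
                    merge.ExecuteNonQuery();
                }
            }
        }
    }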
For those data volumes, you should use an ETL tool. See this tutorial.
ETL tools are designed for manipulating large amounts of data.
Here I am facing a problem: I want to pass a DataSet to a SQL Server stored procedure, I don't have any idea how, and there is no alternative solution (I think). Let me tell you what I want ...
I have an Excel file to be read. I read it successfully, and all the data from this Excel workbook is imported into a DataSet. Now this data needs to be inserted into two different tables, and there are too many rows in the Excel workbook to process them one by one from code-behind. That's why I want to pass this DataSet to a stored procedure and then ........
Please suggest a solution.
Not knowing what database version you're working with, here are a few hints:
if you need to read the Excel file regularly, and split it up into two or more tables, maybe you need to use something like SQL Server Integration Services for this. With SSIS, you should be able to achieve this quite easily
you could load the Excel file into a temporary staging table, and then read the data from that staging table inside your stored procedure. This works, but it gets a bit messy when there's a chance that multiple concurrent calls need to be handled
if you're using SQL Server 2008 and up, you should look at table-valued parameters - you basically load the Excel file into a .NET DataSet and pass that to the stored proc as a special parameter. Works great, but wasn't available in SQL Server before the 2008 release (a rough sketch is below)
since you're using SQL Server 2005 and table-valued parameters aren't available, you might want to look at Erland Sommarskog's excellent article Arrays and Lists in SQL Server 2005 - depending on how big your data set is, one of his approaches might work for you (e.g. passing as XML which you parse/shred inside the stored proc)
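For the table-valued parameter hint (SQL Server 2008 and later only), a minimal sketch. The user-defined table type dbo.RowTableType, the stored procedure dbo.usp_ImportRows, and the DataTable layout are assumptions for illustration:

    // Sketch: pass a DataTable to a stored procedure as a table-valued parameter.
    // Requires SQL Server 2008+, a user-defined table type, and a stored procedure
    // that accepts it; all names here are hypothetical.
    using System.Data;
    using System.Data.SqlClient;

    public static class TvpSender
    {
        public static void Send(DataTable rows, string connectionString)
        {
            using (var connection = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand("dbo.usp_ImportRows", connection))
            {
                cmd.CommandType = CommandType.StoredProcedure;

                SqlParameter parameter = cmd.Parameters.AddWithValue("@Rows", rows);
                parameter.SqlDbType = SqlDbType.Structured;   // mark it as a TVP
                parameter.TypeName = "dbo.RowTableType";      // the user-defined table type

                connection.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }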