Transferring Excel data into SQL table with different columns - c#

I want users to transfer data from Excel files into SQL using a C# WinForms application, one file at a time. The Excel files have similar columns, but some files may contain new columns or lack existing ones. Row data will vary.
For example:
Excel file 1: Name, City, State
Excel file 2: Name, City, Zip
Excel file 3: Name, City, County
In my existing SQL table I have columns: Name, City, Population, Schools
How do I insert the new Excel files with similar column names into an existing SQL database?
My thought so far is to copy the new Excel file data into temporary tables, and then insert that into the existing SQL table. The problem is, I don't know how to write C# code (or a SQL query) that would insert new Excel data with more or fewer columns than the existing SQL table.

You need a NoSQL store for this purpose. If you need to add columns that are not already part of the table, SQL is not a good option. If you use C# to alter the table, be careful about the consequences. If you know all possible column names up front, try altering your table before you start inserting.

This shouldn't be too hard. Export your data from Excel to a staging table in SQL Server. The C# code may look something like this.
// Requires: using System; using System.Data.OleDb; using System.Data.SqlClient;
public void ImportDataFromExcel(string excelFilePath)
{
    // Declare variables - edit these based on your particular situation.
    string sqlTable = "tdatamigrationtable";
    // Make sure the sheet name is correct. Here it is Sheet1; change it if yours is different.
    string excelDataQuery = "select student,rollno,course from [Sheet1$]";
    try
    {
        // Create our connection strings.
        string excelConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excelFilePath +
            ";Extended Properties=\"Excel 8.0;HDR=YES;\"";
        string sqlConnectionString = "Server=mydatabaseservername;User ID=dbuserid;" +
            "Password=dbuserpassword;Database=databasename;Connection Reset=false";
        // Execute a query to erase any previous data from our destination table.
        string clearSql = "delete from " + sqlTable;
        using (SqlConnection sqlConn = new SqlConnection(sqlConnectionString))
        using (SqlCommand sqlCmd = new SqlCommand(clearSql, sqlConn))
        {
            sqlConn.Open();
            sqlCmd.ExecuteNonQuery();
        }
        // Series of commands to bulk copy data from the Excel file into our SQL table.
        using (OleDbConnection oleDbConn = new OleDbConnection(excelConnectionString))
        using (OleDbCommand oleDbCmd = new OleDbCommand(excelDataQuery, oleDbConn))
        {
            oleDbConn.Open();
            using (OleDbDataReader dr = oleDbCmd.ExecuteReader())
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnectionString))
            {
                bulkCopy.DestinationTableName = sqlTable;
                // WriteToServer consumes the whole reader; calling dr.Read() first would skip rows.
                bulkCopy.WriteToServer(dr);
            }
        }
    }
    catch (Exception ex)
    {
        // Handle exception.
    }
}
Then, in SQL Server, move the data from the staging table to your final production table. Your SQL may look something like this.
insert into production
select ...
from staging
where not exists
(
select 1 from production
where production.key = staging.key
)
That's my .02. I think you will have a lot more control over the whole process that way.
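One way to cope with files whose columns only partly match the table is to map just the columns that both sides share. The sketch below is a minimal illustration rather than part of the answer above: ColumnMatcher and SharedColumns are hypothetical names, and it assumes column names should be compared case-insensitively.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnMatcher
{
    // Returns the Excel column names that also exist in the destination table.
    // Names are compared case-insensitively; columns found only on one side are left out.
    public static List<string> SharedColumns(
        IEnumerable<string> excelColumns, IEnumerable<string> tableColumns)
    {
        var destination = new HashSet<string>(tableColumns, StringComparer.OrdinalIgnoreCase);
        return excelColumns.Where(c => destination.Contains(c)).ToList();
    }
}
```

Each returned name could then be registered with bulkCopy.ColumnMappings.Add(name, name) before calling WriteToServer, so that extra Excel columns are ignored. Table columns missing from the file keep their default or NULL value, which assumes those columns are nullable or have defaults.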

If you are reading this question then my answer to this other question may be helpful for you (just create a Class to handle each row information, process the excel file and perform a bulk insert like explained there):
Bulk insert is not working properly in Azure SQL Server
I hope it helps.

Related

Uploading Data to SQL from Excel (.CSV) file using C# windows form app

I am using this method to upload data to SQL.
private void button5_Click(object sender, EventArgs e)
{
string filepath = textBox2.Text;
string connectionString_i = string.Format(@"Provider=Microsoft.Jet.OleDb.4.0; Data Source={0};Extended Properties=""Text;HDR=YES;FMT=Delimited""", Path.GetDirectoryName(filepath));
using (OleDbConnection connection_i = new OleDbConnection(connectionString_i))
{
connection_i.Open();
OleDbCommand command = new OleDbCommand ("Select * FROM [" + Path.GetFileName(filepath) +"]", connection_i);
command.CommandTimeout = 180;
using (OleDbDataReader dr = command.ExecuteReader())
{
string sqlConnectionString = MyConString;
using (SqlBulkCopy bulkInsert = new SqlBulkCopy(sqlConnectionString))
{
bulkInsert.BulkCopyTimeout = 180;
bulkInsert.DestinationTableName = "Table_Name";
bulkInsert.WriteToServer(dr);
MessageBox.Show("Upload Successful!");
}
}
connection_i.Close();
}
}
I have an Excel sheet in .CSV format with about 1,048,313 entries. The bulk copy method only works for about 36,000 to 60,000 entries. Is there any way to select the first 30,000 entries from the file and upload them to a SQL Server table, then select the next chunk of 30,000 rows and upload those, and so on until the last entry has been stored?
Create a datatable to store the values from your csv file that needs to be inserted into your target table. Each column in the datatable would correspond to a data column in the csv file.
Create a custom table-valued type on SQL Server that matches your DataTable, including data types and lengths. (I am assuming SQL Server because this post is tagged sql-server and not access, although your sample connection string seems to contradict that.)
Using a text reader and a counter variable, populate your DataTable with 30,000 records.
Pass the DataTable to your insert query or stored procedure. The parameter type is SqlDbType.Structured.
In the event that the job fails and you need to restart, the first step could be to determine the last inserted value from a predefined key in your table. You could also use a left outer join as part of your insert query to insert only records that do not already exist in the table. These are just a few of the more common techniques for restarting a failed ETL job.
This technique has some tactical advantages over the bulk copy as it adds flexibility and is less coupled to the target table, thus changes to the table could be less volatile, depending on the nature of the change.
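The chunking step described above can be sketched with a small helper that groups any sequence into fixed-size batches. This is only an illustration: Batching and InBatches are hypothetical names, and the batch size of 30,000 from the question would be passed in by the caller.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Lazily splits a sequence into lists of at most batchSize items.
    // The last batch may be smaller; an empty source yields no batches.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0) yield return batch;
    }
}
```

Feeding it File.ReadLines(path) keeps only one batch of CSV lines in memory at a time. Each batch could then be copied into a DataTable and sent to the server, either as a table-valued parameter or through SqlBulkCopy.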

Importing Excel files with a large number of columns header into mysql with c#

I was just wondering how to import large Excel files into MySQL with C#. My coding experience isn't great, and I was hoping someone could give me a rough idea of where to start. So far, I have been able to load Excel files into a DataGridView with the following code:
string PathConn = " Provider=Microsoft.JET.OLEDB.4.0;Data Source=" + pathTextBox.Text + ";Extended Properties =\"Excel 8.0;HDR=Yes;\";";
OleDbConnection conn = new OleDbConnection(PathConn);
conn.Open();
OleDbDataAdapter myDataAdapter = new OleDbDataAdapter("Select * from [" + loadTextBox.Text + "$]", conn);
table = new DataTable();
myDataAdapter.Fill(table);
But after that, I don't know how to extract the information and save it into a MySQL database. Assuming I have created an empty schema beforehand, how do I upload Excel files into MySQL? Thanks.
I think you would then need to loop over the rows in the DataTable and do something with them (maybe an INSERT statement against your DB),
like so:
foreach(DataRow dr in table.Rows)
{
string s = dr[0].ToString(); // this is the first column in the DataTable, as columns are zero-indexed
}
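As a rough sketch of the "insert statement" idea, the helper below builds one parameterized INSERT from the DataTable's column names. InsertBuilder, BuildInsert, and the @p0, @p1 parameter names are hypothetical, and it assumes the MySQL table uses the same column names as the spreadsheet header.

```csharp
using System;
using System.Data;
using System.Linq;

public static class InsertBuilder
{
    // Builds "INSERT INTO `table` (`a`, `b`) VALUES (@p0, @p1)" from the table's columns.
    // Identifiers are wrapped in backticks (MySQL quoting); values stay as parameters.
    public static string BuildInsert(string tableName, DataTable table)
    {
        var names = table.Columns.Cast<DataColumn>().Select(c => c.ColumnName).ToList();
        string columns = string.Join(", ", names.Select(n => "`" + n.Replace("`", "``") + "`"));
        string parameters = string.Join(", ", names.Select((n, i) => "@p" + i));
        return "INSERT INTO `" + tableName.Replace("`", "``") + "` (" + columns + ") VALUES (" + parameters + ")";
    }
}
```

Inside the foreach loop, the statement would be executed once per row with dr[i] bound to parameter @p followed by i, using whichever MySQL ADO.NET provider the project references. Keeping the values as parameters avoids building SQL from cell contents.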
This is what I do in data migration scenarios from one SQL Server to another, or from data files to SQL:
Create the new table on the destination SQL Server (column names, primary key, etc.).
Load the existing data into a DataTable (that's what you did already).
Query the new table with a DataAdapter into another DataTable (the same as you did with the Excel file, except you now query the SQL table).
Load the old data from 'table' into 'newTable' using the DataTable method Load().
string PathConn = (MYSQL Connection String goes here)
OleDbConnection conn = new OleDbConnection(PathConn);
conn.Open();
OleDbDataAdapter myDataAdapter = new OleDbDataAdapter("Select * from [" + loadTextBox.Text + "$]", conn);
newTable = new DataTable();
myDataAdapter.Fill(newTable);
Now use the Load() Method on the new table:
newTable.Load(table.CreateDataReader(), <Specify LoadOption here>)
Matching columns will be imported into the new DataTable. (You can ensure the mapping through using Aliases in the select statements)
After Loading the existing Data into the new Table you will be able to use an DataAdapter to write the changes back to database.
Example for writing data back: ConnString - connection String for DB,
SelectStmt (can use the same as you did on the empty Table before) and provide the newTable as dtToWrite
public static void writeDataTableToServer(string ConnString, string selectStmt, DataTable dtToWrite)
{
using (OdbcConnection odbcConn = new OdbcConnection(ConnString))
{
odbcConn.Open();
using (OdbcTransaction trans = odbcConn.BeginTransaction())
{
using (OdbcDataAdapter daTmp = new OdbcDataAdapter(selectStmt, ConnString))
{
using (OdbcCommandBuilder cb = new OdbcCommandBuilder(daTmp))
{
try
{
cb.ConflictOption = ConflictOption.OverwriteChanges;
daTmp.UpdateBatchSize = 5000;
daTmp.SelectCommand.Transaction = trans;
daTmp.SelectCommand.CommandTimeout = 120;
daTmp.InsertCommand = cb.GetInsertCommand();
daTmp.InsertCommand.Transaction = trans;
daTmp.InsertCommand.CommandTimeout = 120;
daTmp.UpdateCommand = cb.GetUpdateCommand();
daTmp.UpdateCommand.Transaction = trans;
daTmp.UpdateCommand.CommandTimeout = 120;
daTmp.DeleteCommand = cb.GetDeleteCommand();
daTmp.DeleteCommand.Transaction = trans;
daTmp.DeleteCommand.CommandTimeout = 120;
daTmp.Update(dtToWrite);
trans.Commit();
}
catch (OdbcException ex)
{
trans.Rollback();
throw; // rethrow without resetting the stack trace
}
}
}
}
odbcConn.Close();
}
}
Hope this helps.
Primary Key on the newTable is necessary, otherwise you might get a CommandBuilder exception.
BR
Therak
You're halfway there. You have obtained the information from the Excel spreadsheet and have it stored in a DataTable.
The first thing you need to do before importing a significant amount of data into SQL is to validate what you have read from the spreadsheets.
You have a few options. One is to do something very similar to how you read in your data: use a SQLAdapter to perform an INSERT into the SQL database. All you really need to do in this case is create a new connection and write the INSERT command.
There are many examples of doing this on here.
Another option, which I would use, is LINQ to CSV (http://linqtocsv.codeplex.com/).
With this you can load all of your data into class objects, which makes it easier to validate each object before you perform your INSERT into SQL.
If you have limited experience, then use the SQLAdapter to connect to your DB.
Good luck.

How can duplicate entries be prevented when using SqlBulkCopy to import a large amount of data?

I want to know how to prevent duplicate entries in a database table when the table already has a record for that field.
In my table, the Website column is unique. The Excel file I upload may contain the same record with new data, or it may be a complete duplicate. Based on the Website column, I want to skip the duplicate entry, move on to the next record, and so on.
I hope it's clear. Here is my code:
protected void btnSend_Click(object sender, EventArgs e)
{
//file upload path
string path = fileuploadExcel.PostedFile.FileName;
//Create connection string to Excel work book
string excelConnectionString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source='C:\File.xlsx';Extended Properties=Excel 12.0;Persist Security Info=False";
//Create Connection to Excel work book
OleDbConnection excelConnection = new OleDbConnection(excelConnectionString);
//Create OleDbCommand to fetch data from Excel
OleDbCommand cmd = new OleDbCommand("Select * from [Sheet1$]", excelConnection);
excelConnection.Open();
OleDbDataReader dReader;
DataTable table = new DataTable();
dReader = cmd.ExecuteReader();
table.Load(dReader);
SqlBulkCopy sqlBulk = new SqlBulkCopy(strConnection);
//Give your Destination table name
sqlBulk.DestinationTableName = "TableName";
sqlBulk.WriteToServer(table);
excelConnection.Close();
int numberOfRowsInserted = table.Rows.Count;// <-- this is what was written.
string message = string.Format("<script>alert({0});</script>", numberOfRowsInserted);
ScriptManager.RegisterStartupScript(this, this.GetType(), "scr", message, false);
}
How about modifying the query you pass to OleDbCommand to select only the values of Website you need?
If the entire row is duplicate - you can use distinct. See How to select unique records by SQL for an example.
If only this column repeats and other columns are not relevant, then distinct may not work (it depends on the DB) and you will have to use GROUP BY and select the first row of each group.
One thing you can do is to load it into a temporary table first that has no restrictions. Then you can remove all records that do not match your business requirements (such as duplicate keys) and log what records you removed and why (optional, but can be useful). Finally, you can insert/merge the temp table into the final table.
Alternatively, you can load everything into your temporary table and put the business logic in the insert/merge statement, only inserting the valid records that way.
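If the staging-table route is not available, a similar filter can be applied in memory before calling WriteToServer. The sketch below is an assumption-laden illustration: Deduplicator and KeepFirstPerKey are hypothetical names, keys are compared case-insensitively, and the caller is assumed to have already read the existing Website values from the target table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class Deduplicator
{
    // Returns a copy of source containing only the first row for each key value
    // that is neither blank nor already present in existingKeys.
    public static DataTable KeepFirstPerKey(
        DataTable source, string keyColumn, IEnumerable<string> existingKeys)
    {
        var seen = new HashSet<string>(existingKeys, StringComparer.OrdinalIgnoreCase);
        DataTable result = source.Clone(); // same schema, no rows
        foreach (DataRow row in source.Rows)
        {
            string key = Convert.ToString(row[keyColumn]).Trim();
            // HashSet.Add returns false when the key was seen before, so duplicates are skipped.
            if (key.Length > 0 && seen.Add(key))
                result.ImportRow(row);
        }
        return result;
    }
}
```

The filtered table can then be passed to sqlBulk.WriteToServer in place of the original. Another process could still insert the same Website between the check and the bulk copy, so the unique constraint on the column remains the real safeguard.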

how to update datatable in c# with code?

I want to move data from one database to another.
I wrote two functions. Function 1 fills a DataTable, named Dt, from a table in database1.
Function 2 fills a DataTable, named dtnull, from the table in database2 and imports the rows of Dt into it.
Then I update dtnull in database2.
Function 2:
{
SqlDataAdapter sda = new SqlDataAdapter();
sda.SelectCommand = new SqlCommand();
sda.SelectCommand.Connection = objconn;
sda.SelectCommand.CommandText = "Select * from " + TableName + "";
DataTable dtnull = new DataTable();
sda.Fill(dtnull);
SqlCommandBuilder Builder = new SqlCommandBuilder();
Builder.DataAdapter = sda;
Builder.ConflictOption = ConflictOption.OverwriteChanges;
string insertCommandSql = Builder.GetInsertCommand(true).CommandText;
foreach (DataRow Row in Dt.Rows)
{
dtnull.ImportRow(Row);
}
sda.Fill(dtnull);
sda.Update(dtnull);
}
If you need to copy SQL database then just back it up and restore. Alternatively use DTS services.
If it's just a few tables I think you can
right click on the table you want in the SQL Management studio
generate a create script to your clipboard
execute it
Go back to your original table and select all the rows
copy them
go to your new table and paste
No need to make this harder than it is.
You don't really need to use an update for this. You might try out this solution, it might be the easiest way for you do this.
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx
If you would like a LINQ solution, I could provide you with one.
There is a lot that is left unexplained. For example, do the source table and target table have the same column structure?
Can you see both databases from the same SqlConnection (i.e. are they on the same machine)? If so, you can do it all in one SQL statement. Assuming you want to copy the data from table T1 in database DB1 to table T2 in database DB2, you would write
insert DB2.dbo.T2 select * from DB1.dbo.T1
Execute it using ExecuteNonQuery.
If the databases require different SqlConnections, I would read the data from the source using a SqlDataReader and update the target row by row. I think it would be faster than using a SqlDataAdapter and DataTable since they require more structure and memory. The Update command writes the data row by row in any event.

Using temporary table in c#

I read an Excel sheet into a data grid. From there, I have managed to read the grid's rows into a DataTable object. The DataTable object has data, because when I set a grid's DataSource to that table object, the grid is populated.
My problem: I want to take the table object and manipulate its values using SQL Server. That is, I want to store it as a temporary table, manipulate it using SQL queries from within C# code, and have it return a different result into a grid. (I don't know how to work with temporary tables in C#.)
Here's code to execute when clicking button....
SqlConnection conn = new SqlConnection("server = localhost;integrated security = SSPI");
//is connection string incorrect?
SqlCommand cmd = new SqlCommand();
//!!The method ConvertFPSheetDataTable Returns a DataTable object//
cmd.Parameters.AddWithValue("#table",ConvertFPSheetDataTable(12,false,fpSpread2_Sheet1));
//I am trying to create temporary table
//Here , I do a query
cmd.CommandText = "Select col1,col2,SUM(col7) From #table group by col1,col2 Drop #table";
SqlDataAdapter da = new SqlDataAdapter(cmd.CommandText,conn);
DataTable dt = new DataTable();
da.Fill(dt); ***// I get an error here 'Invalid object name '#table'.'***
fpDataSet_Sheet1.DataSource = dt;
//**NOTE:** fpDataSet_Sheet1 is the grid control
Change your temp table from #table to ##table in both places.
Using ## means a global temp table that stays around. You'll need to Drop it after you have completed your task.
Command = " Drop Table ##table"
Putting the data into a database will take time - since you already have it in memory, perhaps LINQ-to-Objects (with DataSetExtensions) is your friend? Replace <int> etc with the correct types...
var query = from row in table.Rows.Cast<DataRow>()
group row by new
{
Col1 = row.Field<int>(1),
Col2 = row.Field<int>(2)
} into grp
select new
{
Col1 = grp.Key.Col1,
Col2 = grp.Key.Col2,
SumCol7 = grp.Sum(x => x.Field<int>(7))
};
foreach (var item in query)
{
Console.WriteLine("{0},{1}: {2}",
item.Col1, item.Col2, item.SumCol7);
}
I don't think you can make a temp table in SQL the way you are thinking, since it only exists within the scope of the query/stored procedure that creates it.
If the spreadsheet is a standard format - meaning you know the columns and they are always the same, you would want to create a Table in SQL to put this file into. There is a very fast way to do this called SqlBulkCopy
// Load the reports in bulk
SqlBulkCopy bulkCopy = new SqlBulkCopy(connectionString);
// Map the columns
foreach(DataColumn col in dataTable.Columns)
bulkCopy.ColumnMappings.Add(col.ColumnName, col.ColumnName);
bulkCopy.DestinationTableName = "SQLTempTable";
bulkCopy.WriteToServer(dataTable);
But, if I'm understanding your problem correctly, you don't need to use SQL Server to modify the data in the DataTable. You can use the JET engine to grab the data for you.
// For CSV
connStr = string.Format("Provider=Microsoft.JET.OLEDB.4.0;Data Source={0};Extended Properties='Text;HDR=Yes;FMT=Delimited;IMEX=1'", Folder);
cmdStr = string.Format("SELECT * FROM [{0}]", FileName);
// For XLS
connStr = string.Format("Provider=Microsoft.JET.OLEDB.4.0;Data Source={0}{1};Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'", Folder, FileName);
cmdStr = "select * from [Sheet1$]";
OleDbConnection oConn = new OleDbConnection(connStr);
OleDbCommand cmd = new OleDbCommand(cmdStr, oConn);
OleDbDataAdapter da = new OleDbDataAdapter(cmd);
oConn.Open();
da.Fill(dataTable);
oConn.Close();
Also, in your code you ask if your connection string is correct. I don't think it is (but I could be wrong). If yours isn't working try this.
connectionString="Data Source=localhost\<instance>;database=<yourDataBase>;Integrated Security=SSPI" providerName="System.Data.SqlClient"
Pardon me, if I have not understood what you exactly want.
If you want to perform SQL query on excel sheet, you could do it directly.
Alternatively, you can use SQL Server to query Excel (OPENROWSET, or a function which I don't remember right away). Using this, you can join a SQL Server table with an Excel sheet.
Marc's suggestion is one more way to look at it.
Perhaps you could use a DataView. You create that from a DataTable, which you already have.
dv = new DataView(dataTableName);
Then, you can filter (apply a SQL WHERE clause) or sort the data using the DataView's methods. You can also use Find to find a matching row, or FindRows to find all matching rows.
Some filters:
dv.RowFilter = "Country = 'USA'";
dv.RowFilter = "EmployeeID >5 AND Birthdate < #1/31/82#";
dv.RowFilter = "Description LIKE '*product*'";
dv.RowFilter = "employeeID IN (2,4,5)";
Sorting:
dv.Sort = "City"
Finding a row: Find the customer named "John Smith".
vals(0)= "John"
vals(1) = "Smith"
i = dv.Find(vals)
where i is the index of the row containing the customer.
Once you've applied these to the DataView, you can bind your grid to the DataView.
Change the command text from
Select col1,col2,SUM(col7) From #table group by col1,col2
to
Select col1,col2,SUM(col7) From ##table group by col1,col2
