My requirement is that I have 2 lakh (200,000) rows of data in an Excel/CSV file, with an email ID in each row. I have to import that data in one shot, i.e. bulk copy it to SQL Server, while verifying one value (the email) at a time.
You could also use Microsoft Visual Studio's Business Intelligence tools. By creating an SSIS (SQL Server Integration Services) project you can use various drag-and-drop tools to create "packages" (or jobs, if you wish) which you can execute to perform jobs like these.
You can import data from a wide variety of data sources including Excel, CSV, MySQL, SQL Server and Hadoop to name a few.
You can also write that data from those sources to not only SQL Server but a wide variety of other data destinations as well.
I am using Visual Studio 2015 with the Business Intelligence packages installed.
What I would recommend is:
Start Visual Studio and open a new SSIS (SQL Server Integration Services) project.
Under the Control Flow tab, add a new Data Flow Task to the control flow area.
Double click on the control flow item or navigate to the Data Flow tab.
Make sure your data flow item is selected. (It should be if you double clicked it.)
From there you can use the Source and Destination Assistants to transport your data.
Once you are done setting up your source, destination, data transformations and checks, you can hit Start and it will execute the package.
P.S: You can also use the script component in the data flow tab to write custom C# script if you want to.
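For example, since the original question mentions checking an email per row, a row-level check in a Script Component (transformation) might look roughly like the sketch below. The column names Email and IsValidEmail are assumptions about the input/output columns you would configure on the component, not something taken from the question.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // "Email" is an assumed input column; "IsValidEmail" an assumed boolean output column.
    // A simple format check only - not a full RFC-compliant validation.
    Row.IsValidEmail = !Row.Email_IsNull &&
        System.Text.RegularExpressions.Regex.IsMatch(
            Row.Email, @"^[^@\s]+@[^@\s]+\.[^@\s]+$");
}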
If we had an example of the schema (table structure) you are transporting from and to, it would have been easier to provide an example.
Best of luck
I have specified the connection strings for the Excel files of both 2003 and 2007 or higher formats in the Web.Config file.
<add name = "Excel03ConString" connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties='Excel 8.0;HDR=YES'"/>
<add name = "Excel07+ConString" connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 8.0;HDR=YES'"/>
You will need to import the following namespaces.
using System.IO;
using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;
using System.Configuration;
Add the following code:
//Upload and save the file
string excelPath = Server.MapPath("~/Files/") + Path.GetFileName(FileUpload1.PostedFile.FileName);
FileUpload1.SaveAs(excelPath);
string conString = string.Empty;
string extension = Path.GetExtension(FileUpload1.PostedFile.FileName);
switch (extension)
{
case ".xls": //Excel 97-03
conString = ConfigurationManager.ConnectionStrings["Excel03ConString"].ConnectionString;
break;
case ".xlsx": //Excel 07 or higher
conString = ConfigurationManager.ConnectionStrings["Excel07+ConString"].ConnectionString;
break;
}
conString = string.Format(conString, excelPath);
using (OleDbConnection excel_con = new OleDbConnection(conString))
{
excel_con.Open();
string sheet1 = excel_con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null).Rows[0]["TABLE_NAME"].ToString();
DataTable dtExcelData = new DataTable();
//[OPTIONAL]: Define the columns and their data types; otherwise all the data will be treated as String by default.
dtExcelData.Columns.AddRange(new DataColumn[3] { new DataColumn("Id", typeof(int)),
new DataColumn("Name", typeof(string)),
new DataColumn("Salary",typeof(decimal)) });
using (OleDbDataAdapter oda = new OleDbDataAdapter("SELECT * FROM [" + sheet1 + "]", excel_con))
{
oda.Fill(dtExcelData);
}
excel_con.Close();
string consString = ConfigurationManager.ConnectionStrings["constr"].ConnectionString;
using (SqlConnection con = new SqlConnection(consString))
{
using (SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(con))
{
//Set the database table name
sqlBulkCopy.DestinationTableName = "dbo.tblPersons";
//[OPTIONAL]: Map the Excel columns with that of the database table
sqlBulkCopy.ColumnMappings.Add("Id", "PersonId");
sqlBulkCopy.ColumnMappings.Add("Name", "Name");
sqlBulkCopy.ColumnMappings.Add("Salary", "Salary");
con.Open();
sqlBulkCopy.WriteToServer(dtExcelData);
con.Close();
}
}
}
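Coming back to the original requirement of verifying each email before the bulk copy: one option is to filter the DataTable in memory and only send the valid rows to SqlBulkCopy. This is a rough sketch, assuming your sheet has an Email column (the sample above uses Id/Name/Salary, so adapt the column names) and using a simple format check rather than real validation:
// Hypothetical: keep only rows whose "Email" column looks like an email address
// before calling sqlBulkCopy.WriteToServer.
var emailPattern = new System.Text.RegularExpressions.Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$");
DataTable validRows = dtExcelData.Clone();   // same columns, no rows
foreach (DataRow row in dtExcelData.Rows)
{
    string email = row["Email"] as string;   // assumed column name
    if (!string.IsNullOrWhiteSpace(email) && emailPattern.IsMatch(email.Trim()))
    {
        validRows.ImportRow(row);            // copy the valid row into the filtered table
    }
}
// then pass validRows instead of dtExcelData:
// sqlBulkCopy.WriteToServer(validRows);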
Problem Statement
I am trying to completely automate (via parametrization) my SSIS package. It currently uses a data flow task that reads a .csv file and inserts its contents into a SQL Server table, and I need to achieve the same thing without using the data flow task.
New setup
I have replaced the data flow task with a script task that does the same thing.
The .csv file is loaded into the DataTable object and then inserted into the destination table using SqlBulkCopy class and SqlConnection instance.
// Namespaces required at the top of ScriptMain.cs:
//   using System;
//   using System.Data;
//   using System.Data.SqlClient;
//   using System.IO;
//   using System.Linq;
//   using Microsoft.VisualBasic.FileIO;   // TextFieldParser (add a reference to Microsoft.VisualBasic)
public void Main()
{
var atlas_source_application = (string)Dts.Variables["$Project::Atlas_SourceApplication"].Value;
var ssis_package_name = (string)Dts.Variables["System::PackageName"].Value;
var csv_path = (string)Dts.Variables["$Project::SVM_Directory"].Value;
var atlas_server_name = (string)Dts.Variables["$Project::AtlasProxy_ServerName"].Value;
var atlas_init_catalog_name = (string)Dts.Variables["$Project::AtlasProxy_InitialCatalog"].Value;
var connname = @"Data Source=" + atlas_server_name + ";Initial Catalog=" + atlas_init_catalog_name + ";Integrated Security=SSPI;";
var csv_file_path = csv_path + "\\" + ssis_package_name + ".csv";
try
{
DataTable csvData = new DataTable();
// Part I - Read
string contents = File.ReadAllText(csv_file_path, System.Text.Encoding.GetEncoding(1252));
TextFieldParser parser = new TextFieldParser(new StringReader(contents));
parser.HasFieldsEnclosedInQuotes = true;
parser.SetDelimiters(",");
string[] fields;
while (!parser.EndOfData)
{
fields = parser.ReadFields();
if (csvData.Columns.Count == 0)
{
foreach (string field in fields)
{
csvData.Columns.Add(new DataColumn(string.IsNullOrWhiteSpace(field.Trim('\"')) ? null : field.Trim('\"'), typeof(string)));
}
}
else
{
csvData.Rows.Add(fields.Select(item => string.IsNullOrWhiteSpace(item.Trim('\"')) ? null : item.Trim('\"')).ToArray());
}
}
parser.Close();
// Part II - Insert
using (SqlConnection dbConnection = new SqlConnection(connname))
{
dbConnection.Open();
using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
{
s.DestinationTableName = "[" + atlas_source_application + "].[" + ssis_package_name + "]";
foreach (var column in csvData.Columns)
{
s.ColumnMappings.Add(column.ToString(), column.ToString());
}
s.WriteToServer(csvData);
}
}
Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
Dts.Events.FireError(0, "Something went wrong ", ex.ToString(), string.Empty, 0);
Dts.TaskResult = (int)ScriptResults.Failure;
}
}
This setup works perfectly fine on my local computer. However, once the package is deployed on the server, the insertion part breaks since the database is nowhere to be found (at least that's what it tells me).
Therefore, I tried to imitate the visual SSIS component inside the data flow task [Destination OLE DB] that uses a connection manager.
Old Setup
OLE DB connection manager setup
OLE DB destination setup
This setup uses the OLE DB driver with the "SQL Server Native Client 11.0" provider (or simply SQLNCLI11.1), "SSPI" integrated security, and the "Table or view - fast load" data access mode. It works perfectly fine both locally and on the server.
Desired Setup
Armed with this knowledge, I have tried to use the OleDbConnection and OleDbCommand classes following this Stack Overflow question, but I can't see how to use those components to bulk insert data into the DB.
I have also tried the visual SSIS component called "Bulk Insert Task", but no luck there either.
How can I possibly insert in bulk using OLE DB?
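As far as I know, the OLE DB fast-load interface (IRowsetFastLoad) is not exposed through System.Data.OleDb, so one pragmatic workaround is to keep SqlBulkCopy but build its connection from the package's existing OLE DB connection manager, so the script uses the same server and catalog that already work on the server. This is a hedged sketch, not a verified fix; "AtlasProxy" is a hypothetical connection manager name and csvData is the DataTable built by the parsing code above:
// Reuse the OLE DB connection manager's connection string, minus OLE DB-only keywords,
// so a plain SqlConnection/SqlBulkCopy can target the same server and catalog.
string oleDbCs = Dts.Connections["AtlasProxy"].ConnectionString;   // hypothetical name
var builder = new System.Data.Common.DbConnectionStringBuilder { ConnectionString = oleDbCs };
builder.Remove("Provider");         // OLE DB-only keyword
builder.Remove("Auto Translate");   // OLE DB-only keyword often present on SQLNCLI strings
using (var dbConnection = new SqlConnection(builder.ConnectionString))
{
    dbConnection.Open();
    using (var bulk = new SqlBulkCopy(dbConnection))
    {
        bulk.DestinationTableName = "[dbo].[TargetTable]";   // hypothetical destination
        bulk.WriteToServer(csvData);                         // the DataTable parsed above
    }
}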
I have an Excel .xlsx file. I want to read data from the file and write data back to the file; no graphics, equations, images, just data.
I tried connecting using the types at System.Data.OleDb:
using System.Data.OleDb;
var fileName = @"C:\ExcelFile.xlsx";
var connectionString =
"Provider=Microsoft.ACE.OLEDB.12.0;" +
$"Data Source={fileName};" +
"Extended Properties=\"Excel 12.0;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text\"";
using var conn = new OleDbConnection(connectionString);
conn.Open();
but I get the following error:
The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine.
I know that I can install the Microsoft Access Database Engine 2016 Redistributable, but I want to do this without installing additional software.
How can I do this?
For starters, you may well have the driver already installed, but only for 32-bit programs, while your program is running under 64-bit (or vice versa, but that's less common).
You can force a specific environment in your .csproj file. To force 32-bit, use:
<PropertyGroup>
<PlatformTarget>x86</PlatformTarget>
</PropertyGroup>
and to force 64-bit:
<PropertyGroup>
<PlatformTarget>x64</PlatformTarget>
</PropertyGroup>
If the driver has been installed in the other environment, your code should connect successfully.
NB. You can list the available providers for the current environment using code like the following:
using System.Data;
using System.Data.OleDb;
using System.Linq;
using static System.Console;
var oleEnum = new OleDbEnumerator();
var data =
oleEnum.GetElements()
.Rows
.Cast<DataRow>()
.Select(row => (
name: row["SOURCES_NAME"] as string,
descr: row["SOURCES_DESCRIPTION"] as string
))
.OrderBy(descr => descr);
foreach (var (name, descr) in data) {
WriteLine($"{name,-30}{descr}");
}
What if you don't have the Microsoft.ACE.OLEDB.12.0 provider in either environment? If you can convert your .xlsx file to an .xls file, you could use the Microsoft Jet 4.0 OLE DB Provider, which has been installed in every version of Windows since Windows 2000 (only available for 32-bit).
Set the PlatformTarget to x86 (32-bit) as above. Then, edit the connection string to use the older provider:
using System.Data.OleDb;
var fileName = @"C:\ExcelFile.xls";
// note the changes to Provider and the first value in Extended Properties
var connectionString =
"Provider=Microsoft.Jet.OLEDB.4.0;" +
$"Data Source={fileName};" +
"Extended Properties=\"Excel 8.0;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text\"";
using var conn = new OleDbConnection(connectionString);
conn.Open();
Once you have an open OleDbConnection, you can read and write data using the standard ADO .NET command idioms for interacting with a data source:
using System.Data;
using System.Data.OleDb;
var ds = new DataSet();
using (var conn = new OleDbConnection(connectionString)) {
conn.Open();
// assuming the first worksheet is called Sheet1
using (var cmd = conn.CreateCommand()) {
cmd.CommandText = "UPDATE [Sheet1$] SET Field1 = \"AB\" WHERE Field2 = 2";
cmd.ExecuteNonQuery();
}
using (var cmd1 = conn.CreateCommand()) {
cmd1.CommandText = "SELECT * FROM [Sheet1$]";
var adapter = new OleDbDataAdapter(cmd1);
adapter.Fill(ds);
}
}
NB. I found that in order to update data I needed to remove the IMEX=1 value from the connection string, per this answer.
If you must use an .xlsx file, and you only need to read data from the Excel file, you could use the ExcelDataReader NuGet package in your project.
Then, you could write code like the following:
using System.IO;
using ExcelDataReader;
// The following line is required on .NET Core / .NET 5+
// see https://github.com/ExcelDataReader/ExcelDataReader#important-note-on-net-core
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
DataSet ds = null;
using (var stream = File.Open(#"C:\ExcelFile.xlsx", FileMode.Open, FileAccess.Read)) {
using var reader = ExcelReaderFactory.CreateReader(stream);
ds = reader.AsDataSet();
}
Another alternative you might consider is to use the Office Open XML SDK, which supports both reading and writing. I think this is a good starting point -- it shows both how to read from a given cell, and how to write information back into a cell.
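For completeness, here is a rough sketch of reading one cell with the Open XML SDK. It assumes the DocumentFormat.OpenXml NuGet package; note that for cells backed by the shared-string table the value read here is an index into that table, so a real reader needs an extra lookup.
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
// Open the workbook read-only and read the raw value of cell A1 on the first sheet.
using (var doc = SpreadsheetDocument.Open(@"C:\ExcelFile.xlsx", false))
{
    WorkbookPart wbPart = doc.WorkbookPart;
    Sheet firstSheet = wbPart.Workbook.Descendants<Sheet>().First();
    var wsPart = (WorksheetPart)wbPart.GetPartById(firstSheet.Id);
    Cell cell = wsPart.Worksheet.Descendants<Cell>()
                      .FirstOrDefault(c => c.CellReference == "A1");
    string rawValue = cell?.CellValue?.Text;   // for shared strings this is an index, not the text
    System.Console.WriteLine(rawValue);
}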
I have an MVC application which allows an admin user to upload two different Excel files onto the system; the controller then creates a dataset from the Excel data and populates either a "Schools" or a "School2" database table with the dataset using SqlBulkCopy.
These uploads work perfectly when I test them locally using IIS Express, but the same version deployed to AWS Elastic Beanstalk throws an error when I press the import button. As far as I am aware, this is because my AWS RDS instance needs access to the OLE DB provider (Jet) drivers, something I cannot arrange because those drivers cannot simply be installed on an AWS RDS instance the way they can on an EC2 instance.
So my plan is to change my upload controller to accept .csv files instead of Excel files. This should solve my problem and allow my upload buttons to work after being deployed on AWS. Could someone help me / point me in the right direction to change my controller to support .csv instead of Excel, please?
Upload Controller:
namespace CampBookingSys.Controllers
{
public class UploadController : Controller
{
SqlConnection con = new SqlConnection(@"Data Source=bookingdb.cwln7mwjvxdd.eu-west-1.rds.amazonaws.com,1433;Initial Catalog=modeldb;User ID=craig1990;Password=27Oct90!;Database=modeldb;Connect Timeout=30;Encrypt=False;TrustServerCertificate=False;ApplicationIntent=ReadWrite;MultiSubnetFailover=False");
OleDbConnection Econ;
public ActionResult Index()
{
return View();
}
[HttpPost]
public ActionResult Index(HttpPostedFileBase file)
{
string filename = Guid.NewGuid() + Path.GetExtension(file.FileName);
string filepath = "/excelfolder/" + filename;
file.SaveAs(Path.Combine(Server.MapPath("/excelfolder"), filename));
InsertExceldata(filepath, filename);
return View();
}
[HttpPost]
public ActionResult Index2(HttpPostedFileBase file)
{
string filename = Guid.NewGuid() + Path.GetExtension(file.FileName);
string filepath = "/excelfolder/" + filename;
file.SaveAs(Path.Combine(Server.MapPath("/excelfolder"), filename));
InsertExceldata2(filepath, filename);
return RedirectToAction("Index");
}
private void ExcelConn(string filepath)
{
string constr = string.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES;""", filepath);
Econ = new OleDbConnection(constr);
}
private void InsertExceldata(string filepath, string filename)
{
string fullpath = Server.MapPath("/excelfolder/") + filename;
ExcelConn(fullpath);
string query = string.Format("Select * from [{0}]", "Sheet1$");
OleDbCommand Ecom = new OleDbCommand(query, Econ);
Econ.Open();
DataSet ds = new DataSet();
OleDbDataAdapter oda = new OleDbDataAdapter(query, Econ);
Econ.Close();
oda.Fill(ds);
DataTable dt = ds.Tables[0];
SqlBulkCopy objbulk = new SqlBulkCopy(con);
objbulk.DestinationTableName = "dbo.Schools";
objbulk.ColumnMappings.Add("AcademicYear", "AcademicYear");
objbulk.ColumnMappings.Add("RollNumber", "RollNumber");
objbulk.ColumnMappings.Add("OfficialSchoolName", "OfficialSchoolName");
objbulk.ColumnMappings.Add("Address1", "Address1");
objbulk.ColumnMappings.Add("Address2", "Address2");
objbulk.ColumnMappings.Add("Address3", "Address3");
objbulk.ColumnMappings.Add("Address4", "Address4");
objbulk.ColumnMappings.Add("County", "County");
objbulk.ColumnMappings.Add("Eircode", "Eircode");
objbulk.ColumnMappings.Add("LocalAuthority", "LocalAuthority");
objbulk.ColumnMappings.Add("X", "X");
objbulk.ColumnMappings.Add("Y", "Y");
objbulk.ColumnMappings.Add("ITMEast", "ITMEast");
objbulk.ColumnMappings.Add("ITMNorth", "ITMNorth");
objbulk.ColumnMappings.Add("Latitude", "Latitude");
objbulk.ColumnMappings.Add("Longitude", "Longitude");
con.Open();
objbulk.WriteToServer(dt);
con.Close();
}
private void InsertExceldata2(string filepath, string filename)
{
string fullpath = Server.MapPath("/excelfolder/") + filename;
ExcelConn(fullpath);
string query = string.Format("Select * from [{0}]", "Sheet1$");
OleDbCommand Ecom = new OleDbCommand(query, Econ);
Econ.Open();
DataSet ds = new DataSet();
OleDbDataAdapter oda = new OleDbDataAdapter(query, Econ);
Econ.Close();
oda.Fill(ds);
DataTable dt = ds.Tables[0];
SqlBulkCopy objbulk = new SqlBulkCopy(con);
objbulk.DestinationTableName = "dbo.School2";
objbulk.ColumnMappings.Add("RollNumber", "RollNumber");
objbulk.ColumnMappings.Add("OfficialSchoolName", "OfficialSchoolName");
objbulk.ColumnMappings.Add("Address1", "Address1");
objbulk.ColumnMappings.Add("Address2", "Address2");
objbulk.ColumnMappings.Add("Address3", "Address3");
objbulk.ColumnMappings.Add("Address4", "Address4");
objbulk.ColumnMappings.Add("County", "County");
objbulk.ColumnMappings.Add("Eircode", "Eircode");
objbulk.ColumnMappings.Add("PhoneNumber", "PhoneNumber");
objbulk.ColumnMappings.Add("Email", "Email");
objbulk.ColumnMappings.Add("PrincipalName", "PrincipalName");
objbulk.ColumnMappings.Add("DeisSchool", "DeisSchool");
objbulk.ColumnMappings.Add("SchoolGender", "SchoolGender");
objbulk.ColumnMappings.Add("PupilAttendanceType", "PupilAttendanceType");
objbulk.ColumnMappings.Add("IrishClassification", "IrishClassification");
objbulk.ColumnMappings.Add("GaeltachtArea", "GaeltachtArea");
objbulk.ColumnMappings.Add("FeePayingSchool", "FeePayingSchool");
objbulk.ColumnMappings.Add("Religion", "Religion");
objbulk.ColumnMappings.Add("OpenClosedStatus", "OpenClosedStatus");
objbulk.ColumnMappings.Add("TotalGirls", "TotalGirls");
objbulk.ColumnMappings.Add("TotalBoys", "TotalBoys");
objbulk.ColumnMappings.Add("TotalPupils", "TotalPupils");
con.Open();
objbulk.WriteToServer(dt);
con.Close();
}
}
}
My first advice is to do bulk inserts in the DBMS, not in code. Doing them via code is only prone to adding additional issues.
As far as parsing the .xlsx files goes, the OleDB driver is probably unnecessary. There are a few basic rules for working with Office formats:
if you can limit it to the new formats (.xlsx), you can use the OpenXML SDK, or any of the wrappers people have made around it, or even just the ZipArchive and XmlReader classes.
if you need to support the old formats (.xls) too, you have to use the (t)rusty Office COM Interop. This has all the usual issues of COM Interop, and additionally needs Office installed and an interactive session.
for any given display technology and problem there might be a third option, but those are few and far between. As we always had the Interop to fall back on, we never developed a complete Office processing class like so many other languages have.
I would put OleDB in that last category - a rare and very specific solution.
I always advise using the first option.
And Office COM Interop should be quietly buried, with the old formats removed from the file format options. Considering this is a web application that will likely run as a service, you will not get the necessary interactive session anyway.
Of course accepting .csv is also an option, and indeed Excel has full CSV support.
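Since the question is specifically how to move the controller from Excel to .csv, here is a rough sketch of what one upload action could look like, reusing the TextFieldParser approach from the script task earlier on this page together with the existing SqlBulkCopy mappings. The action name, the assumption that the first CSV row holds the column headers, and the reuse of the existing con field are my own assumptions, not code from the question:
// Requires: using Microsoft.VisualBasic.FileIO; using System.Data; using System.Data.SqlClient;
[HttpPost]
public ActionResult IndexCsv(HttpPostedFileBase file)
{
    var csvData = new DataTable();
    using (var parser = new TextFieldParser(file.InputStream))
    {
        parser.HasFieldsEnclosedInQuotes = true;
        parser.SetDelimiters(",");
        string[] headers = parser.ReadFields();          // assumes first row = column names
        foreach (string header in headers)
            csvData.Columns.Add(header, typeof(string));
        while (!parser.EndOfData)
            csvData.Rows.Add(parser.ReadFields());
    }
    using (var bulk = new SqlBulkCopy(con))              // 'con' is the controller's SqlConnection field
    {
        bulk.DestinationTableName = "dbo.Schools";       // or "dbo.School2" for the second upload
        foreach (DataColumn column in csvData.Columns)
            bulk.ColumnMappings.Add(column.ColumnName, column.ColumnName);
        con.Open();
        bulk.WriteToServer(csvData);
        con.Close();
    }
    return RedirectToAction("Index");
}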
I have an SSIS package with a script task. The C# script uses the ACE OLEDB 12.0 provider to connect to an Excel file. The question is: how do I connect to the Excel file in read-only mode, so that if someone has the file open, my script does not throw an error and still works? The code I tried:
string fileToTest = Dts.Variables["User::FileName"].Value.ToString();
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" +
"Data Source=" + fileToTest + #";Extended Properties=""Excel 8.0;READONLY=1""";
OleDbConnection excelConnection = new OleDbConnection(connectionString);
excelConnection.Open();
string sqlQuery = "SELECT * FROM [SheetName$A1:FZ1000]";
OleDbDataAdapter dataAdt = new OleDbDataAdapter(sqlQuery, excelConnection);
DataSet dataSt = new DataSet();
dataAdt.Fill(dataSt, "TblName1");
DataTable dataTbl = dataSt.Tables["TblName1"];
I receive an OleDbException if someone has the file open.
Use google to search for that.
I searched for: "microsoft.ace.oledb.12.0 read only"
https://social.msdn.microsoft.com/Forums/office/en-US/498cd52a-b0ee-4c8d-8943-2b76055b4130/oledbconnection-read-only-mode?forum=accessdev
It looks like you can add to the connection string.
From that page:
Actually, with an OleDbConnection (assuming .net here). You can specify a read only mode in your connection string of the OleDbConnection. The following connection string will prevent you from changing data in your datasource:
const string cnnString = "Provider=Microsoft.ACE.OLEDB.12.0"
+ ";Mode=Read"
+ #";Data Source=|DataDirectory|\Northwind 2010.accdb";
It looks like adding ;Mode=Read to the connection string should do the trick.
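Applied to the connection string from the question, that might look like this (an untested sketch, keeping the question's Excel 8.0 / READONLY=1 extended properties as-is):
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Mode=Read;" +
    "Data Source=" + fileToTest + @";Extended Properties=""Excel 8.0;READONLY=1""";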
I'm creating an application where the user clicks on a button, browses for an Excel file, and the data is copied into the data table created in the database.
I am using VS2008 and SQL Server 2005.
I wrote the code for opening the file, of course, and created a DataTable and its DataColumns in the .cs file. What else should I do?
Thank you.
You could do as Krishna advises... but that code will only work if the columns in the Excel file and the database columns are the same in number and order. For ease of maintainability and mapping down the line, I highly recommend you use a combination of Linq to Excel and Linq to SQL, as in this article:
http://solidcoding.blogspot.com/2008/01/linq-to-excel-sql-import.html
You can write like the code below to dump the data into the SQL Server table
string excelConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" + Server.MapPath("ImportFile.xls") + @";Extended Properties=""Excel 8.0;HDR=Yes;""";
using (OleDbConnection connection = new OleDbConnection(excelConnectionString))
{
OleDbCommand command = new OleDbCommand("Select * FROM [NameOfTheDataSheet$]", connection);
connection.Open();
using (DbDataReader dataReader = command.ExecuteReader())
{
string sqlConnectionString = "SQL Connection String";
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnectionString))
{
bulkCopy.DestinationTableName = "ExcelDataTable";
bulkCopy.WriteToServer(dataReader);
}
}
}
Let me know if you have different requirements.