Read Excel file data in batches - C#

I have an Excel file that contains about 1.5 million records. I want to read the first 100,000 records into a DataTable (dt in the code below), process them, then read the next 100,000, and so on until every record in the file has been read, never loading more than 100,000 rows at a time.
Currently I fetch all records at once using the code below:
public bool ReadDataFile(string filePath)
{
    string strConnString = null;
    string sheetName = null;
    string ErrSheetName = null;
    DataSet dsEx = new DataSet();
    strConnString = "Provider=Microsoft.Ace.OLEDB.12.0;Data Source=" + filePath + ";Mode=ReadWrite;Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\"";
    OleDbConnection conn = new OleDbConnection(strConnString);
    try
    {
        conn.Open();
        DataTable DtSheetName = new DataTable();
        DataTable dt = new DataTable();
        DtSheetName = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        for (int i = 0; i <= DtSheetName.Rows.Count - 1; i++)
        {
            sheetName = DtSheetName.Rows[i]["TABLE_NAME"].ToString();
            if (sheetName.Length >= 4)
            {
                conn.Close();
                conn.Open();
                using (OleDbCommand cmd = new OleDbCommand("SELECT * FROM [" + sheetName + "]", conn))
                {
                    using (OleDbDataAdapter da = new OleDbDataAdapter())
                    {
                        da.SelectCommand = cmd;
                        try
                        {
                            da.Fill(dt);
                            dt.TableName = sheetName.Replace("$", "");
                        }
                        catch (Exception ex)
                        {
                            ErrSheetName = ErrSheetName + "," + sheetName.Replace("$", "");
                        }
                    }
                }
            }
        }
        return true;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message, "ReadExcel()");
        return false;
    }
    finally
    {
        if (!string.IsNullOrEmpty(ErrSheetName))
        {
            MessageBox.Show("Error in Data Of these files [" + ErrSheetName + "] Some Data may be lost !", "ReadExcel()");
        }
        conn.Close();
        GC.Collect();
    }
}
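One way to approach the batching described in the question is to stream rows with a data reader instead of filling one DataTable with da.Fill(dt), and to cut that stream into tables of at most 100,000 rows. The following is only a minimal sketch, not something tested against a workbook of this size: BatchReader and ReadInBatches are hypothetical names, and it assumes the reader comes from a call such as cmd.ExecuteReader() on the existing OleDbCommand.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class BatchReader
{
    // Lazily yields DataTables holding at most batchSize rows each,
    // reading forward through the supplied reader exactly once.
    public static IEnumerable<DataTable> ReadInBatches(IDataReader reader, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        DataTable batch = CreateEmptyTable(reader);
        var values = new object[reader.FieldCount];

        while (reader.Read())
        {
            reader.GetValues(values);   // copy the current row into the buffer
            batch.Rows.Add(values);     // Rows.Add copies the values into a new row

            if (batch.Rows.Count == batchSize)
            {
                yield return batch;
                batch = CreateEmptyTable(reader); // start a fresh table for the next batch
            }
        }

        // Hand back the final, possibly smaller, batch.
        if (batch.Rows.Count > 0)
            yield return batch;
    }

    // Builds a table with the same column names and types as the reader.
    private static DataTable CreateEmptyTable(IDataReader reader)
    {
        var table = new DataTable();
        for (int i = 0; i < reader.FieldCount; i++)
            table.Columns.Add(reader.GetName(i), reader.GetFieldType(i));
        return table;
    }
}
```

With the code from the question, this might be called as foreach (DataTable batch in BatchReader.ReadInBatches(cmd.ExecuteReader(), 100000)) { /* process batch */ }. Each batch is a new DataTable, so earlier batches can be collected once they have been processed.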

Related

Importing excel file into SQL with different schema (asp.net C#)

I'm building a program in which the user imports an Excel file into a database (SQL Server). I do not want to hard-code the column names, because that would limit my program to files with exactly those columns.
I don't know how to work with DataTable and rows very well, but I think that is the only way, right? (I'm a newbie, sorry.)
Here's the code:
protected void Upload_Click(object sender, EventArgs e)
{
string sSheetName;
DataTable dtTablesList = new DataTable();
string excelPath = Server.MapPath("~/Nova pasta/") + Path.GetFileName(FileUpload1.PostedFile.FileName);
string filepath = Server.MapPath("~/Nova pasta/") + Path.GetFileName(FileUpload1.FileName);
string filename = Path.GetFileName(filepath);
FileUpload1.SaveAs(excelPath);
string ext = Path.GetExtension(filename);
String strConnection = @"Data Source=PEDRO-PC\SQLEXPRESS;Initial Catalog=costumizado;Persist Security Info=True;User ID=sa;Password=1234";
string excelConnectionString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1;\"";
OleDbConnection excelConnection = new OleDbConnection(excelConnectionString);
excelConnection.Open();
dtTablesList = excelConnection.GetSchema("Tables");
if (dtTablesList.Rows.Count > 0)
{
sSheetName = dtTablesList.Rows[0]["TABLE_NAME"].ToString();
for (int j = 0; j < dtTablesList.Rows.Count; j++)
{
for (int i = 0; i < dtTablesList.Columns.Count; i++)
{
Debug.Write(dtTablesList.Columns[i].ColumnName + " ");
Debug.WriteLine(dtTablesList.Rows[j].ItemArray[i]);
}
}
Debug.WriteLine(dtTablesList.Rows.Count);
foreach (DataRow dataRow in dtTablesList.Rows)
{
foreach (var item in dataRow.ItemArray)
{
Debug.WriteLine(item);
}
}
OleDbCommand cmd = new OleDbCommand("Select * from [" + sSheetName + "]", excelConnection);
cmd.ExecuteNonQuery();
}
}
I am not sure exactly what you are looking for.
In the past I have used a library called EPPlus.
I used the solution linked below to turn a worksheet into a DataTable.
See: Excel to DataTable using EPPlus - excel locked for editing
Edit:
Once you have the worksheet in a DataTable, you can do something like this:
using (SqlConnection con = new SqlConnection("connectionstring"))
{
    try
    {
        con.Open();
        foreach (DataRow dr in worksheetTable.Rows)
        {
            // Parameterized insert: one command per worksheet row
            using (SqlCommand myCommand = new SqlCommand("insert into myTable (Products, column2) values (@prod, @col2)", con))
            {
                myCommand.CommandType = CommandType.Text;
                myCommand.Parameters.AddWithValue("@prod", dr[0]);
                myCommand.Parameters.AddWithValue("@col2", dr[1]);
                int result = myCommand.ExecuteNonQuery();
            }
        }
    }
    catch (SqlException ex)
    {
        //handle errors here
    }
    catch (Exception ex)
    {
        //handle errors here
    }
    finally
    {
        con.Close();
    }
}
(The code might not be perfect; I did not test it.)
It depends a lot on the structure of your Excel file and database.
I hope this helps.
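To avoid hard-coding column names, as the question asks, the INSERT statement can be built from whatever columns the DataTable happens to have. This is a minimal sketch under the assumption that the destination table has columns named exactly like the Excel headers; InsertSqlBuilder and BuildInsertSql are hypothetical names, and the table name is assumed to come from trusted code rather than user input.

```csharp
using System;
using System.Data;
using System.Linq;

public static class InsertSqlBuilder
{
    // Builds "INSERT INTO [table] ([c0], [c1]) VALUES (@p0, @p1)" from the
    // columns of the given DataTable, so no column name is hard-coded.
    public static string BuildInsertSql(string tableName, DataTable table)
    {
        if (table.Columns.Count == 0)
            throw new ArgumentException("Table has no columns.", nameof(table));

        var columns = table.Columns.Cast<DataColumn>()
            .Select(c => Quote(c.ColumnName));
        var parameters = Enumerable.Range(0, table.Columns.Count)
            .Select(i => "@p" + i);

        return "INSERT INTO " + Quote(tableName)
            + " (" + string.Join(", ", columns) + ")"
            + " VALUES (" + string.Join(", ", parameters) + ")";
    }

    // Wraps an identifier in brackets, doubling any closing bracket inside it.
    private static string Quote(string identifier)
    {
        return "[" + identifier.Replace("]", "]]") + "]";
    }
}
```

For each row, the parameters @p0, @p1, and so on would then be filled from dr[0], dr[1], and so on. For large files, SqlBulkCopy is likely to be considerably faster than row-by-row inserts.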

How can we Validate Column names of Excel sheet with Table values using OLEDB

I want to validate the column names of an uploaded Excel sheet against a table definition.
I get the table definition from the database, and I get the column names from the Excel sheet using OLEDB.
I want to check whether every column in the table definition is present among the Excel columns. I already have both sets of column names (from Excel and from the database table).
Here is the code I tried:
//for validating column names
public bool ValidateColumnNames(string filename,DataExchangeDefinition dataExchangeDefinition)
{
string extension = Path.GetExtension(filename);
string connstring = string.Empty;
try
{
switch (extension)
{
case ".xls":
connstring = string.Format(ConfigurationManager.ConnectionStrings["Excel03ConString"].ConnectionString, filename);
break;
case ".xlsx":
connstring = string.Format(ConfigurationManager.ConnectionStrings["Excel07+ConString"].ConnectionString, filename);
break;
}
using (OleDbConnection connExcel = new OleDbConnection(connstring))
{
using (OleDbCommand cmd = new OleDbCommand())
{
cmd.Connection = connExcel;
connExcel.Open();
var dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
connExcel.Close();
string firstSheet = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
cmd.CommandText = "SELECT top 1 * FROM [" + firstSheet + "]";
using (OleDbDataAdapter da = new OleDbDataAdapter(cmd))
{
DataTable HeaderColumns = new DataTable();
da.SelectCommand = cmd;
da.Fill(HeaderColumns);
foreach (DataColumn column in HeaderColumns.Columns)
{
//Here i want to validate the column names
dataExchangeDefinition.FieldName = column.Caption.ToString();
}
}
}
}
}
catch (Exception ex)
{
throw; // rethrow without resetting the stack trace
}
return true;
}
Here is the answer; it works fine for me:
//for validating column names
public List<DataExchangeDefinition> ValidateColumnNames(string filename, List<DataExchangeDefinition> dataExchangeDefinitionList)
{
DataExchangeDefinition dt = new DataExchangeDefinition();
//List<DataExchangeDefinition> dataexchangedefinitionList = new List<DataExchangeDefinition>();
string extension = Path.GetExtension(filename);
string connstring = string.Empty;
try
{
switch (extension)
{
case ".xls":
connstring = string.Format(ConfigurationManager.ConnectionStrings["Excel03ConString"].ConnectionString, filename);
break;
case ".xlsx":
connstring = string.Format(ConfigurationManager.ConnectionStrings["Excel07+ConString"].ConnectionString, filename);
break;
}
using (OleDbConnection connExcel = new OleDbConnection(connstring))
{
using (OleDbCommand cmd = new OleDbCommand())
{
cmd.Connection = connExcel;
connExcel.Open();
var dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null); //To get the sheet name
connExcel.Close();
string firstSheet = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
cmd.CommandText = "SELECT top 1 * FROM [" + firstSheet + "]";
using (OleDbDataAdapter da = new OleDbDataAdapter(cmd))
{
DataTable HeaderColumns = new DataTable();
da.SelectCommand = cmd;
da.Fill(HeaderColumns);
//List<DataColumn> excelColumn = new List<DataColumn>();
//var excelColumn = new List<DataColumn>();
//foreach (DataColumn column in HeaderColumns.Columns)
//{
// excelColumn.Add(column);
//}
foreach (DataExchangeDefinition data in dataExchangeDefinitionList)
//for(int i=0;i<dataExchangeDefinitionList.Count;i++)
{
data.IsColumnValid = false; // assume invalid until a matching Excel column is found
//var result = from excelColumn in HeaderColumns;
foreach (DataColumn column in HeaderColumns.Columns)
{
if (data.FieldCaption == column.Caption)
{
data.IsColumnValid = true;
break;
}
}
}
return dataExchangeDefinitionList;
}
}
}
}
catch (Exception ex)
{
throw; // rethrow without resetting the stack trace
}
//return isColumnValid;
}
}
}
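The nested loops in the answer above can also be expressed as a set lookup that reports which required columns are absent. This is a minimal sketch, not the poster's method: ColumnValidator and FindMissingColumns are hypothetical names, and trimming whitespace and ignoring case are assumptions about how the names should be compared.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ColumnValidator
{
    // Returns the required column names that do not appear in the header table.
    // Comparison ignores case and surrounding whitespace (an assumption).
    public static List<string> FindMissingColumns(DataTable header, IEnumerable<string> requiredNames)
    {
        var present = new HashSet<string>(
            header.Columns.Cast<DataColumn>().Select(c => c.ColumnName.Trim()),
            StringComparer.OrdinalIgnoreCase);

        return requiredNames
            .Where(name => !present.Contains(name.Trim()))
            .ToList();
    }
}
```

An empty result means every required column was found; otherwise the list can be shown to the user as the set of missing headers.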

Unable to import excel to DB table in server

I am using the code below to import Excel data into a database table. The code works fine on my local machine, but when I move it to the server the import stops partway through. I don't get any error message; instead I get the "Data saved successfully" message.
For example, the Excel file has 75,000 rows, but only 13,500 records get inserted. The file size is 5 MB.
Any suggestions about possible causes?
CS:
protected void btnImportData_Click(object sender, EventArgs e)
{
try
{
string strCS = string.Empty;
string strFileType = Path.GetExtension(FileUploadExcel.FileName).ToLower();
string query = "";
lblError.Text = "";
string FileName = string.Empty;
FileName = Path.GetFileName(FileUploadExcel.PostedFile.FileName);
string Extension = Path.GetExtension(FileUploadExcel.PostedFile.FileName);
string FolderPath = ConfigurationManager.AppSettings["FolderPath"];
string path = Path.GetFileName(Server.MapPath(FileUploadExcel.FileName));
System.IO.File.Delete(Server.MapPath(FolderPath) + path);
FileUploadExcel.SaveAs(Server.MapPath(FolderPath) + path);
string filePath = Server.MapPath(FolderPath) + path;
if (strFileType != String.Empty)
{
if (strFileType.Trim() == ".xls")
{
strCS = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=2\"";
}
else if (strFileType.Trim() == ".xlsx")
{
strCS = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=2\"";
}
else
{
ClientScript.RegisterStartupScript(this.GetType(), "alert", "alert('Please upload the correct file format')", true);
return;
}
try
{
OleDbConnection conn = new OleDbConnection(strCS);
if (conn.State == ConnectionState.Closed)
conn.Open();
System.Data.DataTable dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetname = dt.Rows[0]["Table_Name"].ToString();
query = "SELECT * FROM [" + sheetname + "]";
OleDbCommand cmd = new OleDbCommand(query, conn);
OleDbDataAdapter da = new OleDbDataAdapter(cmd);
DataSet ds = new DataSet();
da.Fill(ds);
if (conn.State == ConnectionState.Open)
{
conn.Close();
conn = null;
}
string strSqlTable = "TABLENAME";
string sclearsql = "delete from " + strSqlTable;
SqlConnection sqlconn = new SqlConnection(strCon);
SqlCommand sqlcmd = new SqlCommand(sclearsql, sqlconn);
sqlconn.Open();
sqlcmd.ExecuteNonQuery();
sqlconn.Close();
OleDbConnection oledbconn = new OleDbConnection(strCS);
oledbconn.Open();
OleDbCommand oledbcmd = new OleDbCommand(query, oledbconn);
oledbcmd.CommandTimeout = 120;
OleDbDataReader dReader = oledbcmd.ExecuteReader();
SqlBulkCopy bulkCopy = new SqlBulkCopy(strCon);
bulkCopy.DestinationTableName = strSqlTable;
bulkCopy.BulkCopyTimeout = 120;
bulkCopy.BatchSize = 1000;
bulkCopy.WriteToServer(dReader);
oledbconn.Close();
ClientScript.RegisterStartupScript(Page.GetType(), "alert", "alert('Data saved successfully');window.location='Panel.aspx';", true);
}
catch (Exception ex)
{
lblError.Text = "Upload status: The file could not be uploaded due to following reasons.Please check: " + ex.Message;
}
}
else
{
ClientScript.RegisterStartupScript(this.GetType(), "alert", "alert('Please select a file to import the data.')", true);
}
}
catch (Exception ex)
{
lblError.Text = "Upload status: The file could not be uploaded due to following reasons.Please check: " + ex.Message;
}
}
Hi, remove the data reader from your code, which reads the Excel file a second time, and use a DataTable to fill your SQL table instead.
See below:
DataTable dtexcel = new DataTable();
using (OleDbConnection conn = new OleDbConnection(strCS))
{
    conn.Open();
    DataTable schemaTable = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
    // Take the first sheet listed in the Excel file
    DataRow schemaRow = schemaTable.Rows[0];
    string sheet = schemaRow["TABLE_NAME"].ToString();
    if (!sheet.EndsWith("_"))
    {
        string query = "SELECT * FROM [" + sheet + "]";
        OleDbDataAdapter daexcel = new OleDbDataAdapter(query, conn);
        dtexcel.Locale = CultureInfo.CurrentCulture;
        daexcel.Fill(dtexcel);
    }
}
// Clear the destination table (sclearsql, strCon and strSqlTable come from the question's code)
using (SqlConnection sqlconn = new SqlConnection(strCon))
{
    SqlCommand sqlcmd = new SqlCommand(sclearsql, sqlconn);
    sqlconn.Open();
    sqlcmd.ExecuteNonQuery();
}
using (SqlConnection sqlconn = new SqlConnection(strCon))
{
    using (SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(sqlconn))
    {
        // Set the database table name
        sqlBulkCopy.DestinationTableName = strSqlTable;
        sqlconn.Open();
        sqlBulkCopy.WriteToServer(dtexcel);
    }
}
Have you tried the DTS wizard (the SQL Server Import and Export Wizard)? It is a nice tool if you have an Excel sheet with data.
To start it, open a command prompt (Windows key + R, then type cmd).
In the command prompt, type dtswizard.
A wizard then opens in a new window, where you can select the Excel sheet that you want to import. There are many tutorials available online.
I hope this is useful.

External table is not in expected format

I have to upload an Excel sheet to a document library and show all of its rows in a GridView. But I am getting the error "external table is not in expected format".
My code is:
void uploadDocument()
{
try
{
string publicFSdocLibrary = "PublicFSdoc";
if (uploadDoc.HasFile)
{
SPSecurity.RunWithElevatedPrivileges(delegate()
{
using (SPSite oSite = new SPSite("http://chdsez301298d:1000/sites/Test/"))
//using (SPSite oSite = new SPSite(SPContext.Current.Site.ID))
{
using (SPWeb myWeb = oSite.OpenWeb())
{
SPListItem lstItem = SPContext.Current.ListItem;
myWeb.AllowUnsafeUpdates = true;
//SPList docList = myWeb.Lists[docLibrary];
SPFolder finalDocument = myWeb.Folders[publicFSdocLibrary];
// Prepare to upload
Boolean replaceExistingFiles = true;
string fileName = Path.GetFileName(uploadDoc.FileName);
filePath = Path.GetFullPath(uploadDoc.PostedFile.FileName);
// string newFilePath = getFilePath(filePath);
FileStream fileStream = System.IO.File.OpenRead(filePath);
// Upload document
SPFile spfile = finalDocument.Files.Add(getFileName(), fileStream, replaceExistingFiles);
SPListItem item = spfile.Item;
item["RequestNo"] = generateRequestNo();
item["AssessmentType"] = "Financial Solvency";
// Commit all changes
item.Update();
//Update document url to Assessment request list
myWeb.AllowUnsafeUpdates = false;
}
}
});
}
}
catch (Exception ex)
{
ex.Message.ToString();
}
}
string getFileName()
{
string newFileName = generateRequestNo() + ".xlsx";
return newFileName;
}
void btnUpload_Click(object sender, EventArgs e)
{
uploadDocument();
getExcelData();
//savePublicFS();
//addAssesmentWfRequest();
//CloseForm();
//Page.ClientScript.RegisterStartupScript(this.GetType(), "close", "window.close();", true);
}
void getExcelData()
{
try
{
if (uploadDoc.HasFile)
{
OleDbConnection conn = new OleDbConnection();
OleDbCommand cmd = new OleDbCommand();
OleDbDataAdapter da = new OleDbDataAdapter();
DataSet ds = new DataSet();
//string sheetName = "Sourcing Questionnaire";
string query = null;
string connString = "";
string fileName = Path.GetFileName(uploadDoc.FileName);
string strFileType = System.IO.Path.GetExtension(uploadDoc.FileName).ToString().ToLower();
string strNewPath = filePath;
//Connection String to Excel Workbook
if (strFileType.Trim() == ".xls")
{
connString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strNewPath + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=2\"";
}
else if (strFileType.Trim() == ".xlsx")
{
connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + strNewPath + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=2\"";
}
// query = "SELECT * FROM [" + sheetName + "$]";
query = "SELECT * FROM [abc$]";
//Create the connection object
conn = new OleDbConnection(connString);
//Open connection
if (conn.State == ConnectionState.Closed) conn.Open();
//Create the command object
cmd = new OleDbCommand(query, conn);
da = new OleDbDataAdapter(cmd);
ds = new DataSet();
da.Fill(ds);
gvdetails.DataSource = ds.Tables[0];
gvdetails.DataBind();
da.Dispose();
conn.Close();
conn.Dispose();
//return ds;
}
}
catch (Exception)
{
throw;
}
}
private String generateRequestNo()
{
try
{
String requestNo = String.Empty;
if (txtSupplierName.Text.Length <= 4)
{
requestNo = txtSupplierName.Text
+ "-" + requestNumber()
+ DateTime.Today.ToString("ddMMyyyy");
}
else
{
requestNo = txtSupplierName.Text.Substring(0, 4)
+ "-" + requestNumber()
+ DateTime.Today.ToString("ddMMyyyy");
}
return requestNo;
}
catch (Exception)
{
throw;
}
}
private String requestNumber()
{
try
{
String number = String.Empty;
number = "FS-";
return number;
}
catch (Exception)
{
throw;
}
}
I am getting the error in the getExcelData() method at the line cmd = new OleDbCommand(query, conn);
Please help me, as I have been stuck on this since last week.
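In the code above, connString stays empty for any extension other than .xls or .xlsx, and the path comes from PostedFile.FileName, which may not point at a readable workbook on the server. Whether either of these is the actual cause here is an assumption, but this error is commonly reported when the provider settings and the real file do not match. A minimal sketch that at least fails early on unsupported extensions; ExcelConnectionStrings and Build are hypothetical names, and the Extended Properties values are copied from other snippets on this page, not verified for this workbook.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Picks an OLE DB connection string from the file extension and
    // rejects anything that is not .xls or .xlsx instead of returning "".
    public static string Build(string path)
    {
        if (string.IsNullOrEmpty(path))
            throw new ArgumentException("A file path is required.", nameof(path));

        string extension = Path.GetExtension(path).ToLowerInvariant();
        switch (extension)
        {
            case ".xls":
                return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path
                    + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\"";
            case ".xlsx":
                return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path
                    + ";Extended Properties=\"Excel 12.0 Xml;HDR=Yes;IMEX=1\"";
            default:
                throw new NotSupportedException("Unsupported Excel extension: " + extension);
        }
    }
}
```

It may also be worth saving the upload to a known server folder first and passing that saved path to Build, so the connection opens a file that is known to exist on the server.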

How to get sheetname of the uploaded excel file using C#?

I would like to get the sheet name of an uploaded Excel file using C#. The file may be in .xls or .xlsx format. The code I have used is as follows:
protected void btnGenerateCSV_Click(object sender, EventArgs e)
{
string sourceFile = ExcelFileUpload.PostedFile.FileName;
string worksheetName = ?? //(How to get the first sheetname of the uploaded file)
string strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + sourceFile + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\"";
if (sourceFile.Contains(".xlsx"))
strConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + sourceFile + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=1\"";
try
{
conn = new OleDbConnection(strConn);
conn.Open();
cmd = new OleDbCommand("SELECT * FROM [" + worksheetName + "$]", conn);
cmd.CommandType = CommandType.Text;
wrtr = new StreamWriter(targetFile);
da = new OleDbDataAdapter(cmd);
DataTable dt = new DataTable();
da.Fill(dt);
for (int x = 0; x < dt.Rows.Count; x++)
{
string rowString = "";
for (int y = 0; y < dt.Columns.Count; y++)
{
rowString += "\"" + dt.Rows[x][y].ToString() + "\",";
}
wrtr.WriteLine(rowString);
}
}
catch (Exception exp)
{
}
finally
{
if (conn.State == ConnectionState.Open)
conn.Close();
conn.Dispose();
cmd.Dispose();
da.Dispose();
wrtr.Close();
wrtr.Dispose();
}
}
(How to get the first sheetname of the uploaded file)
string worksheetName = ??
I use this to get the sheet names from an .xlsx file and loop through all the names to read the sheets one by one.
OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filename + ";Extended Properties='Excel 12.0 xml;HDR=YES;'");
connection.Open();
DataTable Sheets = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow dr in Sheets.Rows)
{
    // Column 2 of the schema table is TABLE_NAME
    string sht = dr[2].ToString().Replace("'", "");
    OleDbDataAdapter dataAdapter = new OleDbDataAdapter("select * from [" + sht + "]", connection);
}
DataTable Sheets = oleConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
for (int i = 0; i < Sheets.Rows.Count; i++)
{
    string worksheets = Sheets.Rows[i]["TABLE_NAME"].ToString();
    string sqlQuery = String.Format("SELECT * FROM [{0}]", worksheets);
}
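The schema table returned by GetOleDbSchemaTable can contain entries other than worksheets, such as named ranges, and as far as I know its rows are ordered by name rather than by position in the workbook, so Rows[0] is not guaranteed to be the first sheet. A minimal sketch that keeps only the worksheet entries, assuming the usual provider convention that worksheet names end in $ and are wrapped in single quotes when they contain spaces; SheetNames and GetWorksheetNames are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class SheetNames
{
    // Filters the TABLE_NAME column of an OLE DB schema table down to
    // worksheet entries (names ending in "$"), with surrounding quotes removed.
    public static List<string> GetWorksheetNames(DataTable schema)
    {
        var names = new List<string>();
        foreach (DataRow row in schema.Rows)
        {
            string name = row["TABLE_NAME"].ToString().Trim('\'');
            if (name.EndsWith("$", StringComparison.Ordinal))
                names.Add(name); // usable directly as [name] in a SELECT
        }
        return names;
    }
}
```

Each returned name already includes the trailing $, so a query would be written as "SELECT * FROM [" + name + "]" without appending another one.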
If the Excel file is very large, this code will spend a lot of time in conn.Open(). Using OpenXML is faster, but if the file is currently open in Excel, reading it with OpenXML throws an exception, whereas the OleDb approach does not. Sorry, my English is poor.
I use the Microsoft Excel interop library, Microsoft.Office.Interop.Excel. You can then get the worksheet name by index, as follows.
using Excel = Microsoft.Office.Interop.Excel; // alias; belongs at the top of the file

string path = @"C:\Desktop\MyExcel.xlsx"; // path to the Excel file
Excel.Application xlAPP = new Excel.Application();
xlAPP.Visible = false;
Excel.Workbook xlWbk = xlAPP.Workbooks.Open(path);
// Pass the sheet index here. Remember that the index starts from 1.
string worksheetName = ((Excel.Worksheet)xlWbk.Worksheets.get_Item(1)).Name;
xlAPP.Quit();
releaseObject(xlWbk);
releaseObject(xlAPP);

// Always release COM objects explicitly.
private void releaseObject(object obj)
{
    try
    {
        System.Runtime.InteropServices.Marshal.ReleaseComObject(obj);
        obj = null;
    }
    catch (Exception ex)
    {
        obj = null;
        MessageBox.Show("Unable to release the Object " + ex.ToString());
    }
    finally
    {
        GC.Collect();
    }
}
