Reading multiple Excel worksheets inside a single xlsx using c# [duplicate] - c#

This question already has answers here:
Reading multiple excel sheets with different worksheet names
(2 answers)
Closed 8 years ago.
Using c# I can successfully open an excel document and read the data in the first worksheet with the code below. However, my .xlsx has multiple worksheets so I would like to loop through the worksheet collection rather than hard coding the name of each worksheet. Many thanks.
string path = #"C:\Extract\Extract.xlsx";
string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";Extended Properties=Excel 12.0;";
string sql = "SELECT * FROM [Sheet1$]";
using (OleDbDataAdapter adaptor = new OleDbDataAdapter(sql, connStr))
{
DataSet ds = new DataSet();
adaptor.Fill(ds);
DataTable dt = ds.Tables[0];
}

I used most of the code in the answer here [Reading multiple excel sheets with different worksheet names that was kindly pointed out to me in a comment on my question.
It wouldn't compile for me in VS 2013 though as the DataRow object does not have have the property Item (- r.Item(0).ToString in that code). So I just changed that little bit. It also brought back some worksheet that had Print_Area in its name which wasn't valid so I took that out of my loop. Here is the code as it worked for me.
string path = #"C:\Extract\Extract.xlsx";
string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";Extended Properties=Excel 12.0;";
DataTable sheets = GetSchemaTable(connStr);
string sql = string.Empty;
DataSet ds = new DataSet();
foreach (DataRow dr in sheets.Rows)
{ //Print_Area
string WorkSheetName = dr["TABLE_NAME"].ToString().Trim();
if (!WorkSheetName.Contains("Print_Area"))
{
sql = "SELECT * FROM [" + WorkSheetName + "]";
ds.Clear();
OleDbDataAdapter data = new OleDbDataAdapter(sql, connStr);
data.Fill(ds);
DataTable dt1 = ds.Tables[0];
foreach (DataRow dr1 in dt1.Rows)
{
//parsing work
}
}
}
static DataTable GetSchemaTable(string connectionString)
{
using (OleDbConnection connection = new
OleDbConnection(connectionString))
{
connection.Open();
DataTable schemaTable = connection.GetOleDbSchemaTable(
OleDbSchemaGuid.Tables,
new object[] { null, null, null, "TABLE" });
return schemaTable;
}
}

I'm about to work on almost the same problem.
I found the guide at http://www.dotnetperls.com/excel quite useful.
In short, to open worksheet no. 3, add the following code after opening the excel workbook:
var worksheet = workbook.Worksheets[3] as
Microsoft.Office.Interop.Excel.Worksheet;
Hope this answered your question.

I'd recommend using EPPlus (available via Nuget https://www.nuget.org/packages/EPPlus/ ) it's a great wrapper tool for working with .xlsx spreadsheets in .Net .In it worksheets are a collection and so you can do what you want by just looping round them, regardless of name or index.
For example,
using (ExcelPackage package = new ExcelPackage(new FileInfo(sourceFilePath)))
{
foreach (var excelWorksheet in package.Workbook.Worksheets)
...
}

You should try the Open XML Format SDK (Nuget: Link). The link below explains both reading and writing Excel documents:
http://www.codeproject.com/Articles/670141/Read-and-Write-Microsoft-Excel-with-Open-XML-SDK
Oh by the way, office doesn't have to be installed to use...

Related

how to Read xls file using OLEDB?

I want to read all data from an xls file using OLEDB, but I don't have any experience in that.
string filename = #"C:\Users\sasa\Downloads\user-account-creation_2.xls";
string connString = #"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filename + ";Extended Properties='Excel 8.0;HDR=YES'";
using (System.Data.OleDb.OleDbConnection conn = new System.Data.OleDb.OleDbConnection(connString))
{
conn.Open();
System.Data.OleDb.OleDbCommand selectCommand = new System.Data.OleDb.OleDbCommand("select * from [Sheet1$]", conn);
System.Data.OleDb.OleDbDataAdapter adapter = new System.Data.OleDb.OleDbDataAdapter(selectCommand);
DataTable dt = new DataTable();
adapter.Fill(dt);
int counter = 0;
foreach (DataRow row in dt.Rows)
{
String dataA = row["email"].ToString();
// String dataB= row["DataB"].ToString();
Console.WriteLine(dataA + " = ");
counter++;
if (counter >= 40) break;
}
}
I want to read all data from email row
I get this error
'Sheet$' is not a valid name. Make sure that it does not include invalid characters or punctuation and that it is not too long
Well, you don't have a sheet called Sheet1 do you? Your sheet seems to be called "email address from username" so your query should be....
Select * From ['email address from username$']
Also please don't use Microsoft.Jet.OLEDB.4.0 as it's pretty much obsolete now. Use Microsoft.ACE.OLEDB.12.0. If you specify Excel 12.0 in the extended properties it will open both .xls and .xlsx files.
You can also load the DataTable with a single line...
dt.Load(new System.Data.OleDb.OleDbCommand("Select * From ['email address from username$']", conn).ExecuteReader());
To read the names of the tables in the file use...
DataTable dtTablesList = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drTable in dtTablesList.Rows)
{
//Do Something
//But be careful as this will also return Defined Names. i.e ranges created using the Defined Name functionality
//Actual Sheet names end with $ or $'
if (drTable["Table_Name"].ToString().EndsWith("$") || drTable["Table_Name"].ToString().EndsWith("$'"))
{
Console.WriteLine(drTable["Table_Name"]);
}
}
Is it possible to use the Open XML SDK?
https://learn.microsoft.com/en-us/office/open-xml/how-to-retrieve-the-values-of-cells-in-a-spreadsheet

How to insert data from excel to datatable which has headers?

I am reading excel having like million of records first i query my table (no records) and get Datable. i query my table to get columns name as define in my excel sheet using alias.
var dal = new clsConn();
var sqlQuery = "SELECT FETAPE_THEIR_TRANDATE \"Date\" ,ISSUER Issuer, ISSU_BRAN Branch , STAN_NUMB STAN, TERMID TermID, ACQUIRER Acquirer,DEBIT_AMOUNT Debit,CREDIT_AMOUNT Credit,CARD_NUMB \"Card Number\" , DESCRIPTION Description FROM ALLTRANSACTIONS";
var returntable = dal.ReadData(sqlQuery);
DataRow ds = returntable.NewRow();
var dtExcelData = returntable;
So my datatable looks like this,
Then i read records from excel sheet
OleDbConnection con = null;
if (ext == ".xls")
{
con = new OleDbConnection(#"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filepath + ";Extended Properties=Excel 8.0;");
}
else if (ext == ".xlsx")
{
con = new OleDbConnection(#"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0;IMEX=2;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text\"");
}
con.Open();
dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string getExcelSheetName = dt.Rows[0]["Table_Name"].ToString();
//OleDbCommand ExcelCommand = new OleDbCommand(#"SELECT * FROM [" + getExcelSheetName + #"]", con);
OleDbCommand ExcelCommand = new OleDbCommand("SELECT F1, F2, F3, F4, F5,F6,F7,F8,F9,F10 FROM [Sheet1$]", con);
OleDbDataAdapter ExcelAdapter = new OleDbDataAdapter(ExcelCommand);
try
{
ExcelAdapter.Fill(dtExcelData); //Here I give the datatable which i made previously
}
catch (Exception ex)
{
//lblAlert2.CssClass = "message-error";
//lblAlert2.Text = ex.Message;
}
It reads successfully and fill data in datatable but creating its own column in data table like F1 to F10 how can i move this data to exactly match with my defined columns in datatable
How Will i manage this to not create other columns (f1,f2..f10)
any workaround will be appreciable or Please explain what i am doing wrong and how can i achieve this.
UPDATE :
My Excel file looks like this
The Microsoft.ACE.OLEDB.12.0 driver will handle both types of excel spreadsheets and using the same Extended Properties. i.e. "Excel 12.0" will open both .xls and .xlsx.
Leave the HDR=NO as OLEDB expects them in the first row and they are actually in row 11.
Sadly "TypeGuessRows=0;ImportMixedTypes=Text" is completely ignored by Microsoft.ACE.OLEDB.12.0, you've got to play around with the registry (yuk). Change your IMEX=2 to IMEX=1 to ensure that mixed data types as handled as text.
Change back to using "Select * From [Sheet1$]" and then I'm afriad that you are going to have to handle the source data manually.
OleDbConnection con = new OleDbConnection(#"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0;IMEX=1;HDR=NO\"");
con.Open();
DataTable dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string getExcelSheetName = (string)dt.Rows[0]["Table_Name"];
DataTable xlWorksheet = new DataTable();
xlWorksheet.Load(new OleDbCommand("Select * From [" + getExcelSheetName + "]", con).ExecuteReader());
//More than 11 rows implies 11 header rows and at least 1 data row
if (xlWorksheet.Rows.Count > 11 & xlWorksheet.Columns.Count >= 10)
{
for (int nRow = 11; nRow < xlWorksheet.Rows.Count; nRow++)
{
DataRow returnRow = returntable.NewRow();
for (int nColumn = 0; nColumn < 10; nColumn++)
{
//Note you will probably get conversion problems here that you will have to handle
returnRow[nColumn] = xlWorksheet.Rows[nRow].ItemArray[nColumn];
}
returntable.Rows.Add(returnRow);
}
}
I'm guessing you simply want to add the excel data into your ALLTRANSACTION table? You don't specify but it seems the likely outcome of this. If so this is a terrible way to do it. You don't need to read the whole table into memory append data and then update the database. All you need to do is read the excel file and insert the data to the Oracle table.
Some thoughts, your returntable will contain data so if you just want the structure of the table then add a "Where RowNum=0" to the Select statement. To add the data to your Oracle Database you could 1) Convert to using the Oracle Data Provider (ODP) and then use using OracleBulkCopy Class or 2) simply modify the above to insert row by row as you read the data. As long as you don't have a LOT of data in your Excel spreadsheet it will work just fine. Having said that a Million rows is a LOT so perhaps not the best option. You will need to validate the input as Excel is not the best data source really.

c# Import excel file

I importing excel to datatable in my asp.net project.
I have below code:
string excelConString = string.Format(
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};" +
"Extended Properties='Excel 8.0;" +
"IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text;'", filepath);
using (OleDbConnection connection = new OleDbConnection(excelConString))
{
connection.Open();
string worksheet;
worksheet = "Sheet 1$";
string connStr;
connStr = string.Format("Select * FROM `{0}`", worksheet);
OleDbDataAdapter daSheet = new OleDbDataAdapter(connStr, connection);
DataSet dataset = new DataSet();
DataTable table;
table = new DataTable();
daSheet.Fill(table);
dataset.Tables.Add(table);
connStr = string.Format("Select * FROM `{0}$`", worksheet);
table = new DataTable();
daSheet.Fill(table);
dataset.Tables.Add(table);
}
When i run above code in order to import excel, last data always missing because last data has special character like below
"İ,Ö,Ş" etc.
How can i solve this problem.I added below code
"IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text;
however it is not working for me.
Any help will be appreciated.
Thank you
Just answer this question for other readers if any.
If you prefer to handle with POCO directly with Excel file, recommend to use my tool Npoi.Mapper, a convention based mapper between strong typed object and Excel data via NPOI.
Get objects from Excel (XLS or XLSX)
var mapper = new Mapper("Book1.xlsx");
var objs1 = mapper.Take<SampleClass>("sheet2");
// You can take objects from the same sheet with different type.
var objs2 = mapper.Take<AnotherClass>("sheet2");
Export objects to Excel (XLS or XLSX)
//var objects = ...
var mapper = new Mapper();
mapper.Save("test.xlsx", objects, "newSheet", overwrite: false);
Put different types of objects into memory workbook and export together.
var mapper = new Mapper("Book1.xlsx");
mapper.Put(products, "sheet1", true);
mapper.Put(orders, "sheet2", false);
mapper.Save("Book1.xlsx");

OLEDB read excel cannot process data start with special character apostrophe ( ' )

The following codes read an excel sheet and copy all the data into a C# DataTable
string strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=2\"";
conn = new OleDbConnection(strConn);
conn.Open();
string strExcel = "";
OleDbDataAdapter myCommand = null;
DataSet ds = null;
strExcel = "select * from [" + Sheet1$+ "]";
myCommand = new OleDbDataAdapter(strExcel, strConn);
ds = new DataSet();
myCommand.Fill(ds, "table1");
DataTable DT = ds.Tables[0];
When I review the data in DataTable DT, I found that the data in excel which start with apostrophe cannot be read (ex. '0010, '0026, ..., etc), i.e. become empty in DataTable DT.
Any suggested solution to solve it?
This is one of the fun things working with Excel vba.
The only way to see if there is an apostrophe is using this syntax
Dim s As String
Cells(1, 2).Value = "'abc"
s = Cells(1, 2).Formula
s = Cells(1, 2).PrefixCharacter
When you step through this vba code it will only show you the apostrophe if you add the prefixCharacter to your output.
So the only solution I know of is to loop through ALL THE CELLS.... replace the apostrophe with something silly like & #8217; (html apostrophe) and then fix it in your database..
And if you feel like you've been slapped in the face, yes VBA does that to you now and then. This is not OLEDB issue but an excel issue

All of a sudden can't open excel file with microsoft.jet.4.0 with c#. I changed nothing at all with the connection string?

So I'm working on this thing for work, that converts an excel list of instructions into a better looking, formatted word document. I've been connecting to the excel document and then storing the file into a datatable for easier access.
I had just finally gotten the borders and stuff right for my word document when i started getting an error:
External table is not in the expected format.
Here is the full connection algorithm:
public static DataTable getWorkSheet(string excelFile =
"C:\\Users\\Mitch\\Dropbox\\Work tools\\Excel for andrew\\Air Compressor PM's.xlsx") {
string connection = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excelFile
+ ";Extended Properties='Excel 8.0;HDR=YES;'";
string sql = null;
string worksheetName = null;
string[] Headers = new string[4];
DataTable schema = null;
DataTable worksheet = null;
DataSet workbook = new DataSet();
//Preparing and opening connection
OleDbConnection objconn = new OleDbConnection(connection);
objconn.Open();
//getting the schema data table
schema = objconn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
worksheetName = schema.Rows[0]["Table_Name"].ToString();
//Each worksheet will have a varying name, so the name is just called from
//the dataTable.rows array. Can be later modified to use multiple
//worksheets within a workbook.
sql = "SELECT * FROM[" + worksheetName + "]";
//data adapter
OleDbDataAdapter objAdapter = new OleDbDataAdapter();
//pass the sql
objAdapter.SelectCommand = new OleDbCommand(sql, objconn);
//populate the dataset
objAdapter.Fill(workbook);
//Remove spaces from the headers.
worksheet = workbook.Tables[0];
for (int x = 0; x < Headers.Count(); x++) {
Headers[x] = worksheet.Columns[x].ColumnName;
worksheet.Columns[x].ColumnName = worksheet.Columns[x].ColumnName.Replace(" ", "");
}
return worksheet;
}//end of getWorksheet
EDIT: i pulled up my old code from dropbox previous versions that was definetly working as well as redownload a copy of the excel doc i know was working..... what gives? has something changed in my computer?
You are connecting to a 2007/2010 Excel file (*.xlsx, *.xlsm). You need the updated 2010 drivers (Ace), which can be downloaded for free. The correct connection string can be obtained from http://connectionstrings.com/Excel and http://connectionstrings.com/Excel-2007

Categories

Resources