GetOleDbSchemaTable columns slow for large worksheet - c#

I am using the ACE OLEDB connection string to connect to an excel file. I've noticed my query (see example below) that returns the column schema takes longer to run when the worksheet has more rows of data on it.
For some of my larger worksheets (200k rows) it takes around 10 seconds for the header schema to be returned. Is there a way to speed this up, or a better way to get the column headers?
string connectionString = String.Format(
    @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES;IMEX=1""", path);
OleDbConnection con = new OleDbConnection(connectionString);
con.Open();
DataTable dtSchema = new DataTable();
System.Diagnostics.Debug.WriteLine("Start: " + DateTime.Now.ToLongTimeString());
dtSchema = con.GetOleDbSchemaTable(OleDbSchemaGuid.Columns,
new Object[] { null, null, WorksheetName, null });
System.Diagnostics.Debug.WriteLine("End: " + DateTime.Now.ToLongTimeString());
con.Close();
UPDATE
I tried rewriting this: turning headers off and manually reading only the first row. It still takes around 10 seconds on my larger files (small ones still come back very quickly). Is there anything else I can try to get the header (first row) values more quickly?
string connectionString = String.Format(
    @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=NO;""", path);
DataTable dtSchema = new DataTable();
using (OleDbConnection conn = new OleDbConnection(connectionString))
{
OleDbCommand command = new OleDbCommand(String.Format("SELECT * FROM [{0}A1:II1]", WorksheetName),conn);
OleDbDataAdapter dataAdapter = new OleDbDataAdapter();
dataAdapter.SelectCommand = command;
DataSet dataSet = new DataSet();
dataAdapter.Fill(dataSet);
dtSchema = dataSet.Tables[0];
}

Related

Webform only reads headers from a group of Excel Files due to the Tab Name

I am building an app to read measurement data (saved as .xls files on our network) from comparators throughout our facility via a Webform. When I query this group of files I only retrieve the headers. My code (pretty standard) reads any other Excel file I can find. I tried moving a file off the network and saving it locally, as well as renaming it and re-saving it from Office 2016. Note - these are .xls files and I can't change that.
** - I just discovered that there are named ranges with the same name as the file. The results I am getting are just the values in the named range and not the data from the workbook.
Here is an example of how they are named (autogenerated) "R87_1RCR0009654S_COIN"
If I rename or remove the named range this works fine.
Is there a way I can change my select statement to read these?
Here is a code sample; I'm not sure whether there is a change I can make here to read these files.
private string Excel03ConString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties='Excel 8.0;HDR={1}'";
private string Excel07ConString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 8.0;HDR={1}'";
string conStr, sheetName;
conStr = string.Format(Excel03ConString, info.FullName, "YES");
string fullPathToExcel = info.FullName;
//Get the name of the First Sheet.
using (OleDbConnection con = new OleDbConnection(conStr))
{
using (OleDbCommand cmd = new OleDbCommand())
{
cmd.Connection = con;
con.Open();
DataTable dtExcelSchema = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
sheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
con.Close();
}
}
using (OleDbConnection con = new OleDbConnection(conStr))
{
using (OleDbCommand cmd = new OleDbCommand())
{
using (OleDbDataAdapter oda = new OleDbDataAdapter())
{
DataTable dt = new DataTable();
cmd.CommandText = "SELECT * From [" + sheetName + "]";
cmd.Connection = con;
con.Open();
oda.SelectCommand = cmd;
oda.Fill(dt);
con.Close();
}
}
}

Retrieve Column By Header Name

I am using OLEDB to read the data from an Excel spreadsheet.
var connectionString =
string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0}; Extended Properties=Excel 12.0;", fileName);
var adapter = new OleDbDataAdapter("SELECT * FROM [sheet1$]", connectionString);
var ds = new DataSet();
adapter.Fill(ds, "mySheet");
var data = ds.Tables["mySheet"].AsEnumerable();
foreach (var dataRow in data)
{
Console.WriteLine(dataRow[0].ToString());
}
Instead of passing an index to the DataRow to get the value of a column, is there any way to retrieve the column by the name of the column header?
Try this code:
var connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0;HDR=YES\"", fileName);
var adapter = new OleDbDataAdapter("SELECT * FROM [sheet1$]", connectionString);
var ds = new DataSet();
adapter.Fill(ds, "mySheet");
var data = ds.Tables["mySheet"].AsEnumerable();
foreach (DataRow dataRow in data)
{
Console.WriteLine(dataRow["MyColumnName"].ToString());
Console.WriteLine(dataRow.Field<string>("MyColumnName"));
}
I added in 2 ways to access the data in the row via column Name.
Hope this does the trick!!
Modify your connection string to specify that you have headers in your excel file.
You can do this by setting the HDR value.
Refer to this link for the variations of the connection string and build the one that suits your needs:
http://www.connectionstrings.com/excel/
Use a DataTable to hold your data.
// excelFilePath and sheetName are placeholders for your own file path and worksheet name.
string strConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelFilePath + ";Extended Properties=\"Excel 8.0;HDR=YES;IMEX=1\"";
OleDbConnection conn = new OleDbConnection(strConn);
conn.Open();
OleDbCommand cmd2 = new OleDbCommand("SELECT * FROM [" + sheetName + "$]", conn);
cmd2.CommandType = CommandType.Text;
DataTable outputTable2 = new DataTable("myDataTable");
new OleDbDataAdapter(cmd2).Fill(outputTable2);
conn.Close();
// A DataTable is enumerated through its Rows collection.
foreach (DataRow row in outputTable2.Rows)
{
    String s = row["yourcolumnheader"].ToString();
}
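If the header might be missing from some files, a small guard avoids the exception that row["..."] throws for an unknown column. This is only a sketch: the HeaderLookup class and GetByHeader method are made-up names for illustration, and it assumes the sheet was loaded with HDR=YES so the first row became the column names.

```csharp
using System;
using System.Data;

static class HeaderLookup
{
    // Hypothetical helper: returns the cell as text, or null when the
    // header does not exist in the table or the cell is empty (DBNull).
    public static string GetByHeader(DataRow row, string header)
    {
        if (!row.Table.Columns.Contains(header))
            return null;

        object value = row[header];
        return value == DBNull.Value ? null : value.ToString();
    }
}
```

Inside the foreach loops above it would be called as HeaderLookup.GetByHeader(row, "yourcolumnheader").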

How to read and get data from excel .xlsx

I have an Excel file with 2 tables. I need to read these tables and get all the values from them. But all I have so far is:
OleDbConnection cnn = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\MigrateExelSql\Include\TestDb.xlsx; Extended Properties=Excel 12.0;");
OleDbCommand oconn = new OleDbCommand("select * from [Sheet1$]", cnn);
cnn.Open();
OleDbDataAdapter adp = new OleDbDataAdapter(oconn);
DataTable dt = new DataTable();
adp.Fill(dt);
And I don't understand what I need to write to get all the values from the Username and Email tables. Here is the .xlsx table TestDb. Can somebody please help me? I have been googling for two days and have no idea what to do.
And when I try to get the values with this method, it returns an error:
var fileName = string.Format("{0}\\Include\\TestDb.xlsx", Directory.GetCurrentDirectory());
var connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0}; Extended Properties=Excel 12.0;", fileName);
var adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connectionString);
var ds = new DataSet();
adapter.Fill(ds, "Username");
var data = ds.Tables["Username"].AsEnumerable();
foreach (var item in data)
{
Console.WriteLine(item);
}
Console.ReadKey();
One more Edit:
string con =
    @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\MigrateExelSql\Include\TestDb.xlsx; Extended Properties=Excel 12.0;";
using(OleDbConnection connection = new OleDbConnection(con))
{
connection.Open();
OleDbCommand command = new OleDbCommand("select * from [Sheet1$]", connection);
using(OleDbDataReader dr = command.ExecuteReader())
{
while(dr.Read())
{
var row1Col0 = dr[0];
Console.WriteLine(row1Col0);
}
}
}
Console.ReadKey();
This reads only the first column. When I try to read dr[1] it throws an error: Index was outside the bounds of the array.
Your xlsx file contains only one sheet and in that sheet there is only one column.
A sheet is treated by the OleDb driver like a datatable and each column in a sheet is considered a datacolumn.
You can't read anything apart from one table (Sheet1$) and one column (dr[0]).
If you try to read dr[1] then you are referencing the second column and that column doesn't exist in Sheet1.
Just to test, try to add some values in the second column of the Excel file.
Now you can reference dr[1].
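To avoid the exception while the sheet layout is still uncertain, the reader's FieldCount can be checked before indexing. A minimal sketch, assuming a made-up helper named ReaderColumns.GetOrNull; it takes an IDataRecord, which OleDbDataReader implements, so it can be passed dr directly inside the while (dr.Read()) loop.

```csharp
using System;
using System.Data;

static class ReaderColumns
{
    // Hypothetical helper: returns column `index` of the current row, or
    // null when the result set has fewer columns than requested.
    public static object GetOrNull(IDataRecord record, int index)
    {
        return index >= 0 && index < record.FieldCount ? record[index] : null;
    }
}
```

With the one-column sheet described above, ReaderColumns.GetOrNull(dr, 1) returns null instead of throwing.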

Import data from excel to mysql using c#

I have the Excel file shown below.
I want to first read all the school names and school addresses and insert them into the SchoolInfo table of a MySQL database.
After that I want to read the data for each school and insert it into the StudentsInfo table, which has a foreign key referencing the SchoolInfo table.
I am reading the Excel sheet like this.
public static void Import(string fileName)
{
string strConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName +
";Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\"";
var output = new DataSet();
using (var conn = new OleDbConnection(strConn))
{
conn.Open();
var dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
if (dt != null)
foreach (DataRow row in dt.Rows)
{
string sheet = row["TABLE_NAME"].ToString();
var cmd = new OleDbCommand("SELECT * FROM [" + sheet + "]", conn);
cmd.CommandType = CommandType.Text;
OleDbDataAdapter xlAdapter = new OleDbDataAdapter(cmd);
xlAdapter.Fill(output,"School");
}
}
}
Now I have the data in a DataTable of the DataSet. How do I read the desired data and insert it into my MySQL table?
Try the following steps:
Reading from Excel sheet
First you must create an OleDB connection to the Excel file.
String connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + path + ";" +
"Extended Properties=Excel 8.0;";
OleDbConnection xlConn = new OleDbConnection(connectionString);
xlConn.Open();
Here path refers to the location of your Excel spreadsheet. E.g. "D:\abc.xls"
Then you have to query your table. For this we must first define names for the table and columns.
Now create the command object.
OleDbCommand selectCmd = new OleDbCommand("SELECT * FROM [Sheet1$]", xlConn);
Now we have to store the output of the select command in a DataSet using a DataAdapter
OleDbDataAdapter xlAdapter = new OleDbDataAdapter();
xlAdapter.SelectCommand = selectCmd;
DataSet xlDataset = new DataSet();
xlAdapter.Fill(xlDataset, "XLData");
Save the data into variables
Now extract cell data into variables iteratively for the whole table by using
variable = xlDataset.Tables[0].Rows[row_value][column_value].ToString() ;
Write the data from the variables into the MySQL database
Now connect to the MySQL database using an ODBC connection
String mySqlConnectionString = "driver={MySQL ODBC 5.1 Driver};" +
"server=localhost;" + "database=;" + "user=;" + "password=;";
OdbcConnection mySqlConn = new OdbcConnection(mySqlConnectionString);
mySqlConn.Open();
Construct your INSERT statement using the variables in which data has been stored from the Excel sheet.
OdbcCommand mySqlInsert = new OdbcCommand(insertCommand, mySqlConn);
mySqlInsert.ExecuteNonQuery();
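When constructing the INSERT statement, it is usually safer to use parameter markers than to concatenate cell values into the SQL text. The sketch below only builds the statement text with ODBC-style positional ? markers. The InsertSql class is a made-up name, and the column names SchoolName and SchoolAddress in the usage note are assumptions, since the question does not show the SchoolInfo columns.

```csharp
using System;
using System.Linq;

static class InsertSql
{
    // Hypothetical helper: builds "INSERT INTO t (a, b) VALUES (?, ?)"
    // with one positional marker per column.
    public static string Build(string table, params string[] columns)
    {
        if (columns == null || columns.Length == 0)
            throw new ArgumentException("At least one column is required.", nameof(columns));

        string names = string.Join(", ", columns);
        string markers = string.Join(", ", columns.Select(_ => "?"));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + markers + ")";
    }
}
```

For example, InsertSql.Build("SchoolInfo", "SchoolName", "SchoolAddress") could be passed as insertCommand above, with the values then added through mySqlInsert.Parameters.AddWithValue in the same order as the columns, because ODBC binds parameters by position.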

Reading Excel ROW using OleDb data retrieval

Can anyone help with what I am doing wrong here? If I try to read only one row, for example the one where TestCaseName is case_1, I get the data of a different row.
How can I make sure it reads only what is requested? I am using a WHERE clause, but it does not seem to filter.
string connectionString = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 8.0;HDR=YES;IMEX=1;""", EXCELFILENAME);
string testCaseName = "case_1";
string query = String.Format("SELECT * from [{0}$] WHERE TestCaseName=\"{1}\"", workbookName, testCaseName);
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(query, connectionString);
DataSet dataSet = new DataSet();
dataAdapter.Fill(dataSet);
DataTable myTable = dataSet.Tables[0];
TestCaseName   Name   Active   Status      etc...
----------------------------------------------------------------
case_1         Tom    yes      Completed   etc...
----------------------------------------------------------------
case_2         John   yes                  etc...
----------------------------------------------------------------
case_3         Jim    yes                  etc...
----------------------------------------------------------------
case_4         Don    yes                  etc...
----------------------------------------------------------------
case_5         Sam    yes      Visitor     etc...
----------------------------------------------------------------
Here's the code I tested with. It appears to work perfectly. If yours doesn't, then I can only assume that your spreadsheet is not structured quite the same as mine.
string connectionString = "Provider=Microsoft.Ace.OLEDB.12.0;Data Source=" + filename + ";Extended Properties=\"Excel 8.0;HDR=YES;IMEX=1\"";
string testCaseName = "case_1";
string query = "SELECT * from [Sheet1$] WHERE TestCaseName=\"" + testCaseName + "\"";
DataTable dt = new DataTable();
using (OleDbConnection conn = new OleDbConnection(connectionString))
{
conn.Open();
using (OleDbDataAdapter dataAdapter = new OleDbDataAdapter(query, conn))
{
DataSet ds = new DataSet();
dataAdapter.Fill(ds);
dt = ds.Tables[0];
}
conn.Close();
}
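One way to narrow the problem down is to load the sheet without the WHERE clause and apply the same condition in memory with DataTable.Select. This is a different technique from filtering in the OleDb query, shown here only as a cross-check; the RowFilter class and FilterByTestCase method are made-up names.

```csharp
using System;
using System.Data;

static class RowFilter
{
    // Hypothetical helper: returns the rows whose TestCaseName column
    // equals the given value. Single quotes are doubled so the value
    // cannot break out of the filter expression.
    public static DataRow[] FilterByTestCase(DataTable table, string testCaseName)
    {
        string escaped = testCaseName.Replace("'", "''");
        return table.Select("TestCaseName = '" + escaped + "'");
    }
}
```

If RowFilter.FilterByTestCase(dt, "case_1") returns the expected row while the OleDb query does not, the header row or the column name the provider sees is the likely difference.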
