Reading Excel with OleDbDataReader - cannot read values from a specific column - c#

I'm working on existing C# code that reads an Excel file with OleDbDataReader, but I can't get the content of the cells in two specific columns.
This is the connection code:
connection = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source="
+ pathExcel + ";Extended Properties=\"Excel 12.0;HDR=YES;IMEX=1\";");
connection.Open();
And to access the content of the default sheet:
tables = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
sheetsNames = from DataRow row in tables.Rows select row["TABLE_NAME"].ToString();
sql = "SELECT * FROM [" + sheetsNames.FirstOrDefault() + "];";
ocmd = new OleDbCommand(sql, connection);
reader = ocmd.ExecuteReader(); //OleDbDataReader
This reads all the content, but for some columns I can't access the cell content (reader["mycolumn"]). So I tried this:
while (reader.Read()){
// Test code, I tried different ways to read cell content
// It's working
string colName = reader.GetName(26);
string val1 = reader[colName].ToString();
string val2 = reader.GetValue(26).ToString();
// Same code, changing index 26 to 27
... // always empty values. Bug ??
}
If I evaluate the expression "reader.GetValue(26)" in the debugger, it returns the expected value, but "reader.GetValue(27)" throws an exception ("This expression causes side effects and will not be evaluated"); it behaves like an index out of range exception. Yet I can read data from the following columns (29, 30...).
Do you have any idea what the cause might be?
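No answer is recorded for this question. One way to narrow the problem down is to dump every column of a row with its ordinal, name, declared type, and a DBNull check, rather than probing single indexes in the debugger. The sketch below is only a diagnostic aid, not a fix; the class and method names (ReaderDiagnostics, DescribeCurrentRow) are hypothetical. It takes an IDataRecord, which OleDbDataReader implements, so it can be called inside the while (reader.Read()) loop.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderDiagnostics
{
    // Describes every column of the current row: ordinal, name, declared type, value.
    // Accepts any IDataRecord, so an OleDbDataReader can be passed in directly.
    public static List<string> DescribeCurrentRow(IDataRecord record)
    {
        var lines = new List<string>();
        for (int i = 0; i < record.FieldCount; i++)
        {
            // Empty or type-mismatched Excel cells typically surface as DBNull.
            string value = record.IsDBNull(i)
                ? "<DBNull>"
                : Convert.ToString(record.GetValue(i));
            lines.Add(i + ": " + record.GetName(i) +
                      " (" + record.GetFieldType(i).Name + ") = " + value);
        }
        return lines;
    }
}
```

Comparing the reported FieldCount and the type of column 27 with what the sheet actually contains should show whether the column is missing, shifted, or simply returning DBNull.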

Related

Data is missing while reading excel file using OLEDB

I am using OLEDB to read an Excel file into a DataTable. The problem is that some values are missing (empty). In my Excel sheet one column has the General data type and contains mixed values, both strings and integers. Most of the cell values are integers. Why is OLEDB skipping the string values?
OleDbConnection connection = new OleDbConnection();
connection.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + "; Extended Properties=\"Excel 12.0;IMEX=1\";";
OleDbCommand myAccessCommand = new OleDbCommand();
myAccessCommand.Connection = connection; // the command needs a connection before Fill can run
myAccessCommand.CommandText = "Select * from [" + sheetName + "]";
OleDbDataAdapter myDataAdapter = new OleDbDataAdapter(myAccessCommand);
myDataAdapter.Fill(myDataSet);
Check following link and see points under "RESOLUTION":
http://support.microsoft.com/kb/194124
Please see point 2 NOTE.
Whether IMEX=1 helps depends entirely on your registry settings. By default, the first 8 rows are checked to determine the data type. IMEX=1 can behave unpredictably, for example by skipping string values. There is also a small workaround for this problem: add a single quote (') before every cell value in Excel, so that every cell is treated as a string.
Add IMEX=1 to the connection string as below:
string con = string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};" + @"Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'", fileName);
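Because the Extended Properties value itself contains semicolons, it has to be quoted inside the connection string, which is easy to get wrong when concatenating by hand. A minimal sketch of a helper that assembles the ACE connection string used in the question is shown below; the class name, method name, and parameters are hypothetical, and it only builds the string without opening anything.

```csharp
using System;

public static class ExcelConnectionStrings
{
    // Builds an ACE OLEDB connection string for an Excel file.
    // firstRowIsHeader maps to HDR=YES/NO; mixedAsText adds IMEX=1.
    public static string Build(string path, bool firstRowIsHeader, bool mixedAsText)
    {
        string extended = "Excel 12.0;HDR=" + (firstRowIsHeader ? "YES" : "NO");
        if (mixedAsText)
        {
            extended += ";IMEX=1";
        }
        // The extended properties are wrapped in double quotes because
        // they contain semicolons of their own.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
               ";Extended Properties=\"" + extended + "\";";
    }
}
```

The result can be passed straight to the OleDbConnection constructor, for example new OleDbConnection(ExcelConnectionStrings.Build(filePath, true, true)).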

Read Excel cell values with SSIS script task

I am trying to read an Excel file via an SSIS ScriptTask to check for certain cell values in that worksheet.
In the code example you can see that the strSQL is set to "H4:H4" to only read one cell. This cell can only have a true or false value.
Since I also need to check for a certain string value in B1 I wanted to extend this version.
string filePath = "c:\\test\\testBoolean.XLSX";
string tabName = "testSheet$";
string strSQL = "Select * From [" + tabName + "H4:H4]";
String strCn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source="
+ filePath + ";Extended Properties=\"Excel 12.0;HDR=NO;IMEX=1\";";
OleDbConnection cn = new OleDbConnection(strCn);
int iCnt = 0;
OleDbDataAdapter objAdapter = new OleDbDataAdapter(strSQL, cn);
DataSet ds = new DataSet();
objAdapter.Fill(ds, tabName);
DataTable dt = ds.Tables[tabName];
foreach (DataRow row in dt.Rows)
{
iCnt = iCnt + 1;
// some processing....
}
What I don't understand is why I get a boolean value with the above strSQL statement, or with any statement containing the same row number, like so:
string strSQL = "Select * From [" + tabName + "F4:H4]";
Debug-Output:
row.ItemArray[2] false object {bool}
But when I set a different range like this one:
string strSQL = "Select * From [" + tabName + "F1:H4]";
I lose the recognition of the bool value:
row.ItemArray[2] "FALSE" object {string}
I'd much rather use the bool value for other processing tasks.
How can I fix this in addition to also reading the B2 value?
Your connection string specified IMEX=1, which tells the driver to treat intermixed data types as text. (See the "Usage Considerations" section of the MSDN article Excel Connection Manager.)
Thus, when you specified a single row
string strSQL = "Select * From [" + tabName + "F4:H4]";
there was only one possible data type for the third column, and the driver was able to correctly infer it. However, when you specified multiple rows
string strSQL = "Select * From [" + tabName + "F1:H4]";
and any value in the range H1:H4 was not a bool, the driver translated all values in that column to strings.
Assuming that you do in fact have mixed data types in column H and only care about the values in two particular cells, the simplest solution is to query each cell individually. See Import a single Excel cell into SSIS for some ideas on how to do that.
I would clone most of the code to produce two separate SELECT statements to query the two different cells you are after with separate SQL statements.
Actually I would probably go further and shred the whole script into SSIS components e.g. Execute SQL Tasks or Data Flow Tasks.
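The "one query per cell" suggestion can be sketched as two small helpers: one that builds a single-cell range query in the same form as the strSQL in the question, and one that accepts the cell value whether the driver returned it as a bool or as text such as "FALSE". The names CellQueries, SingleCell, and ToBool are hypothetical, and this is only one possible approach.

```csharp
using System;

public static class CellQueries
{
    // Builds a one-cell range query such as: SELECT * FROM [testSheet$H4:H4]
    public static string SingleCell(string tabName, string cell)
    {
        return "SELECT * FROM [" + tabName + cell + ":" + cell + "]";
    }

    // Accepts a real bool or text such as "FALSE" (bool.TryParse ignores case).
    // Returns null for anything else, including DBNull.
    public static bool? ToBool(object value)
    {
        if (value is bool)
        {
            return (bool)value;
        }
        bool parsed;
        if (value is string && bool.TryParse((string)value, out parsed))
        {
            return parsed;
        }
        return null;
    }
}
```

With these, the adapter from the question could be filled once with CellQueries.SingleCell(tabName, "H4") and once with the query for the other cell, and CellQueries.ToBool(row.ItemArray[0]) would give the same result regardless of which type the driver inferred.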

How do I sum all rows of a specific header in an excel file with c#?

I have an Excel workbook with one sheet. That sheet has headers in row 1.
One of the headers is Amount.
I want to read all rows under that header and get their sum into a variable of type float, regardless of the number of rows, which is never the same.
I'm doing this with C#.
I open the workbook and get the active sheet, and then I'm stuck.
How do I go about this?
Rui Martins
You could use OleDB instead of Excel.Interop
string con = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\test.xls;" +
"Extended Properties='Excel 8.0;HDR=Yes;'";
using(OleDbConnection c = new OleDbConnection(con))
{
c.Open();
string selectString = "SELECT SUM(Amount) FROM [Sheet1$]";
using(OleDbCommand cmd1 = new OleDbCommand(selectString))
{
cmd1.Connection = c;
var result = cmd1.ExecuteScalar();
Console.WriteLine(result);
}
}
This example uses the old Microsoft.Jet.OLEDB.4.0 provider, but it works equally well with the newer Microsoft.ACE.OLEDB.12.0.
Take a look at this article, you should be able to loop through the rows and get the total by adding the cell values altogether.
MSDN article on how to retrieve excel cell values.
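The loop-based alternative can be sketched as follows, assuming the sheet has already been loaded into a DataTable (for example with an OleDbDataAdapter) and that every non-empty cell in the column is numeric. The class and method names are hypothetical. Blank cells arrive as DBNull and are skipped; a non-numeric cell would make Convert.ToDouble throw.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class ColumnTotals
{
    // Sums one column of a DataTable, skipping empty (DBNull) cells.
    public static float Sum(DataTable table, string columnName)
    {
        double total = 0;
        foreach (DataRow row in table.Rows)
        {
            if (row.IsNull(columnName))
            {
                continue; // blank cell
            }
            total += Convert.ToDouble(row[columnName], CultureInfo.InvariantCulture);
        }
        // Accumulate in double, then narrow to the float the question asks for.
        return (float)total;
    }
}
```

Calling ColumnTotals.Sum(table, "Amount") then works for any number of rows.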

OleDB Can't Retrieve Rows With Different DataType

I am trying to retrieve a DataTable from an .xls file. Below is my code:
OleDbConnection MyConnection = null;
DataSet DtSet = null;
OleDbDataAdapter MyCommand = null;
MyConnection = new OleDbConnection("provider=Microsoft.Jet.OLEDB.4.0; Data Source='" + path + "';Extended Properties=Excel 8.0;");
//path is where the .xls file located
ArrayList TblName = new ArrayList();
MyConnection.Open();
DataTable schemaTable = MyConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
foreach (DataRow row in schemaTable.Rows)
{
TblName.Add(row["TABLE_NAME"]);
}
MyCommand = new System.Data.OleDb.OleDbDataAdapter("select * from [" + TblName[0].ToString() + "] order by Material", MyConnection);
DtSet = new System.Data.DataSet();
MyCommand.Fill(DtSet);
MyCommand.FillSchema(DtSet, SchemaType.Source);
DataTable dt = new DataTable();
dt = DtSet.Tables[0];
MyConnection.Close();
Problem is: I have some inconsistent rows in my table, meaning they don't follow the other rows datatype.
Let's say in column A, I have cells that are supposed to be like:
105161610
146161701
196171717
.........
Meaning to say it's supposed to be of Int32 datatype.
These are the majority of the column cells..
I also have some other cells (still in the same column) that look like:
ABC9012
KDJ0981
KLP0001
.......
They somehow follow string datatype.
When I execute the code, I can only select the cells of int type, while the cells of the other type (string) are set to null instead, even though my code explicitly uses select *.
Can someone advise me on how to consistently retrieve both kind of datatype (instead of only 1 like what happens now)?
You have to cast or convert both types of data to SQL equivalent of string like varchar.
Try either one of the following:
1. select cast(Column_A as varchar) Column_A from TableName order by Material
2. select convert(varchar, Column_A) Column_A from TableName order by Material
Add IMEX=1 and HDR={1} to the Extended Properties of the Excel connection string.
Description: IMEX=1 forces mixed data to be converted to text.
HDR={1} indicates whether the first row contains column names rather than data; put Yes if it does and No if it does not.

Reading excel but values are coming in different format

I am building another Windows Forms application, but some strange things are happening. As a first example, I have the value 0076464688334 in my Excel sheet, and I read the values using:
MyConnection = new System.Data.OleDb.OleDbConnection("provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties=Excel 12.0;");
MyConnection.Open();
myCommand.Connection = MyConnection;
DataSet ds = new DataSet();
String qry = "SELECT number FROM [Sheet1$]";
OleDbDataAdapter odp = new OleDbDataAdapter(qry, MyConnection);
odp.Fill(ds);
Once I have all the values in the DataSet, I loop over them and do some processing. The problem is that the value mentioned above, and every other value with zeros at the front, loses its leading zeros:
0076464688334 = 76464688334
I worked around that by replacing 0 with %0, then replacing %0 with 0 in code, which solved it. Now another problem is that one value becomes:
824968717929 = 8.2496871793e+011
These are barcodes and I need an exact match, but I cannot find how to solve this. Please help :).
Thank you in advance to all.
Additional code:
for (int i = 0; i < ds.Tables[0].Rows.Count; i++)
{
if (ds.Tables[0].Rows[i][0].ToString() != "" )
{
googleList.Add(ds.Tables[0].Rows[i][0].ToString().Replace("%0", "0"));
// EbayList.Add(ds.Tables[0].Rows[i][0].ToString());
string tmp = string.Empty;
tmp = ds.Tables[0].Rows[i][0].ToString().Replace("%0","0");
You should change the type of the cells in Excel, because the values are being converted to plain numbers.
To do that, set the cell format to Text, which will then be mapped to a string by your reader.
Alternatively you can use the COM interfaces to read your spreadsheets, or try this library:
ExcelDataReader
If the files are in Office Open XML format (.xlsx files) you can use
EPPlus
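If changing the cell format is not an option and the value arrives in the DataSet as a number (for example a double), it can at least be rendered as plain digits without exponent notation. The sketch below assumes that case; the names Barcodes, ToDigits, and PadTo are hypothetical. It cannot recover digits from a value that already arrived as the text "8.2496871793e+011", and restoring leading zeros only works when the expected code length is known, which is an assumption here.

```csharp
using System;
using System.Globalization;

public static class Barcodes
{
    // Renders a numeric cell as plain digits, without exponent notation.
    // Strings are returned unchanged; DBNull must be filtered out by the caller.
    public static string ToDigits(object cell)
    {
        if (cell is string)
        {
            return (string)cell;
        }
        decimal number = Convert.ToDecimal(cell, CultureInfo.InvariantCulture);
        return number.ToString("0", CultureInfo.InvariantCulture);
    }

    // Restores leading zeros when the expected code length is known (an assumption).
    public static string PadTo(string digits, int length)
    {
        return digits.PadLeft(length, '0');
    }
}
```

For instance, Barcodes.PadTo(Barcodes.ToDigits(ds.Tables[0].Rows[i][0]), 13) would give a 13-digit string for a 13-digit code, but the length has to come from knowledge of the barcode type, not from the data itself.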
