C#: Read Excel file failed with cell format look like number - c#

I use this function to read excel file to datatable:
public static DataTable exceldata(string filePath)
{
try
{
DataTable dtexcel = new DataTable();
bool hasHeaders = false;
string HDR = hasHeaders ? "Yes" : "No";
string strConn;
if (filePath.Substring(filePath.LastIndexOf('.')).ToLower() == ".xlsx")
strConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties=\"Excel 12.0;IMEX=1;HDR=YES;TypeGuessRows=0;ImportMixedTypes=Text\"";
else
strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath + ";Extended Properties=\"Excel 8.0;IMEX=1;HDR=YES;TypeGuessRows=0;ImportMixedTypes=Text\"";
OleDbConnection conn = new OleDbConnection(strConn);
conn.Open();
DataTable schemaTable = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
DataRow schemaRow = schemaTable.Rows[0];
string sheet = schemaRow["TABLE_NAME"].ToString();
if (!sheet.EndsWith("_"))
{
string query = "SELECT * FROM [" + sheet + "]";
OleDbDataAdapter daexcel = new OleDbDataAdapter(query, conn);
dtexcel.Locale = CultureInfo.CurrentCulture;
daexcel.Fill(dtexcel);
}
conn.Close();
return dtexcel;
}
catch (Exception ex)
{
return null;
}
}
My trouble is: When Excel file have some cells that format look like number (ex: 301/1, 181/3 .... ), the datatable return empty value in that cells.
How can read Excel files to datatable that cells as string, not number detected ?

I know you've specifically asked to use an excel sheet. As someone who has faced the exact same issue, I would recommend you to use ClosedXML instead of using an Oledb connection.
Closed XML is much faster to read and allows you to get contents of an entire sheet or a cell very easily.
I've answered another question similar to this one here
https://stackoverflow.com/a/47235686/6729097

Related

C# exception: External table is not in the expected format. while using excel with oledb

If anyone can help me out it will be very grateful. I am trying to read an excel (.xlsx, excel-2007) which have different sheets (Headers are not fixed). The below code works for me in most of the cases, but throws exception in some of the cases as entitled.
public static bool ReadExcelData(string ExcelFilePath, string SheetName, out DataTable dt)
{
dt = new DataTable();
bool isXlsx = ExcelFilePath.Substring(ExcelFilePath.LastIndexOf('.') + 1).ToLower() == "xlsx";
string excelConnectString = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" + ExcelFilePath + ";Extended Properties=\"Excel 8.0;HDR=yes;\"";
if (isXlsx)
excelConnectString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + ExcelFilePath + ";Extended Properties=\"Excel 12.0\";";
OleDbConnection objConn = null;
try
{
objConn = new OleDbConnection(excelConnectString);
if (objConn.State == ConnectionState.Closed)
{
objConn.Open();
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
}
}
catch (Exception ex)
{
dt = null;
return false;
}
try
{
dt.Clear();
string query = "select * from ["+SheetName+"$] ";
OleDbCommand objCmd = new OleDbCommand(query, objConn);
OleDbDataAdapter objDatAdap = new OleDbDataAdapter();
objDatAdap.SelectCommand = objCmd;
objDatAdap.Fill(dt);
Boolean result = (dt.Rows.Count >= 1) ? true : false;
objConn.Close();
return true;
}
catch (Exception ex)
{
dt = null;
return false;
}
}
If, in case of exception, I open this excel (on which it is giving error) manually (double clicking the excel) before going in to the code, it will not generate any exception, rather reads that excel smoothly.
What can be better or alternative way so that it may work for all the cases?
Issue is in your excel sheet, not in your code, please just saveas your excel sheet in .xls or .xlsx format again and then use the same code. It will work.

Can only read Excel file when it is actually open in Ms Excel

I am using the following code to open an excel file (XLS) and populate a DataTable with the first worksheet:
var connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0; data source={0}; Extended Properties=Excel 8.0;", filename);
OleDbConnection connExcel = new OleDbConnection(connectionString);
connExcel.Open();
DataTable dtExcelSchema;
dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string SheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
connExcel.Close();
var adapter = new OleDbDataAdapter("SELECT * FROM [" + SheetName + "]", connectionString);
var ds = new DataSet();
int count = 0;
adapter.Fill(ds, SheetName);
DataTable dt = ds.Tables[0];
It works only when the file is already open in Ms Excel. Why could that be?
If the file is not open, I get an error message (on line connExcel.Open): External table is not in the expected format.
I'm facing the same problem and accordingly to this site, many developers are struggling for the same:
-When I try read Excel with OLE DB all values are empty
-Can't connect to excel file unless file is already open
Actually I'm using the classic connection string (note that I'm trying to read a 97/2003 file):
Provider=Microsoft.Jet.OLEDB.4.0; Data Source = " + GetFilename(filename) + "; Extended Properties ='Excel 8.0;HDR=NO;IMEX=1'
but the file can be read properly only if:
Is open in Excel or even in Word! (the file of course appears corrupted and unreadable, but then the OleDb procedure can read every line of the file), I didn't try with other Office apps
The file is not in read-only mode
I also tried to lock the file manually or to open it with other non-office applications, but the result is not the same. If I follow the two previous rules (file opened in Word or Excel in not read-only mode) I can see all the cells, otherwise it seems the first column is ignored completely (so F2 became F1, F3 became F2,... and F6, the last one, should became F5 otherwise it throws and out-of-index error).
In order to keep compatibility with OleDb without using 3rd parties libraries I found a very stupid workaround using Microsoft.Office.Interop.Excel assembly.
Excel.Application _app = new Excel.Application();
var workbooks = _app.Workbooks;
workbooks.Open(_filename);
// OleDb Connection
using (OleDbConnection conn = new OleDbConnection(connectionOleDb))
{
try
{
conn.Open();
OleDbCommand cmd = new OleDbCommand();
cmd.Connection = conn;
cmd.CommandText = String.Format("SELECT * FROM [{0}$]", tableName);
OleDbDataReader myReader = cmd.ExecuteReader();
int i = 0;
while (myReader.Read())
{
//Here I read through all Excel rows
}
}
catch (Exception E)
{
MessageBox.Show("Error!\n" + E.Message);
}
finally
{
conn.Close();
workbooks.Close();
if (workbooks != null)
System.Runtime.InteropServices.Marshal.ReleaseComObject(workbooks);
_app.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(_app);
}
}
Essentially the first 3 lines run an Excel instance that lasts exactly the time needed to OleDb to perform its tasks.
The last 4 lines, inside the finally block, let the Excel instance to be closed correctly, immediately after the task and avoid ghost Excel processes.
I repeat it's a very stupid workaround that also requires a 1,5 MB dll (Microsoft.Office.Interop.Excel.dll) to be added to the project.
Anyway seems impossible that OleDb cannot manage by itself the missing data...
I had the same problem. If the file was open the read was ok but if the file was closed... some thing was strange... in my case I received strange data from columns and values.. Debugging I found the name of the first sheet and was strange ["xls _xlnm#_FilterDatabase"] looking on the internet I found that's a name of hidden sheet and a trick to avoid read this sheet (HERE) and so I've implemented a method:
private string getFirstVisibileSheet(DataTable dtSheet, int index = 0)
{
string sheetName = String.Empty;
if (dtSheet.Rows.Count >= (index + 1))
{
sheetName = dtSheet.Rows[index]["TABLE_NAME"].ToString();
if (sheetName.Contains("FilterDatabase"))
{
return getFirstVisibileSheet(dtSheet, ++index);
}
}
return sheetName;
}
To me worked very well.
My complete example code is:
string excelFilePath = String.Empty;
string stringConnection = String.Empty;
using (OpenFileDialog openExcelDialog = new OpenFileDialog())
{
openExcelDialog.Filter = "Excel 2007 (*.xlsx)|*.xlsx|Excel 2003 (*.xls)|*.xls";
openExcelDialog.FilterIndex = 1;
openExcelDialog.RestoreDirectory = true;
DialogResult windowsResult = openExcelDialog.ShowDialog();
if (windowsResult != System.Windows.Forms.DialogResult.OK)
{
return;
}
excelFilePath = openExcelDialog.FileName;
using (DataTable dt = new DataTable())
{
try
{
if (!excelFilePath.Equals(String.Empty))
{
stringConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelFilePath + ";Extended Properties='Excel 8.0; HDR=YES;';";
using (OleDbConnection conn = new OleDbConnection(stringConnection))
{
conn.Open();
OleDbCommand cmd = new OleDbCommand();
cmd.Connection = conn;
DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetName = getFirstVisibileSheet(dtSheet);
cmd.CommandText = "SELECT * FROM [" + sheetName + "]";
dt.TableName = sheetName;
OleDbDataAdapter da = new OleDbDataAdapter(cmd);
da.Fill(dt);
cmd = null;
conn.Close();
}
}
//Read and Use my DT
foreach (DataRow row in dt.Rows)
{
//On my case I need data on first and second Columns
if ((row.ItemArray.Count() < 2) ||
(row[0] == null || String.IsNullOrWhiteSpace(row[0].ToString()))
||
(row[1] == null ||String.IsNullOrWhiteSpace(row[1].ToString())))
{
continue;
}
//Get the number from the first COL
int colOneNumber = 0;
Int32.TryParse(row[0].ToString(), out colOneNumber);
//Get the string from the second COL
string colTwoString = row[1].ToString();
//Get the string from third COL if is a file path valid
string colThree = (row.ItemArray.Count() >= 3
&& !row.IsNull(2)
&& !String.IsNullOrWhiteSpace(row[2].ToString())
&& File.Exists(row[2].ToString())
) ? row[2].ToString() : String.Empty;
}
}
catch (Exception ex)
{
MessageBox.Show("Import error.\n" + ex.Message, "::ERROR::", MessageBoxButtons.OK, MessageBoxIcon.Error);
}
}
}
private string getFirstVisibileSheet(DataTable dtSheet, int index = 0)
{
string sheetName = String.Empty;
if (dtSheet.Rows.Count >= (index + 1))
{
sheetName = dtSheet.Rows[index]["TABLE_NAME"].ToString();
if (sheetName.Contains("FilterDatabase"))
{
return getFirstVisibileSheet(dtSheet, ++index);
}
}
return sheetName;
}
Is it failing on ToString(), like here?
Error is "Object reference not set to an instance of an object"
Does Convert.ToString() fix anything?

While reading Excel sheet if the value of column is numberic then it returns null in datatable

I am reading values from Excel sheet. Columns generally contain String but sometimes it may contain numeric values. While reading Excel sheet into datatable numeric value is read as blank.
It reads 90004 as null, But if I sort this column by numeric the it reads numeric value and gives string value as null and if I sort this column by string then it reads string value and gives numeric as null.
AC62614 abc EA MISC
AC62615 pqr EA MISC
AC62616 xyz EA MISC
AC62617 test EA 90004
AC62618 test3 TO MISC
AC62619 test3 TO STEEL
my code:
public static DataTable ReadExcelFile(FileUpload File1, string strSheetName)
{
string strExtensionName = "";
string strFileName = System.IO.Path.GetFileName(File1.PostedFile.FileName);
DataTable dtt = new DataTable();
if (!string.IsNullOrEmpty(strFileName))
{
//get the extension name, check if it's a spreadsheet
strExtensionName = strFileName.Substring(strFileName.IndexOf(".") + 1);
if (strExtensionName.Equals("xls") || strExtensionName.Equals("xlsx"))
{
/*Import data*/
int FileLength = File1.PostedFile.ContentLength;
if (File1.PostedFile != null && File1.HasFile)
{
//upload the file to server
//string strServerPath = "~/FolderName";
FileInfo file = new FileInfo(File1.PostedFile.FileName);
string strServerFileName = file.Name;
string strFullPath = HttpContext.Current.Server.MapPath("UploadedExcel/" + strServerFileName);
File1.PostedFile.SaveAs(strFullPath);
//open connection out to read excel
string strConnectionString = string.Empty;
if (strExtensionName == "xls")
strConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
+ strFullPath
+ ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=2\"";
else if (strExtensionName == "xlsx")
strConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source="
+ strFullPath
+ ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=2\"";
if (!string.IsNullOrEmpty(strConnectionString))
{
OleDbConnection objConnection = new OleDbConnection(strConnectionString);
objConnection.Open();
DataTable oleDbSchemaTable = objConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
bool blExists = false;
foreach (DataRow dtr in oleDbSchemaTable.Rows)
{
//reads from the spreadsheet called 'Sheet1'
if (dtr["TABLE_NAME"].ToString() == "" + strSheetName + "$")
{
blExists = true;
break;
}
}
if (blExists)
{
OleDbCommand objCmd = new OleDbCommand(string.Format("Select * from [{0}$]", strSheetName), objConnection);
OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
objAdapter1.SelectCommand = objCmd;
DataSet objDataSet = new DataSet();
objAdapter1.Fill(objDataSet);
objConnection.Close();
dtt = objDataSet.Tables[0];
}
}
}
}
}
return dtt;
}
If you change IMEX=2 in you connectionstring to IMEX = 1 the columns will be interpreted as text. Then you can get all data of the Sheet and check with Int32.TryParse() if the value is numeric or not.
Connecting string:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Path + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\"
IMEX=1 For Read All value as Text from Excel file
HDR =Yes For Excel First row as column read
IMEX=1 not use Than Numeric value will be NULL read

Search a row from a table which starts with asterisk symbol

I am uploading an excel file by using c# and i am selecting a column named LOT from that excel file.In column LOT one row is having number starting with asterisk symbol(*). I need a query to select all the rows. please Help me out in this.
I tried the below sample code but it is not working fine. Its returning the lots which doesn't have asterisk symbol(*).
protected void btnUpload_Click(object sender, EventArgs e)
{
//Get path from web.config file to upload
string FilePath = ConfigurationManager.AppSettings["FilePath"].ToString();
string filename = string.Empty;
//To check whether file is selected or not to uplaod
if (BrowseFile.HasFile)
{
try
{
string[] allowdFile = { ".xls", ".xlsx" };
//Here we are allowing only excel file so verifying selected file pdf or not
string FileExt = System.IO.Path.GetExtension(BrowseFile.PostedFile.FileName);
//Check whether selected file is valid extension or not
bool isValidFile = allowdFile.Contains(FileExt);
if (!isValidFile)
{
lblMsg.ForeColor = System.Drawing.Color.Red;
lblMsg.Text = "Please upload only Excel";
}
else
{
// Get size of uploaded file, here restricting size of file
int FileSize = BrowseFile.PostedFile.ContentLength;
if (FileSize <= 1048576)//1048576 byte = 1MB
{
//Get file name of selected file
filename = Path.GetFileName(Server.MapPath(BrowseFile.FileName));
//Save selected file into server location
BrowseFile.SaveAs(Server.MapPath(FilePath) + filename);
//Get file path
string filePath = Server.MapPath(FilePath) + filename;
//Open the connection with excel file based on excel version
OleDbConnection con = null;
if (FileExt == ".xls")
{
con = new OleDbConnection(#"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath + ";Extended Properties=Excel 8.0;");
}
else if (FileExt == ".xlsx")
{
con = new OleDbConnection(#"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties=Excel 12.0;");
}
con.Open();
//Get the list of sheet available in excel sheet
DataTable dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
//Get first sheet name
string getExcelSheetName = dt.Rows[0]["Table_Name"].ToString();
//Select rows from first sheet in excel sheet and fill into dataset
//OleDbCommand ExcelCommand = new OleDbCommand(#"SELECT LOT FROM [" + getExcelSheetName + #"] WHERE LOT IS NOT NULL", con);
OleDbCommand ExcelCommand = new OleDbCommand(#"SELECT LOT FROM [" + getExcelSheetName + #"] WHERE LOT LIKE '**'", con);
OleDbDataAdapter ExcelAdapter = new OleDbDataAdapter(ExcelCommand);
DataTable ExcelDataSet = new DataTable();
ExcelAdapter.Fill(ExcelDataSet);
//Holding the data into the list
List<DataRow> strLot = ExcelDataSet.AsEnumerable().ToList();
con.Close();
//string lots = "";
//seperating the each lot with a comma separator
for (int i = 0; i < strLot.Count; i++)
{
totLots += ExcelDataSet.Rows[i].ItemArray.GetValue(0).ToString() + ",";
}
//Removing the last comma separator
totLots = totLots.Remove(totLots.Length - 1);
You can use the LIKE operator
WHERE <Column> LIKE '\**'
Also, depending on your data source, the wildcard '*' could be replaced with '%' and 'LIKE' keyword could be replaced with 'ALIKE'
with '%' the search text will become '*%' instead of '\**'
Edit: Code below checked with .xls and .xlsx files
OleDbCommand ExcelCommand = new OleDbCommand(#"SELECT LOT FROM [" + getExcelSheetName + #"] WHERE LOT LIKE '*%'", con);
Change your command text to the following...
#"SELECT [LOT] FROM [" + getExcelSheetName + #"] WHERE [LOT] LIKE '\*%')"
This should preform a simple pattern match like a SQL script and look for any member of the column starting with an asterisk and containing anything at all after that.

Import from Excel in C#

I use following code to save data from excel file into table
private void RetrieveAndStoreExcelData(String filePath)
{
String excelConnectionStr = "Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source='" + filePath + "'; Extended Properties='Excel 8.0;'";
OleDbConnection excelConnection = new OleDbConnection(excelConnectionStr);
excelConnection.Open();
try
{
//Get the name of the first worksheet
DataTable schema = excelConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (schema == null || schema.Rows.Count < 1)
{
throw new Exception("Error: Could not determine the name of the first worksheet.");
}
string firstSheetName = schema.Rows[0]["TABLE_NAME"].ToString();
//Retrieve data from worksheet into reader
string query = "SELECT * FROM [" + firstSheetName + "]";
OleDbCommand command = new OleDbCommand(query, excelConnection);
OleDbDataReader dbReader = command.ExecuteReader(); //IEnumerable
//populate IEnumerable
if (dbReader.HasRows)
{
populateRecords(dbReader);
}
}
finally
{
excelConnection.Close();
}
}
This works fine. But if one of the fields length is greater than 255 characters then it truncates the string to 255 and that is also when that row appears after 10th row in excel sheet.
So, if first 10 rows is having length less than 255, it assumes that all rows will have length less than 255 characters.
So is there any way to solve this?

Categories

Resources