I am trying to import an Excel file into my DataTable, with a condition.
Example: the user has the Excel import template I provided, where the first row (row 0) holds the column headers (ItemCode, QTY, SerialNo, Remarks). The user might accidentally insert a few unwanted columns anywhere among my predefined columns, or delete one of them.
I am trying to write code so that, whatever happens, the system only picks up my standard column headers (ItemCode, QTY, SerialNo, Remarks). It should import only those standard columns that are still present in the Excel file and ignore any that were accidentally deleted.
What is the best way to check that a column name exists before importing that specific column into the DataTable?
Below is my current Excel import code (to which I am trying to add the condition above):
private DataTable ReadExcelToDataTable(string filePath)
{
    tableSalesOrder = new DataTable("dtSO");
    string strConn = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text\"", filePath);
    using (OleDbConnection dbConnection = new OleDbConnection(strConn))
    {
        dbConnection.Open();
        DataTable dtExcelSchema = dbConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        string sSheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
        dbConnection.Close();
        using (OleDbDataAdapter dbAdapter = new OleDbDataAdapter("SELECT DISTINCT * FROM [" + sSheetName + "]", dbConnection)) //rename sheet if required!
            dbAdapter.Fill(tableSalesOrder);
    }
    return tableSalesOrder;
}
I have tried googling around and found many hints, but I am still unable to make it work.
Thank you for any advice.
If you just wanted to ignore extra columns, you could use
... new OleDbDataAdapter("Select Distinct ItemCode, QTY, SerialNo, Remarks FROM [" + sSheetName + "] ...
If you need to cope with some of these columns being missing, it is slightly more complicated. You need to get a list of the columns in the Excel sheet, e.g.:
DataTable dt = dbConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Columns,
    new object[] { null, null, sSheetName, null });
List<string> columnNames = new List<string>();
foreach (DataRow row in dt.Rows)
    columnNames.Add(row["Column_name"].ToString());
Then you need to build up your select statement to only include the columns that you want and that exist in the excel sheet.
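A minimal sketch of that last step, assuming you already have the list of column names found in the sheet. The class and method names (ExcelSelectBuilder, BuildSelect) are made up for illustration, and the case-insensitive comparison is my assumption about how forgiving you want the header match to be:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExcelSelectBuilder
{
    // Hypothetical helper: returns a SELECT naming only the wanted columns
    // that are actually present in the sheet, or null if none is present.
    public static string BuildSelect(string sheetName,
                                     IEnumerable<string> wantedColumns,
                                     IEnumerable<string> sheetColumns)
    {
        // Assumption: header names are compared ignoring case.
        var present = new HashSet<string>(sheetColumns, StringComparer.OrdinalIgnoreCase);

        // Keep the wanted columns in their original order, bracketed for the query.
        List<string> selected = wantedColumns
            .Where(c => present.Contains(c))
            .Select(c => "[" + c + "]")
            .ToList();

        if (selected.Count == 0)
            return null; // nothing usable in this sheet

        return "SELECT DISTINCT " + string.Join(", ", selected.ToArray())
             + " FROM [" + sheetName + "]";
    }
}
```

You would pass sSheetName, your four standard names and the columnNames list from above, then hand the resulting string to the OleDbDataAdapter. A null result means the sheet has none of the expected columns, so you can reject the file instead of querying it.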
You could set up your workbooks with named ranges, and extract those. That way it'll work even if they accidentally change the name or insert extra columns. You can select the named range with something like this:
var sql = "SELECT * FROM [Workbook$MyNamedRange]";
using (OleDbDataAdapter dbAdapter = new OleDbDataAdapter(sql, dbConnection))
    dbAdapter.Fill(tableSalesOrder);
I solved the issue with a different method, using ideas from both of your suggestions. In fact, after testing the original code I posted, I found that it already imports only the columns I defined (ItemCode, Qty, SerialNo and Remarks) in the GridView to which my DataTable is assigned as the data source.
The only issue is that when one of the columns is deleted, the import process runs into problems. This is because the DataTable was never given any column names after it was created.
I solved it by improving the DataTable setup and defining the column names in it.
if (tableSalesOrder == null)
{
    tableSalesOrder = new DataTable("dtSO");
    DataColumn colItemCode = new DataColumn("ItemCode", typeof(string));
    ......
    tableSalesOrder.Columns.Add(colItemCode);
    ......
}
else
{
    tableSalesOrder.Clear();
}
Thanks guys for the help. Finally I found where the bugs were.
Related
I have code like this for reading an Excel file:
string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;" +
    "Data Source=" + path + ";Extended Properties=\"Excel 12.0;HDR=YES\";";
using (OleDbConnection conn = new OleDbConnection(connStr))
{
    conn.Open();
    DataTable dtSchema = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
    string sheetName = dtSchema.Rows[0].Field<string>("TABLE_NAME");
    OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [" + sheetName + "]", conn);
    sheetAdapter.Fill(sheetData);
    DataTable dtColumns = conn.GetSchema("Columns", new string[] { null, null, sheetName, null });
    ...
}
My code needs to use/look at the column headers. The above only works if the column headers are the first row. Sometimes the Excel files that we receive from clients have a couple rows above the column headers with some metadata about the data in the excel. When this happens the column headers will be on something like row 10.
I can open the Excel file and manually delete the extra rows above the column headers and this solves the issue. But we want to remove this manual step.
Is there any easy way to remove/ignore these extra starting rows above the column headers? Or do I have to come up with custom code? The best way I can think of is to turn off HDR and then the first row that has a value in every column is the column header row. Is there an easier way?
I have code that reads from Excel, needs to ignore the first 11 rows in the worksheet, and read from columns A through P for up to 64000 rows.
// Read columns A - P after skipping 11 rows to read the header row
string ExcelDataQuery = string.Concat("SELECT * FROM [", sheetname, "A12:P64012]");
As far as I know (I checked this issue in the past), there is no way to select a table from an Excel file with System.Data.OleDb using a SQL query if the headers are not in row 1. The solution for me (like yours) is to delete all the rows above the header row before querying the worksheet: open the workbook with Microsoft.Office.Interop, delete the extra rows, close it and then query it.
Excel is a very powerful tool but was never designed to behave like a database (a SQL or Access file, for example).
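For what it's worth, the idea from the question (turn HDR off and treat the first row with a value in every column as the header row) can be sketched roughly as below. It assumes the sheet was loaded with HDR=NO and IMEX=1 so every cell arrives as text, and that the header names are unique and don't clash with the provider's default F1, F2, ... names. HeaderRowFinder and its two methods are hypothetical names. It is only a heuristic: a fully filled metadata row above the headers would fool it.

```csharp
using System;
using System.Data;

public static class HeaderRowFinder
{
    // Returns the index of the first row in which every cell has a value, or -1.
    public static int FindHeaderRow(DataTable raw)
    {
        for (int r = 0; r < raw.Rows.Count; r++)
        {
            bool allFilled = true;
            foreach (object cell in raw.Rows[r].ItemArray)
            {
                if (cell == null || cell is DBNull || cell.ToString().Trim().Length == 0)
                {
                    allFilled = false;
                    break;
                }
            }
            if (allFilled)
                return r;
        }
        return -1;
    }

    // Renames the columns from the header row, then removes that row
    // and every row above it so only the data rows remain.
    public static void PromoteHeaderRow(DataTable raw, int headerRow)
    {
        for (int c = 0; c < raw.Columns.Count; c++)
            raw.Columns[c].ColumnName = raw.Rows[headerRow][c].ToString().Trim();

        for (int r = headerRow; r >= 0; r--)
            raw.Rows.RemoveAt(r);
    }
}
```

After filling a DataTable from the sheet, call FindHeaderRow, check for -1, then PromoteHeaderRow. This avoids the Interop step at the cost of loading the metadata rows too.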
I've been working on a program that imports two Excel files with different column names, so a user could import the wrong Excel file (with the other column names). I read the data from Excel with an OleDbDataAdapter, so I have to specify the name of each column; when the user imports the wrong file, the program stops working because it can't find the columns it needs.
So my question is: is there a way to check whether a column exists in a specific Excel sheet?
That way I could do something if the column doesn't exist in the file the user imported...
Here's part of my code:
OleDbCommand command1 = new OleDbCommand(
    @"SELECT DISTINCT serie FROM [Sheet1$]
    WHERE serie =@MercEnInventario AND serie IS NOT NULL", connection);
command1.Parameters.Add(new OleDbParameter("MercEnInventario", MercInv));
string serieN = Convert.ToString(command1.ExecuteScalar());
readerOle = command1.ExecuteReader();
readerOle.Close();
I get an OleDbException when I try to assign a value to the string 'serieN', because the column 'serie' doesn't exist in the Excel file the user imported.
If you can help me I'll be so grateful :)
OleDbConnection has a GetOleDbSchemaTable method that lets you retrieve just the list of columns. Example code would be:
DataTable myColumns = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, new object[] { null, null, "Sheet1$", null });
This returns a DataTable populated with column information (names, types and more). You can then loop through the Rows collection and examine the "COLUMN_NAME" column, something like:
foreach (DataRow dtRow in myColumns.Rows)
{
    // "serie" is the column name used in the query from the question
    if (dtRow["COLUMN_NAME"].ToString() == "serie") // Do your stuff here ....
}
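If you need to check several columns at once, the same loop can be folded into a small helper that reports what is missing. This is only a sketch: SheetColumnCheck and MissingColumns are names I made up, and comparing names without regard to case is my assumption. The only thing it relies on from the schema table is the COLUMN_NAME column mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class SheetColumnCheck
{
    // schemaColumns: the DataTable returned by
    // GetOleDbSchemaTable(OleDbSchemaGuid.Columns, ...).
    // Returns the required names that the sheet does not contain.
    public static List<string> MissingColumns(DataTable schemaColumns,
                                              IEnumerable<string> required)
    {
        // Assumption: column names are compared ignoring case.
        var present = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (DataRow row in schemaColumns.Rows)
            present.Add(row["COLUMN_NAME"].ToString());

        var missing = new List<string>();
        foreach (string name in required)
        {
            if (!present.Contains(name))
                missing.Add(name);
        }
        return missing;
    }
}
```

If the returned list is not empty, you can tell the user which columns are missing and skip the query, instead of catching the OleDbException afterwards.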
How about this:
public bool FieldExists(OleDbConnection connection, string tabName, string fieldName)
{
    using (var adapter = new OleDbDataAdapter(string.Format("SELECT * FROM [{0}]", tabName), connection))
    {
        var ds = new DataSet();
        adapter.Fill(ds, tabName);
        // Assumes HDR=YES (the default): the header row becomes the column
        // names rather than a data row, so look in Columns, not in Rows[0].
        return ds.Tables[tabName].Columns.Contains(fieldName);
    }
}
With .Net's OleDb I try to import an Excel table in which the first row(s) can be empty. I want to keep the empty row in the DataTable to be able to map the cells to Excel-style cell names "A1, A2, ..." later. But the first line is removed, no matter what I do.
Excel file looks like:
- - -
ABC XY ZZ
1 2 3
4 4 5
Where "-" is an empty cell. (I have no influence over the import format.)
Simplified code:
string cnnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"file.xls\";Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\"";
string mySheet = "Sheet1$";
OleDbConnection connection = new OleDbConnection(cnnStr);
DataSet Contents = new DataSet();
using (OleDbDataAdapter adapter = new OleDbDataAdapter("select * from [" + mySheet + "]", connection))
{
    adapter.Fill(Contents);
}
Console.WriteLine(Contents.Tables[0].Rows.Count); // prints: 3
Console.WriteLine(Contents.Tables[0].Rows[0].ItemArray[0]); // prints: ABC
Any idea how to preserve that empty row?
ps: I found How to count empty rows when reading from Excel but couldn't reproduce it.
The issue seems to be related to the TypeGuessRows feature of the OLEDB provider. In a nutshell, data in an Excel column can be of any type. The OLEDB provider guesses the data type by scanning the first 8 rows of the sheet to determine the Majority Type: the data type with the greatest number of values in the sample. Anything that is not of the Majority Type is discarded.
See this blog post for a more detailed explanation.
As well as this MS KB Article that discusses the behavior.
(Skip down to the Workaround section for the TypeGuessRows behavior)
As a test, I created a file similar to the sample you posted but formatted all of the columns as text and saved the file. Running the code you posted I was able to see 4 Rows returned, with the first Row an empty string.
You may also want to try modifying the registry to see if changing the TypeGuessRows setting to 0 (scan all data in the file to determine data type of each column) helps return the first blank row. My hunch is that this won't help though.
OleDbDataAdapter considers the first row as header.
In order to get the first row, create a datarow from the header of the datatable.
And insert at the first location.
DataTable dt = Contents.Tables[0];
// DataRow has no public constructor; create the row from the table instead.
DataRow dr = dt.NewRow();
int i = 0;
foreach (DataColumn column in dt.Columns)
{
    dr[i] = column.ColumnName;
    i++;
}
dt.Rows.InsertAt(dr, 0);
I am using C#.NET 3.5 in VS2010 on Win 7 machine. I have an Excel sheet and I want to extract the data stored in it. I know how to parse the Excel when given the range of cells. Here we connect using OLEDB, give the SQL command and map this to a DataTable, then access the data.
select * from [sheetname$A2:N50]
where "sheetname" is the name of Excel sheet and "$A2:N50" is the range of cells.
BUT, BUT, my requirement is totally different.
I can't hardcode the cell range like above, because the location of data cells may change dynamically. For example: data stored in cell A20 may be changed to C14 in the very next execution.
I need to parse my Excel sheet based on a keyword search. I mean I should search for the keyword "XYZ" and then parse the table just below it. This keyword may change its position on every execution.
Since I don't know the cell range, I can't even get the Excel data into a DataTable using the above query.
Instead of selecting from a cell range you can fill all the data into a datatable and query from that.
DataTable dt = new DataTable();
try
{
    OleDbConnection con = new OleDbConnection(string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=""Excel 8.0;HDR=yes;IMEX=1""", excelPath));
    OleDbDataAdapter da = new OleDbDataAdapter("select * from [sheetname$]", con);
    da.Fill(dt);
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
    return;
}
//now you can use dt DataTable
foreach (DataRow dr in dt.Rows)
{
    //....
}
hope this helps...
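To make the "query from that" part a bit more concrete, here is a rough sketch of locating the keyword once the whole sheet is in a DataTable. KeywordLocator and FindCell are hypothetical names, and matching the trimmed cell text case-insensitively is my assumption. Note that with HDR=yes the first sheet row becomes the column names and is not searched, so HDR=no is probably safer if the keyword can sit in row 1.

```csharp
using System;
using System.Data;

public static class KeywordLocator
{
    // Returns { rowIndex, columnIndex } of the first cell whose trimmed text
    // equals the keyword (ignoring case), or null if no cell matches.
    public static int[] FindCell(DataTable table, string keyword)
    {
        for (int r = 0; r < table.Rows.Count; r++)
        {
            for (int c = 0; c < table.Columns.Count; c++)
            {
                // DBNull.ToString() yields an empty string, so empty cells never match.
                string text = table.Rows[r][c].ToString().Trim();
                if (string.Equals(text, keyword, StringComparison.OrdinalIgnoreCase))
                    return new[] { r, c };
            }
        }
        return null;
    }
}
```

With the position in hand, the table "just below" the keyword starts at row index + 1 of the DataTable, so no cell range has to be hardcoded.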
I need help generating column names from Excel automatically. I think we could produce code like the following:
CREATE TABLE [dbo].[Addresses_Temp] (
[FirstName] VARCHAR(20),
[LastName] VARCHAR(20),
[Address] VARCHAR(50),
[City] VARCHAR(30),
[State] VARCHAR(2),
[ZIP] VARCHAR(10)
)
via C#. How can I get the column names from Excel?
private void Form1_Load(object sender, EventArgs e)
{
    ExcelToSql();
}
void ExcelToSql()
{
    string connectionString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Source\MPD.xlsm;Extended Properties=""Excel 12.0;HDR=YES;""";
    // if you don't want to show the header row (first row)
    // use 'HDR=NO' in the string
    string strSQL = "SELECT * FROM [Sheet1$]";
    OleDbConnection excelConnection = new OleDbConnection(connectionString);
    excelConnection.Open(); // This code will open excel file.
    OleDbCommand dbCommand = new OleDbCommand(strSQL, excelConnection);
    OleDbDataAdapter dataAdapter = new OleDbDataAdapter(dbCommand);
    // create data table
    DataTable dTable = new DataTable();
    dataAdapter.Fill(dTable);
    // bind the datasource
    // dataBingingSrc.DataSource = dTable;
    // assign the dataBindingSrc to the DataGridView
    // dgvExcelList.DataSource = dataBingingSrc;
    // dispose used objects
    if (dTable.Rows.Count > 0)
        MessageBox.Show("Count:" + dTable.Rows.Count.ToString());
    dTable.Dispose();
    dataAdapter.Dispose();
    dbCommand.Dispose();
    excelConnection.Close();
    excelConnection.Dispose();
}
You should be able to iterate over the DataTable's columns collection to get the column names.
System.Data.DataTable dt = new System.Data.DataTable();
// ... fill dt from the Excel sheet here, e.g. dataAdapter.Fill(dt) ...
foreach (System.Data.DataColumn col in dt.Columns)
{
    System.Diagnostics.Debug.WriteLine(col.ColumnName);
}
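Building on that loop, a CREATE TABLE statement like the one in the question could be generated along these lines. Treat it as a sketch: CreateTableScript and Build are made-up names, every column is simply declared as VARCHAR of one fixed length (an assumption, since Excel gives no reliable lengths), and spaces are stripped from header names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class CreateTableScript
{
    // Builds a CREATE TABLE statement with one VARCHAR column per DataTable column.
    // tableName is used as given, e.g. "[dbo].[Addresses_Temp]".
    public static string Build(string tableName, DataTable source, int varcharLength)
    {
        var lines = new List<string>();
        foreach (DataColumn col in source.Columns)
        {
            // Strip spaces from multi-word headers and escape any closing
            // bracket so the name stays a valid bracketed identifier.
            string name = col.ColumnName.Replace(" ", "").Replace("]", "]]");
            lines.Add("    [" + name + "] VARCHAR(" + varcharLength + ")");
        }

        return "CREATE TABLE " + tableName + " (\n"
             + string.Join(",\n", lines.ToArray())
             + "\n)";
    }
}
```

For example, Build("[dbo].[Addresses_Temp]", dTable, 20) after dataAdapter.Fill(dTable) gives you the script text, which you can review before running it against SQL Server.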
Does it have to be C#? If you're willing to use Java, I've had really good results with Apache POI: http://poi.apache.org/
This is not a C# solution... it is a quick and dirty solution right from excel.
A c# solution would be more robust and allow you to most likely point it to a target xls and have it give you the answers - this solution is for if you need the answers fast and don't have time to write a program or if someone does not have C# development environment on their computer.
One possible way to get the results you're looking for is:
highlight the row in excel that has the column headers
copy them
go to a new worksheet
right click cell A1
click paste-transpose
it will paste them in column format
go to B1 and paste this formula in:
=CONCATENATE("[",SUBSTITUTE(A1," ",""),"] varchar(20),")
then paste that formula all the way down next to your column of column headers
copy the results into SQL Server then add your top line of code
"CREATE TABLE [dbo].[Addresses_Temp] ( "
then add your closing parentheses
What we did is:
we got all the column headers from the header ROW and
made them into a column
then removed all spaces (should they be multiword column headers) and
tacked onto the beginning the open bracket "[" and
tacked onto the end "] VARCHAR(20)," (the rest of the line of code)