DataSet using first value as column header - c#

I have a piece of code where I want to extract values from the A column of an Excel sheet. Right now, here is the code I'm using and having an issue with:
m_connString = "Provider = Microsoft.ACE.OLEDB.12.0; Data Source = " + m_source + "; Extended properties = 'Excel 12.0; HDR = NO; IMEX = 1;';";
using (OleDbConnection conn = new OleDbConnection(m_connString))
{
conn.Open();
DataTable dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
DataSet ds = new DataSet();
string defaultSheet = dt.Rows[0]["TABLE_NAME"].ToString();
OleDbCommand comm = new OleDbCommand("SELECT * FROM [" + defaultSheet + "]", conn);
OleDbDataAdapter adapter = new OleDbDataAdapter(comm);
// Bug appears here
adapter.Fill(ds);
// Fill a List<string> with the data found
for (int r = 0; r < ds.Tables[0].Rows.Count; r++)
{
m_list.Add(ds.Tables[0].Rows[r][0].ToString());
}
}
}
What is happening is that, if I have an Excel file with the following content in the A column:
Row1
Row2
Row3
...
RowX
...What I end up getting is all values except for the first value (Row1). It turns out that Row1 is being used as a column name(?) for the DataSet's table. However, I don't want there to be any column names or headers, and I specifically state this in the connection string.
How can I prevent this behavior so I can have all my data placed in the List? Or, failing that, how can I work around this issue and extract Row1 from that DataSet?

Check the Microsoft reference to understand how the HDR=NO setting in the connection's Extended Properties works:
Column headings: By default, it is assumed that the first row of your
Excel data source contains columns headings that can be used as field
names. If this is not the case, you must turn this setting off, or
your first row of data "disappears" to be used as field names. This is
done by adding the optional HDR= setting to the Extended Properties of
the connection string. The default, which does not need to be
specified, is HDR=Yes. If you do not have column headings, you need to
specify HDR=No; the provider names your fields F1, F2, etc.
Here is an example:
Excel File Data (test.xlsx):
Code:
string m_source = "test.xlsx";
string m_connString = @"Provider = Microsoft.ACE.OLEDB.12.0;
Data Source = " + m_source + @";
Extended properties = 'Excel 12.0;
HDR= NO;
IMEX = 1;';";
using (OleDbConnection conn = new OleDbConnection(m_connString))
{
conn.Open();
string squery = "SELECT f1, f2, f3 FROM [Sheet1$]";
OleDbCommand comm = new OleDbCommand(squery, conn);
OleDbDataAdapter adapter = new OleDbDataAdapter(comm);
DataSet ds = new DataSet();
adapter.Fill(ds);
}
DataSet Visualizer:

Try this way:
DataTable table = ds.Tables[0];
foreach (DataColumn column in table.Columns)
{
string cName = table.Rows[0][column.ColumnName].ToString();
if (!table.Columns.Contains(cName) && cName != "")
{
column.ColumnName = cName;
}
}
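The loop above can be tried without an Excel file. The following is a minimal sketch on an in-memory DataTable, assuming the table was loaded with HDR=NO so its columns are named F1, F2, and so on. The class and method names (HeaderPromotion, PromoteFirstRow) are made up for illustration, and removing the first row after renaming is an extra step that the snippet above does not perform.

```csharp
using System;
using System.Data;

public static class HeaderPromotion
{
    // Renames each column after the value found in the first row, then removes that row.
    // A column keeps its current name when the first-row cell is empty
    // or when another column already uses that name.
    public static void PromoteFirstRow(DataTable table)
    {
        if (table.Rows.Count == 0)
        {
            return;
        }

        DataRow first = table.Rows[0];
        for (int i = 0; i < table.Columns.Count; i++)
        {
            DataColumn column = table.Columns[i];
            string name = Convert.ToString(first[column]) ?? "";
            if (name != "" && !table.Columns.Contains(name))
            {
                column.ColumnName = name;
            }
        }

        // The header values now live in the column names, so drop the row itself.
        table.Rows.RemoveAt(0);
    }
}
```

Calling HeaderPromotion.PromoteFirstRow(ds.Tables[0]) after the Fill would be one way to use it. If the goal is the opposite (no headers at all, every row kept as data), this step is not needed.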

Change the SQL to something along these lines:
"SELECT * FROM [" + defaultSheet + "] Except Select Top(1)"

Related

SqlBulkCopy mapping issue with fullstops in column names

I'm trying to import an Excel sheet into a SQL Server database. The problem is with the column mapping.
If a column name in the Excel sheet ends with a full stop (e.g. 'No.', 'Name.'), C# throws an exception:
Message=The given ColumnName 'No.' does not match up with any column
in data source.
But if I remove fullstop, it is working absolutely fine.
The source code for mapping in C# is as follows
private void InsertExcelRecords()
{
string FilePath = "C:\\Upload\\" + FileUpload.FileName;
string fileExtension = Path.GetExtension(FileUpload.PostedFile.FileName);
FileUpload.SaveAs("C:\\Upload\\" + FileUpload.FileName);
ExcelConn(FilePath, fileExtension);
Econ.Open();
DataTable dtExcelSheetName = Econ.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string getExcelSheetName = dtExcelSheetName.Rows[0]["Table_Name"].ToString();
Query = string.Format("Select * FROM [{0}]", getExcelSheetName + "A7:I");
OleDbCommand Ecom = new OleDbCommand(Query, Econ);
DataSet ds = new DataSet();
OleDbDataAdapter oda = new OleDbDataAdapter(Query, Econ);
Econ.Close();
oda.Fill(ds);
DataTable Exceldt = ds.Tables[0];
connection();
SqlBulkCopy objbulk = new SqlBulkCopy(con);
objbulk.DestinationTableName = "BankTransaction";
objbulk.ColumnMappings.Add("No", "Number");
con.Open();
objbulk.WriteToServer(Exceldt);
con.Close();
}
Please let me know if you need any more information.
You cannot retrieve column names containing '.' from an Excel sheet through OLEDB or ODBC, because a period is not valid in a column name there.
The '.' character is normally used as a separator between name parts, as in [schema].[table].[column].
The OLEDB and ODBC providers replace each '.' in a column name with '#'.
So you need to replace
objbulk.ColumnMappings.Add("No.", "Number");
with
objbulk.ColumnMappings.Add("No#", "Number");
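If several headers end with a full stop, the substitution can be applied in one place instead of being hard-coded per mapping. This is a small sketch that covers only the '.' to '#' replacement described above; the class and method names (ExcelColumnNames, ToProviderName, FindSourceColumn) are hypothetical helpers, not part of any library.

```csharp
using System;
using System.Data;

public static class ExcelColumnNames
{
    // Translates a header as typed in the sheet into the name the provider exposes,
    // covering only the '.' -> '#' substitution discussed in this answer.
    public static string ToProviderName(string excelHeader)
    {
        return excelHeader.Replace('.', '#');
    }

    // Returns the matching column name in the loaded table, or null when the
    // translated header is not present (useful for a clear error before bulk copy).
    public static string FindSourceColumn(DataTable table, string excelHeader)
    {
        string providerName = ToProviderName(excelHeader);
        return table.Columns.Contains(providerName)
            ? table.Columns[providerName].ColumnName
            : null;
    }
}
```

With this in place, the mapping line could be written as objbulk.ColumnMappings.Add(ExcelColumnNames.ToProviderName("No."), "Number");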

How to insert data from excel to datatable which has headers?

I am reading an Excel file with around a million records. First I query my table (which has no records) to get a DataTable, using aliases in the query so that the column names match those defined in my Excel sheet.
var dal = new clsConn();
var sqlQuery = "SELECT FETAPE_THEIR_TRANDATE \"Date\" ,ISSUER Issuer, ISSU_BRAN Branch , STAN_NUMB STAN, TERMID TermID, ACQUIRER Acquirer,DEBIT_AMOUNT Debit,CREDIT_AMOUNT Credit,CARD_NUMB \"Card Number\" , DESCRIPTION Description FROM ALLTRANSACTIONS";
var returntable = dal.ReadData(sqlQuery);
DataRow ds = returntable.NewRow();
var dtExcelData = returntable;
So my DataTable looks like this:
Then I read the records from the Excel sheet:
OleDbConnection con = null;
if (ext == ".xls")
{
con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filepath + ";Extended Properties=Excel 8.0;");
}
else if (ext == ".xlsx")
{
con = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0;IMEX=2;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text\"");
}
con.Open();
dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string getExcelSheetName = dt.Rows[0]["Table_Name"].ToString();
//OleDbCommand ExcelCommand = new OleDbCommand(@"SELECT * FROM [" + getExcelSheetName + @"]", con);
OleDbCommand ExcelCommand = new OleDbCommand("SELECT F1, F2, F3, F4, F5,F6,F7,F8,F9,F10 FROM [Sheet1$]", con);
OleDbDataAdapter ExcelAdapter = new OleDbDataAdapter(ExcelCommand);
try
{
ExcelAdapter.Fill(dtExcelData); //Here I give the datatable which i made previously
}
catch (Exception ex)
{
//lblAlert2.CssClass = "message-error";
//lblAlert2.Text = ex.Message;
}
The data is read successfully and fills the DataTable, but the fill creates its own columns, F1 to F10. How can I make this data land in the columns I already defined in the DataTable?
How can I prevent the extra columns (F1, F2, ... F10) from being created?
Any workaround would be appreciated, or please explain what I am doing wrong and how I can achieve this.
UPDATE :
My Excel file looks like this
The Microsoft.ACE.OLEDB.12.0 driver handles both types of Excel spreadsheet with the same Extended Properties, i.e. "Excel 12.0" will open both .xls and .xlsx.
Leave HDR=NO in place: OLEDB expects the headers in the first row, and yours are actually in row 11.
Sadly, "TypeGuessRows=0;ImportMixedTypes=Text" is completely ignored by Microsoft.ACE.OLEDB.12.0; you have to change those settings in the registry (yuk). Change your IMEX=2 to IMEX=1 to ensure that mixed data types are handled as text.
Change back to using "Select * From [Sheet1$]", and then I'm afraid you are going to have to handle the source data manually.
OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0;IMEX=1;HDR=NO\"");
con.Open();
DataTable dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string getExcelSheetName = (string)dt.Rows[0]["Table_Name"];
DataTable xlWorksheet = new DataTable();
xlWorksheet.Load(new OleDbCommand("Select * From [" + getExcelSheetName + "]", con).ExecuteReader());
//More than 11 rows implies 11 header rows and at least 1 data row
if (xlWorksheet.Rows.Count > 11 && xlWorksheet.Columns.Count >= 10)
{
for (int nRow = 11; nRow < xlWorksheet.Rows.Count; nRow++)
{
DataRow returnRow = returntable.NewRow();
for (int nColumn = 0; nColumn < 10; nColumn++)
{
//Note you will probably get conversion problems here that you will have to handle
returnRow[nColumn] = xlWorksheet.Rows[nRow].ItemArray[nColumn];
}
returntable.Rows.Add(returnRow);
}
}
I'm guessing you simply want to add the Excel data to your ALLTRANSACTIONS table? You don't say so, but it seems the likely goal. If so, this is a terrible way to do it. You don't need to read the whole table into memory, append data and then update the database. All you need to do is read the Excel file and insert the data into the Oracle table.
Some thoughts: your returntable will contain data, so if you just want the structure of the table, add a "Where RowNum=0" to the Select statement. To add the data to your Oracle database you could 1) switch to the Oracle Data Provider (ODP) and use the OracleBulkCopy class, or 2) simply modify the above to insert row by row as you read the data. As long as you don't have a LOT of data in your Excel spreadsheet, option 2 will work just fine. Having said that, a million rows is a LOT, so it is perhaps not the best option. You will also need to validate the input, as Excel is not the best data source really.
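The conversion problems mentioned in the copy loop can be handled in one helper. The following sketch assumes every Excel cell arrives as text (as with IMEX=1), that empty cells should become DBNull, and that numbers and dates are written in an invariant-culture format; the names RowCopy, ConvertCell and CopyRows are illustrative only. It runs against in-memory DataTables, so no Excel file or database is needed to try it.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class RowCopy
{
    // Converts one raw cell to the destination column's type.
    // Null, DBNull and blank text all become DBNull.
    public static object ConvertCell(object raw, Type targetType)
    {
        if (raw == null || raw == DBNull.Value)
        {
            return DBNull.Value;
        }

        string text = (Convert.ToString(raw, CultureInfo.InvariantCulture) ?? "").Trim();
        if (text.Length == 0)
        {
            return DBNull.Value;
        }

        if (targetType == typeof(string))
        {
            return text;
        }

        // Throws FormatException when the text cannot be parsed; callers may want
        // to catch that and log the offending row instead of aborting the import.
        return Convert.ChangeType(text, targetType, CultureInfo.InvariantCulture);
    }

    // Copies rows by column position, starting at firstDataRow (rows above it are headers).
    // Returns the number of rows added to the destination table.
    public static int CopyRows(DataTable source, DataTable destination, int firstDataRow)
    {
        int columnCount = Math.Min(source.Columns.Count, destination.Columns.Count);
        int copied = 0;

        for (int r = firstDataRow; r < source.Rows.Count; r++)
        {
            DataRow target = destination.NewRow();
            for (int c = 0; c < columnCount; c++)
            {
                target[c] = ConvertCell(source.Rows[r][c], destination.Columns[c].DataType);
            }
            destination.Rows.Add(target);
            copied++;
        }

        return copied;
    }
}
```

For the sheet described in the question, the call would look like RowCopy.CopyRows(xlWorksheet, returntable, 11).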

how to get data from datatable using column name

I have a situation where I load a DataSet from an Excel file. Each worksheet is loaded as a DataTable, with the worksheet name as the DataTable name. What I am trying to do is read values from this DataTable by column name, but I get an error saying
"Column 'Execute' does not belong to table Sheet1".
While loading the Excel file into the DataTable I used HDR=YES and IMEX=1. I also tried HDR=NO. Nothing works.
The code below loads the Excel data into the DataTables:
foreach (Microsoft.Office.Interop.Excel.Worksheet wsheet in workbook.Worksheets)
{
string sql1 = "SELECT * FROM [" + wsheet.Name + "$]";
OleDbCommand selectCMD1 = new OleDbCommand(sql1, SQLConn);
SQLAdapter.SelectCommand = selectCMD1;
SQLAdapter.Fill(dataset.Tables.Add(wsheet.Name));
}
The data from each Excel sheet loads perfectly, but fetching it by column name is the problem.
Any suggestions, please?
For what it's worth, this is the code I used, which works fine. I assume you are using Interop to loop through the worksheets to make sure you get them in order.
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filename +
"; Extended Properties=\"Excel 12.0 XML;HDR=YES\"";
DataSet dsValues = new DataSet();
using (OleDbConnection conn = new OleDbConnection(connectionString))
{
conn.Open();
using (OleDbCommand cmd = conn.CreateCommand())
{
using (OleDbDataAdapter adapter = new OleDbDataAdapter())
{
foreach (Excel.Worksheet wsheet in workbook.Worksheets)
{
cmd.CommandText = "SELECT * FROM [" + wsheet.Name + "$]";
adapter.SelectCommand = cmd;
adapter.Fill(dsValues.Tables.Add(wsheet.Name));
}
}
}
}
If the OleDbDataAdapter cannot find text in the top cells of each column, then it reverts to the F1, F2, F3... notation for the missing header names. So, for example, if my Excel worksheet looks like this:
Header1  (empty)  Header3
Value1   Value3   Value5
Value2   Value4   Value6
Then in the DataTable I will get columns called Header1, F2, Header3.
You also need to make sure that the row you designate as the header row has no text in any row above it, otherwise you'll get a bunch of Fn-type headers and then other unexpected text as header names. For example:
This is my table
Header1 Header2 Header3
Value1 Value3 Value5
Value2 Value4 Value6
Will end up in the DataTable with headers of This is my table, F2, F3, etc.
Try enumerating (in debug mode and/or on the console) through the Columns collection of the loaded DataTable, e.g. dataset.Tables[wsheet.Name].Columns, and print the column names. Make sure the table contains the column names that you expect, in the format that you expect.
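One way to do that check is sketched below. The helper names (ColumnInspector, GetColumnNames, Describe) are invented for this example; wrapping each name in quotes is just a convenience so that stray leading or trailing spaces in a header become visible.

```csharp
using System;
using System.Data;
using System.Linq;

public static class ColumnInspector
{
    // Returns the column names exactly as the DataTable stores them.
    public static string[] GetColumnNames(DataTable table)
    {
        return table.Columns.Cast<DataColumn>()
                    .Select(column => column.ColumnName)
                    .ToArray();
    }

    // Builds a one-line summary such as: Sheet1: 'Execute ', 'F2'
    public static string Describe(DataTable table)
    {
        var quoted = GetColumnNames(table).Select(name => "'" + name + "'");
        return table.TableName + ": " + string.Join(", ", quoted);
    }
}
```

Inside the fill loop, Console.WriteLine(ColumnInspector.Describe(dataset.Tables[wsheet.Name])) would print what each sheet actually produced.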

Unable to read first row from Excel using SqlBulkCopy

I am using the following code to load an Excel file into SQL Server. The code works, but it does not insert the first row into the table.
OleDbConnection OleDb = new OleDbConnection(ConnectionString);
OleDbCommand OleDbCmm = new OleDbCommand(Query,OleDb);
OleDbDataReader OleDbdr;
OleDb.Open();
if (OleDb.State == ConnectionState.Open)
{
OleDbdr = OleDbCmm.ExecuteReader();
SqlBulkCopy BulkCopy = new SqlBulkCopy(ConfigurationManager.ConnectionStrings["connstring"].ToString());
BulkCopy.DestinationTableName = "TempTable";
if (OleDbdr.Read())
{
BulkCopy.WriteToServer(OleDbdr);
}
}
OleDb.Close();
I was facing the same problem. It happened because I was calling the Read() method, as below:
while (dr.Read()) {
bulkcopy.WriteToServer(dr);
}
The solution is to remove the dr.Read() call and the while loop, and simply use
bulkcopy.WriteToServer(dr);
without any condition or Read() call.
One possible reason for this may be that you've indicated in your connection string that first row contains column names (HDR=YES) thus that row is not treated as containing data.
EDIT
Another possible reason for this would be the call to OleDbDataReader.Read() method before passing the reader to SqlBulkCopy object. MSDN states:
The copy operation starts at the next available row in the reader. Most of the time, the reader was just returned by ExecuteReader or a similar call, so the next available row is the first row.
Thus, in your case you should not call OleDbdr.Read() because this advances the reader to the first row; you should let BulkCopy call Read() and it will start reading from the first row.
Your code should be:
OleDbdr = OleDbCmm.ExecuteReader();
SqlBulkCopy BulkCopy = new SqlBulkCopy(ConfigurationManager.ConnectionStrings["connstring"].ToString());
BulkCopy.DestinationTableName = "TempTable";
BulkCopy.WriteToServer(OleDbdr);
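The cursor behaviour behind this can be seen without SQL Server or Excel. In the sketch below a DataTableReader stands in for the OleDbDataReader, and a plain counting loop stands in for the consumer; SqlBulkCopy itself is not involved, and the names ReaderCursorDemo, BuildSample and CountRemainingRows are made up for the demonstration.

```csharp
using System;
using System.Data;

public static class ReaderCursorDemo
{
    // Builds a three-row table to read from.
    public static DataTable BuildSample()
    {
        var table = new DataTable("Sample");
        table.Columns.Add("Value");
        table.Rows.Add("Row1");
        table.Rows.Add("Row2");
        table.Rows.Add("Row3");
        return table;
    }

    // Consumes the reader from wherever its cursor currently is,
    // the way any downstream consumer of an IDataReader would.
    public static int CountRemainingRows(IDataReader reader)
    {
        int count = 0;
        while (reader.Read())
        {
            count++;
        }
        return count;
    }
}
```

A fresh reader from BuildSample().CreateDataReader() yields 3 rows through CountRemainingRows. If Read() is called once on the reader first, only 2 rows remain for the consumer, which is the same way the first Excel row goes missing.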
You need to set the headers for the Excel-to-SQL copy, as I have done in my code:
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + strFileName + ";" + // strFileName: path to your Excel data source
"Extended Properties=Excel 8.0;";
OleDbConnection sSourceConnection = new OleDbConnection(conn);
using (sSourceConnection)
{
DataTable dtExcelData = new DataTable();
string[] SheetNames = GetExcelSheetNames(strFileName);
string[] preColumnHeader = new string[]{ "CarrierId", "StateId", "TerrCd", "ProgramId", "ClassId",
"PremTypeID","Limit50_100", "Limit100_100", "Limit100_200", "Limit300_300", "Limit300_600",
"Limit500_500","Limit500_1mil", "Limit1mil_1mil", "Limit1mil_2mil", "OtherParameter" };
sSourceConnection.Open();
string strQuery = string.Empty;
strQuery = "SELECT * FROM [" + SheetNames[0] + "]";
OleDbDataAdapter oleDA = new OleDbDataAdapter(strQuery, sSourceConnection);
oleDA.Fill(dtExcelData);
sSourceConnection.Close();
string[] colName = new string[dtExcelData.Columns.Count];
int i = 0;
foreach (DataColumn dc in dtExcelData.Columns)
{
colName[i] = dc.ColumnName;
i++;
}
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connStr))
{
bulkCopy.DestinationTableName = "tbl_test";
bulkCopy.WriteToServer(dtExcelData);
}
}

query to read a column excel worksheet with OLEDB

I'm totally new to OleDB and reading Excel files. I have a worksheet with 3 columns (Name - Surname - E-mail Address) and I need to:
know the number of rows
read all the addresses in the third column
extract each address one by one
I use an OpenFileDialog object (ofd) and a TextBox (excel) to display the selected file. This is my code:
if (ofd.ShowDialog() == DialogResult.OK)
{
excel.Text = ofd.FileName;
connection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excel.Text + ";Extended Properties=\"Excel 12.0 Xml;HDR=NO;IMEX=1\"";
conn.ConnectionString = connection;
conn.Open();
string name_query = "SELECT A FROM[" + ofd.SafeFileName + "]";
OleDbDataAdapter da = new OleDbDataAdapter(name_query, conn);
da.Fill(table);
conn.Close();
j = table.Rows.Count;
}
1) It doesn't work: there is a query problem in the "FROM..." part. I usually see this type of query:
"SELECT * FROM [Sheet1$]"
but I can't find out what Sheet1$ exactly is. Could someone explain the right query to me?
2) What do I have to do to access each element of the table (it would contain only the third column) and save it in a string variable?
Thanks a lot!
To get the sheet names, you can use the data provider's built-in schema functionality (connection.GetSchema).
Without column headers (HDR=NO), the columns are named F1, F2, etc., so for the third field you can query on F3. If you want to be completely sure, you can also use GetSchema to get the column names of the sheet/table found with the first GetSchema call.
Finally, to get the values into a list of strings, you can use a bit of LINQ (see stringlist in the example). I'm not sure whether you meant a single string value, but if that's the case, you can use string.Join on the LINQ select.
Combined code, starting from opening the connection:
conn.Open();
var tableschema = conn.GetSchema("Tables");
var firstsheet = tableschema.Rows[0]["TABLE_NAME"].ToString();
string name_query = "SELECT F3 FROM [" + firstsheet + "]";
OleDbDataAdapter da = new OleDbDataAdapter(name_query, conn);
da.Fill(table);
conn.Close();
j = table.Rows.Count;
var stringlist = table.Rows.Cast<DataRow>().Select(dr => dr[0].ToString()).ToList();
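The string.Join variant mentioned above could look like the following sketch. It works on any DataTable, so it can be tried with an in-memory table; the helper names (AddressList, ColumnToList, JoinColumn) are hypothetical, and skipping blank cells is an assumption added here rather than something the answer requires.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class AddressList
{
    // Collects the non-empty values of one column as strings.
    public static List<string> ColumnToList(DataTable table, int columnIndex)
    {
        return table.Rows.Cast<DataRow>()
                    .Select(row => Convert.ToString(row[columnIndex]) ?? "")
                    .Where(value => value.Length > 0)
                    .ToList();
    }

    // Joins the same values into a single string with the given separator.
    public static string JoinColumn(DataTable table, int columnIndex, string separator)
    {
        return string.Join(separator, ColumnToList(table, columnIndex));
    }
}
```

After da.Fill(table), AddressList.JoinColumn(table, 0, "; ") would give all addresses in one string, and ColumnToList(table, 0).Count gives the number of non-empty addresses.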
