I am trying to import an Excel spreadsheet into an array of DataTables. Each table should hold one sheet from the spreadsheet. Right now, each table contains the information from all sheets. I think this part is not working correctly:
dataSet.Clear();
Let me know if you can see what I am doing wrong.
Here is the rest of the code.
public DataTable[] ReadDoc()
{
string filename = @"C:\Documents and Settings\user\Desktop\Test.xlsx";
DataTable dt = null;
string connectionString = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR=YES\";", filename);
OleDbConnection connection = new OleDbConnection(connectionString);
DataSet dataSet = new DataSet();
DataSet finalDataSet = new DataSet();
DataTable[] table = new DataTable[3];
connection.Open();
dt = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (dt == null)
{
return null;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
foreach (DataRow row in dt.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
// Loop through all of the sheets if you want to...
for (int j = 0; j < excelSheets.Length; j++)
{
string query = String.Format("SELECT * FROM [" + excelSheets[j] + "]");
dataSet.Clear();
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(query, connectionString);
dataAdapter.Fill(dataSet);
table[j] = dataSet.Tables[0];
}
return table;
}
Thanks for the help.
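The problem is that your dataSet is declared outside the for loop, so every element of the DataTable array ends up referencing the same table, dataSet.Tables[0]. dataSet.Clear() removes the rows but keeps the table itself, so each Fill writes into that one shared table. Declare the DataSet inside the loop so that each iteration stores its own data.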
for (int j = 0; j < excelSheets.Length; j++)
{
DataSet dataSet = new DataSet();
string query = String.Format("SELECT * FROM [" + excelSheets[j] + "]");
.....
}
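The fix above can be sketched as a complete method. This is only a sketch under a few assumptions: the class and method names (ExcelSheetReader, BuildSelect, ReadAllSheets) are hypothetical, it fills a fresh DataTable per sheet instead of a fresh DataSet, and it returns an empty array rather than null when no schema is available. It also assumes the ACE OLE DB provider is installed (on newer .NET versions the System.Data.OleDb package would be needed as well).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.OleDb;

public static class ExcelSheetReader
{
    // Builds the query for one sheet. Sheet names taken from the
    // schema table already end in '$', so none is appended here.
    public static string BuildSelect(string sheetName)
    {
        return "SELECT * FROM [" + sheetName + "]";
    }

    // Returns one DataTable per entry in the workbook's schema table.
    public static DataTable[] ReadAllSheets(string connectionString)
    {
        var tables = new List<DataTable>();
        using (var connection = new OleDbConnection(connectionString))
        {
            connection.Open();
            DataTable schema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
            if (schema == null)
            {
                return tables.ToArray(); // assumption: empty array instead of null
            }

            foreach (DataRow row in schema.Rows)
            {
                string sheetName = row["TABLE_NAME"].ToString();

                // A new DataTable for every sheet, so no two array
                // slots can end up pointing at the same object.
                var sheetTable = new DataTable(sheetName);
                using (var adapter = new OleDbDataAdapter(BuildSelect(sheetName), connection))
                {
                    adapter.Fill(sheetTable);
                }
                tables.Add(sheetTable);
            }
        }
        return tables.ToArray();
    }
}
```

Because the array size follows the schema table, this version also avoids the fixed new DataTable[3], which would overflow for a workbook with more than three sheets.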
Related
I need to read a single cell from an xls Excel file into a string within the web application I am building. I was previously pulling cell ranges from the file into a table using the following code:
string PullFromExcell(string CellNo)
{
string cell;
string properties = String.Format(@"Provider = Microsoft.Jet.OLEDB.4.0; Data Source = C:\Users\User\Desktop\file.xls; Extended Properties = 'Excel 8.0;'");
using (OleDbConnection conn = new OleDbConnection(properties))
{
string worksheet = "Sheet";
conn.Open();
DataSet ds = new DataSet();
using (OleDbDataAdapter da = new OleDbDataAdapter("SELECT * FROM [" + worksheet + "$" + CellNo + "]", properties))
{
DataTable dt = new DataTable();
cell = dt.ToString();
da.Fill(dt);
ds.Tables.Add(dt);
grdComponent.DataSource = dt;
grdComponent.DataBind();
}
}
return cell;
}
How would I send that to a string? The code that I would use when pulling from a database is similar to this:
Sqlstring = "Select data from variable where name = 'fred' and ssn = 1234";
var cmd0 = new SqlCommand(Sqlstring, Class_Connection.cnn);
string Data = cmd0.ExecuteScalar().ToString();
I'm just not sure if any of that is compatible.
After filling DataTable, you can search the row like this:
foreach (DataRow dr in dt.Rows)
{
if (dr["name"].ToString() == "fred" && dr["ssn"].ToString() == "1234")
{
cell = dr["data"].ToString();
break;
}
}
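As an alternative to the explicit loop, the same lookup could be written with DataTable.Select. This is a minimal sketch, assuming the name and ssn columns hold text; the class and method names (RowLookup, FindData) are hypothetical.

```csharp
using System;
using System.Data;

public static class RowLookup
{
    // Returns the "data" value of the first row matching name and ssn,
    // or null when no row matches. Assumes both columns hold text.
    public static string FindData(DataTable dt, string name, string ssn)
    {
        // Double any single quotes so the values cannot break the filter expression.
        string filter = string.Format(
            "name = '{0}' AND ssn = '{1}'",
            name.Replace("'", "''"),
            ssn.Replace("'", "''"));

        DataRow[] matches = dt.Select(filter);
        return matches.Length > 0 ? matches[0]["data"].ToString() : null;
    }
}
```

With the table from the question filled, RowLookup.FindData(dt, "fred", "1234") would play the role that ExecuteScalar().ToString() plays in the SqlCommand version.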
I read an Excel file, but the dataGridView shows more rows than the Excel file actually has, so I can't rely on dataGridView.RowCount. I use the code given below to read the Excel file.
Code:
filePath = txtExcelFile.Text;
string[] fileSpit = filePath.Split('.');
if (filePath.Length > 1 && fileSpit[1] == "xls")
{
connString = "Provider=Microsoft.JET.OLEDB.4.0;Data Source=" + filePath + ";Extended Properties='Excel 8.0;HDR=No'";
}
else
{
connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties='Excel 12.0;HDR=No'";
}
OleDbCommand cmd = new OleDbCommand(@"Select * from [" +comboBox1.SelectedValue.ToString() + "]", ole);
OleDbDataAdapter oledata = new OleDbDataAdapter();
oledata.SelectCommand = cmd;
DataSet ds = new DataSet();
oledata.Fill(ds);
dataGridView1.DataSource = ds.Tables[0].DefaultView;
Either strip out the blank lines from the data table before assigning it to the grid:
private DataTable StripEmptyRows(DataTable dt)
{
List<int> rowIndexesToBeDeleted = new List<int>();
int indexCount = 0;
foreach(var row in dt.Rows)
{
var r = (DataRow)row;
int emptyCount = 0;
int itemArrayCount = r.ItemArray.Length;
foreach(var i in r.ItemArray) if(string.IsNullOrWhiteSpace (i.ToString())) emptyCount++;
if(emptyCount == itemArrayCount) rowIndexesToBeDeleted.Add(indexCount);
indexCount++;
}
int count = 0;
foreach(var i in rowIndexesToBeDeleted)
{
dt.Rows.RemoveAt(i-count);
count++;
}
return dt;
}
Or do your own row count ignoring blank rows.
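The second option can be sketched as follows. It uses the same definition of "blank" as StripEmptyRows (every cell is null, DBNull, empty, or whitespace) but leaves the table untouched; the class and method names (BlankRowCounter, CountNonBlankRows) are hypothetical.

```csharp
using System;
using System.Data;
using System.Linq;

public static class BlankRowCounter
{
    // Counts the rows that have at least one cell with visible content.
    public static int CountNonBlankRows(DataTable dt)
    {
        return dt.Rows
            .Cast<DataRow>()
            .Count(row => row.ItemArray.Any(
                value => !string.IsNullOrWhiteSpace(Convert.ToString(value))));
    }
}
```

The result can then be used wherever the grid's row count was going to be used, for example int rows = BlankRowCounter.CountNonBlankRows(ds.Tables[0]);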
I am trying to find a faster way to read an XML file that can be opened in Excel 2010. I cannot immediately read the XML file using readxml method because it contains Workbook, style, cell, data and other tags. So my approach was to open it in Excel then get the data on sheet 2 only. The sample file contains 9,000+ rows and takes about 2mins 49secs to store in a datatable. The actual file has 25,000+ rows. This is what I have tried:
private void bulkInsert()
{
var s = new Stopwatch();
s.Start();
try
{
KillExcel();
GCollector();
Excel.Application app = null;
app = new Excel.Application();
Excel.Worksheet sheet = null;
Excel.Workbook book = null;
book = app.Workbooks.Open(@"my directory for the file");
sheet = (Worksheet)book.Sheets[2];
sheet.Select(Type.Missing);
var xlRange = (Excel.Range)sheet.Cells[sheet.Rows.Count, 1];
int lastRow = (int)xlRange.get_End(Excel.XlDirection.xlUp).Row;
int newRow = lastRow + 1;
var cellrow = newRow;
int columns = sheet.UsedRange.Columns.Count;
Excel.Range test = sheet.UsedRange;
System.Data.DataTable dt = new System.Data.DataTable();
dt.Columns.Add("Node_SegmentName");
dt.Columns.Add("Type");
dt.Columns.Add("Sub-Type");
dt.Columns.Add("Description");
dt.Columns.Add("Parameter_DataIdentifier");
dt.Columns.Add("RuntimeValue");
dt.Columns.Add("Category");
dt.Columns.Add("Result");
dt.TableName = "SsmXmlTable";
//slow part
for (i = 0; i < lastRow; i++)
{
DataRow excelRow = dt.NewRow();
for (int j = 0; j < columns; j++)
{
excelRow[j] = test.Cells[i + 2, j + 1].Value2;
}
dt.Rows.Add(excelRow);
}
dataGridView1.DataSource = dt;
DataSet ds = new DataSet();
ds.Tables.Add(dt);
ds.WriteXml(AppDomain.CurrentDomain.BaseDirectory + String.Format("\\XMLParserOutput{0}.xml", DateTime.Now.ToString("MM-d-yyyy")));
DataSet reportData = new DataSet();
reportData.ReadXml(AppDomain.CurrentDomain.BaseDirectory + String.Format("\\XMLParserOutput{0}.xml", DateTime.Now.ToString("MM-d-yyyy")));
SqlConnection connection = new SqlConnection("Data Source=YOURCOMPUTERNAME\\SQLEXPRESS;Initial Catalog=YOURDATABASE;Integrated Security=True;Connect Timeout=0");
connection.Open();
SqlBulkCopy sbc = new SqlBulkCopy(connection);
sbc.DestinationTableName = "Test";
sbc.WriteToServer(reportData.Tables["SsmXmlTable"]);
connection.Close();
s.Stop();
var duration = s.Elapsed;
MessageBox.Show(duration.ToString() + " bulk insert way");
MessageBox.Show(ds.Tables["SsmXmlTable"].Rows.Count.ToString());//439 rows
}
catch (Exception ex)
{
KillExcel();
GCollector();
MessageBox.Show(ex.ToString() + i.ToString());
}
}
Without the reading from Excel part, the insertion of data using bulk copy only takes a couple of seconds (0.5secs for 449 rows).
For others who are encountering the same issue, what I did was:
save the xml as an xlsx file
use oledb to read the xlsx file
store in a dataset using OleDbDataAdapter (Fill() method)
bulk insert
Here is the code that I used to do this (change the connection string):
Stopwatch s = new Stopwatch();
s.Start();
string sSheetName = null;
string sConnection = null;
System.Data.DataTable sheetData = new System.Data.DataTable();
System.Data.DataTable dtTablesList = default(System.Data.DataTable);
OleDbConnection oleExcelConnection = default(OleDbConnection);
sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + @"C:\Users\YOURUSERNAME\Documents\Visual Studio 2012\Projects\TestXmlParser\TestXmlParser\bin\Debug\ConsolidatedSSMFiles.xlsx" + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\"";
oleExcelConnection = new OleDbConnection(sConnection);
oleExcelConnection.Open();
dtTablesList = oleExcelConnection.GetSchema("Tables");
if (dtTablesList.Rows.Count > 0)
{
sSheetName = dtTablesList.Rows[0]["TABLE_NAME"].ToString();
}
dtTablesList.Clear();
dtTablesList.Dispose();
if (!string.IsNullOrEmpty(sSheetName))
{
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [TEST$]", oleExcelConnection);
sheetAdapter.Fill(sheetData);
}
s.Stop();
var duration = s.Elapsed;
oleExcelConnection.Close();
dataGridView1.DataSource = sheetData;
MessageBox.Show(sheetData.Rows.Count.ToString()+"rows - "+ duration.ToString());
This reads 25,000+ rows of Excel data into a DataTable in approx. 1.9 to 2.0 seconds.
I extract data from all sheets in a workbook using the following code :
foreach (var sheetName in GetExcelSheetNames(connectionString))
{
if (sheetName.Contains("_"))
{
}
else
{
using (OleDbConnection con = new OleDbConnection(connectionString))
{
var dataTable = new DataTable();
string query = string.Format("SELECT * ,{0} as sheetName FROM [{0}]", sheetName);
con.Open();
OleDbDataAdapter adapter = new OleDbDataAdapter(query, con);
try
{
adapter.Fill(dataTable);
ds.Tables.Add(dataTable);
}
catch { }
}
}
}
I just can't figure out how the data is stored in the DataTable: is the sheet name added as a column? How can I extract it?
foreach (DataTable dt in ds.Tables)
{
using (SqlConnection con = new SqlConnection(consString))
{
con.Open();
for (int i = 0; i < dt.Rows.Count; i++)
{
for (int j = 0; j < dt.Columns.Count; j ++)
{
//what should I write here ?
}
}
}
}
In order to get the sheet name, using oledb, you will need to use code that looks something like this (thanks to this SO post and answer):
DataTable dtSheets = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
List<string> sheets= new List<string>();
foreach (DataRow dr in dtSheets.Rows)
{
if (dr["TABLE_NAME"].ToString().Contains("$"))//checks whether row contains '_xlnm#_FilterDatabase' or sheet name(i.e. sheet name always ends with $ sign)
{
sheets.Add(dr["TABLE_NAME"].ToString());
}
}
Below is how you access the values from a datatable:
var someValue = dt.Rows[i][j];
You need to get the item at the column index (j) of the row, at the row index (i), of the current datatable (dt).
Conversely, you can use the name of the column as well.
var someValue = dt.Rows[i]["columnName"];
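Put together, the nested loop from the question could be filled in roughly like this. It is only a sketch: the class and method names (TableReader, DescribeCells) are hypothetical, and it simply collects every cell as a "column=value" string to show both ways of indexing.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TableReader
{
    // Walks the table row by row and column by column and returns
    // one "columnName=value" string per cell.
    public static List<string> DescribeCells(DataTable dt)
    {
        var cells = new List<string>();
        for (int i = 0; i < dt.Rows.Count; i++)
        {
            for (int j = 0; j < dt.Columns.Count; j++)
            {
                string columnName = dt.Columns[j].ColumnName;

                // Access by index; dt.Rows[i][columnName] returns the same value.
                object value = dt.Rows[i][j];
                cells.Add(columnName + "=" + value);
            }
        }
        return cells;
    }
}
```

If the query adds a sheetName column as in the question, that value would be available per row as dt.Rows[i]["sheetName"].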
Assuming dt is your DataTable variable,
use dt.Rows[row index][column index],
for example:
dt.Rows[2][4] -> references the 3rd row, 5th column (indexes are zero-based)
I am not sure, but perhaps the sheetname might be stored at dt.TableName
I have an Excel workbook with two tabs, and I want to get a row from one tab and insert it into the other. I thought it would work the same as in SQL Server or MySQL: just select and insert.
I am using this query, but it fails with a syntax error and I am not sure what is wrong with it.
testCommand.CommandText = "Insert into [ActiveLicenses$]( Select * from [companies$] " +
"where [License Number] = '" + lnumber + "')";
testCommand.ExecuteNonQuery();
UPDATE
Is there any way to delete the rows directly from excel sheet?
You can use SQL to extract the data from Excel:
using (OleDbDataAdapter da = new OleDbDataAdapter(
"SELECT " + columns + " FROM [" + worksheetName + "$]", conn))
{
DataTable dt = new DataTable(tableName);
da.Fill(dt);
ds.Tables.Add(dt);
}
Unfortunately, inserting into Excel doesn't work this way. I am pretty sure you can't specify a cell to write to using an OleDb INSERT command; it will automatically go to the next open row in the specified column. You can work around it with an UPDATE statement:
sql = "Update [Sheet1$A1:A10] SET A10 = 'YourValue'";
myCommand.CommandText = sql;
myCommand.ExecuteNonQuery();
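Personally, I would use VSTO rather than OleDb. Once you have extracted the cell, simply open the spreadsheet in code and insert the data:
Excel.Workbook wb = xlApp.Workbooks.Open(filePath);
Excel.Worksheet ws = (Excel.Worksheet)wb.Sheets[1]; // Range is a worksheet member; the first sheet is assumed here
Excel.Range rng = ws.Range["A1"];
rng.Value2 = "data";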
A faster method.
I load all the licenses into a DataTable and remove the rows that are not required, which takes less than 1 minute. Then I simply export the DataTable to CSV, so the file is ready in less than 1 minute.
Sample below:
static List<string> licensecAll = new List<string>();
DataTable dt = new DataTable();
OleDbDataAdapter dp = new OleDbDataAdapter("select * from [companies$]", testCnn);
dp.Fill(dt);
if (dt.Rows.Count > 0)
{
for (int i = dt.Rows.Count-1; i >= 0; i--)
{
string lnum = dt.Rows[i][0].ToString();
Console.WriteLine("LICENSE NUMBER" + lnum);
if (!licensecAll.Contains(lnum))
{
Console.WriteLine("ROW REMOVED");
dt.Rows.RemoveAt(i);
}
}
}
Then simply run datatable to csv....
public static void DataTable2CSV(DataTable table, string filename, string seperateChar)
{
StreamWriter sr = null;
try
{
sr = new StreamWriter(filename);
string seperator = "";
StringBuilder builder = new StringBuilder();
foreach (DataColumn col in table.Columns)
{
builder.Append(seperator).Append(col.ColumnName);
seperator = seperateChar;
}
sr.WriteLine(builder.ToString());
foreach (DataRow row in table.Rows)
{
seperator = "";
builder = new StringBuilder();
foreach (DataColumn col in table.Columns)
{
builder.Append(seperator).Append(row[col.ColumnName]);
seperator = seperateChar;
}
sr.WriteLine(builder.ToString());
}
}
finally
{
if (sr != null)
{
sr.Close();
}
}
}