How to read an Excel spreadsheet in C# quickly

I am using Microsoft.Office.Interop.Excel to read a spreadsheet that is open in memory.
gXlWs = (Microsoft.Office.Interop.Excel.Worksheet)gXlApp.ActiveWorkbook.ActiveSheet;
int NumCols = 7;
string[] Fields = new string[NumCols];
string input = null;
int NumRow = 2;
while (Convert.ToString(((Microsoft.Office.Interop.Excel.Range)gXlWs.Cells[NumRow, 1]).Value2) != null)
{
    for (int c = 1; c <= NumCols; c++)
    {
        Fields[c - 1] = Convert.ToString(((Microsoft.Office.Interop.Excel.Range)gXlWs.Cells[NumRow, c]).Value2);
    }
    NumRow++;
    // Do my other processing
}
I have 180,000 rows and this turns out to be very slow. I am not sure the "Convert" is efficient. Is there any way I could do this faster?
Moon

Hi, I found a much faster way.
It is better to read the entire data in one go using get_Range. This loads the data into memory and I can loop through it like a normal array.
Microsoft.Office.Interop.Excel.Range range = gXlWs.get_Range("A1", "G188000"); // G, not F: seven columns to match NumCols
object[,] values = (object[,])range.Value2; // the array from Value2 is 1-based in both dimensions
int NumRow = 1;
while (NumRow <= values.GetLength(0)) // <=, or the last row is skipped
{
    for (int c = 1; c <= NumCols; c++)
    {
        Fields[c - 1] = Convert.ToString(values[NumRow, c]);
    }
    NumRow++;
}

There are several options - all involve some additional library:
OpenXML 2.0 (a free library from MS) can be used to read/modify the content of an .xlsx, so you can do with it what you want (see the sketch after this list)
some (commercial) third-party libraries come with grid controls that let you do much more with Excel files in your application (be it WinForms/WPF/ASP.NET...), such as SpreadsheetGear, Aspose.Cells, etc.
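For the OpenXML route above, a minimal hedged sketch using the DocumentFormat.OpenXml package (the file path is a placeholder; shared strings are stored in a separate table and referenced by index, which is the main surprise when reading .xlsx directly):
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

using (SpreadsheetDocument doc = SpreadsheetDocument.Open(@"C:\data\book.xlsx", false))
{
    WorkbookPart wbPart = doc.WorkbookPart;
    WorksheetPart wsPart = wbPart.WorksheetParts.First();
    SharedStringTablePart sstPart = wbPart.SharedStringTablePart;

    foreach (Row row in wsPart.Worksheet.Descendants<Row>())
    {
        foreach (Cell cell in row.Elements<Cell>())
        {
            string text = cell.CellValue == null ? "" : cell.CellValue.InnerText;
            // Shared strings: the cell holds an index into the shared string table.
            if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                text = sstPart.SharedStringTable.ElementAt(int.Parse(text)).InnerText;
            Console.Write(text + "\t");
        }
        Console.WriteLine();
    }
}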

I am not sure the "Convert" is efficient. Is there any way I could do
this faster?
What makes you believe this? I promise you that Convert.ToString() is the most efficient part of the code you posted. Your problem is that you're looping through 180,000 records in an Excel document...
You could split the work up; since you know the number of rows, this is trivial to do (a sketch follows below).
Why are you converting Value2 to a string, exactly?
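As a hedged illustration of that split, building on the question's own variables (gXlWs, NumCols); the chunk size and row count are assumptions:
// Read the sheet in blocks of 10,000 rows instead of cell-by-cell;
// each get_Range call crosses the COM boundary once per block.
const int ChunkRows = 10000;
int totalRows = 180000; // row count known from the question
for (int start = 2; start <= totalRows + 1; start += ChunkRows) // data starts at row 2
{
    int end = Math.Min(start + ChunkRows - 1, totalRows + 1);
    object[,] values = (object[,])gXlWs.get_Range("A" + start, "G" + end).Value2;
    for (int r = 1; r <= values.GetLength(0); r++) // Value2 arrays are 1-based
    {
        for (int c = 1; c <= NumCols; c++)
        {
            string field = Convert.ToString(values[r, c]);
            // process field (or hand the whole block to a worker)
        }
    }
}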

I found a really fast way to read Excel for my specific need: I need to get it as a two-dimensional array of strings. With a really big Excel file, the old way took about one hour; this way, I get my values in 20 seconds.
I am using this NuGet package: https://reposhub.com/dotnet/office/ExcelDataReader-ExcelDataReader.html
And here is my code:
DataSet result = null;
//https://reposhub.com/dotnet/office/ExcelDataReader-ExcelDataReader.html
using (var stream = File.Open(path, FileMode.Open, FileAccess.Read))
{
    // Auto-detect format, supports:
    // - Binary Excel files (2.0-2003 format; *.xls)
    // - OpenXml Excel files (2007 format; *.xlsx)
    using (var reader = ExcelReaderFactory.CreateReader(stream))
    {
        result = reader.AsDataSet();
    }
}
foreach (DataTable table in result.Tables)
{
    if (/* my conditions */)
    {
        continue;
    }
    var rows = table.AsEnumerable().ToArray();
    var dataTable = new string[table.Rows.Count][];
    Parallel.For(0, rows.Length, new ParallelOptions { MaxDegreeOfParallelism = 8 },
        i =>
        {
            var row = rows[i];
            dataTable[i] = row.ItemArray.Select(x => x.ToString()).ToArray();
        });
    importedList.Add(dataTable);
}

I guess the Convert is not the source of the slowness...
Actually, retrieving cell values is what is very slow.
I think this cast is not necessary:
(Microsoft.Office.Interop.Excel.Range)gXlWs
It should work without it.
And you can ask directly:
gXlWs.Cells[NumRow, 1].Value != null
Try to move the entire range, or at least the entire row, into an object matrix and work with that instead of the range itself.
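For the row-at-a-time variant, a hedged sketch against the question's own variables (gXlWs, NumRow, NumCols, Fields); one COM round-trip per row instead of one per cell:
// Pull one row's seven columns (A..G) in a single call;
// Value2 returns a 1-based object[,] with a single row.
object[,] rowValues = (object[,])gXlWs.get_Range("A" + NumRow, "G" + NumRow).Value2;
for (int c = 1; c <= NumCols; c++)
{
    Fields[c - 1] = Convert.ToString(rowValues[1, c]);
}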

Use the OleDb method; it is the fastest, as follows:
string con =
    @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\temp\test.xls;" +
    @"Extended Properties='Excel 8.0;HDR=Yes;'";
using (OleDbConnection connection = new OleDbConnection(con))
{
    connection.Open();
    OleDbCommand command = new OleDbCommand("select * from [Sheet1$]", connection);
    using (OleDbDataReader dr = command.ExecuteReader())
    {
        while (dr.Read())
        {
            var row1Col0 = dr[0];
            Console.WriteLine(row1Col0);
        }
    }
}

Related

How to Concatenate an Excel File Using a DataTable?

I have a DataTable being generated using the C# NI DAQmx code. I want to take this DataTable and put it in an Excel file when a CheckBox is checked. The DAQmx code records this data 'x' number of samples at a time. When this number is high, the program is slow, but it still works. I want to record a low number of samples at a time, and then save that data into an Excel file.
In my current code, the data in the Excel file is constantly overwritten. This is not desirable, as I need all recorded data.
Currently the data is actively recorded when the box is checked, but it will not concatenate. I have tried many searches and explored many methods for this, but I haven't quite been able to adapt anything to my needs.
Relevant code will be included below. Any help is appreciated, thanks.
Note: Data does not have to be a .xlsx file. It can be a .csv
This code is the DataTable generation via DAQmx:
private void DataToDataTable(AnalogWaveform<double>[] sourceArray, ref DataTable dataTable)
{
    // Iterate over channels
    int currentLineIndex = 0;
    string test = currentLineIndex.ToString();
    foreach (AnalogWaveform<double> waveform in sourceArray)
    {
        for (int sample = 0; sample < waveform.Samples.Count; ++sample)
        {
            if (sample == 50)
                break;
            dataTable.Rows[sample][currentLineIndex] = waveform.Samples[sample].Value;
        }
        currentLineIndex++;
    }
}

public void InitializeDataTable(AIChannelCollection channelCollection, ref DataTable data)
{
    int numOfChannels = channelCollection.Count;
    data.Rows.Clear();
    data.Columns.Clear();
    dataColumn = new DataColumn[numOfChannels];
    int numOfRows = 50;
    for (int currentChannelIndex = 0; currentChannelIndex < numOfChannels; currentChannelIndex++)
    {
        dataColumn[currentChannelIndex] = new DataColumn()
        {
            DataType = typeof(double),
            ColumnName = channelCollection[currentChannelIndex].PhysicalName
        };
    }
    data.Columns.AddRange(dataColumn);
    for (int currentDataIndex = 0; currentDataIndex < numOfRows; currentDataIndex++)
    {
        object[] rowArr = new object[numOfChannels];
        data.Rows.Add(rowArr);
    }
}
This is my current method of saving to an Excel file:
private void Excel_cap_CheckedChanged(object sender, EventArgs e)
{
    int i = 0;
    for (excel_cap.Checked = true; excel_cap.Checked == true; i++)
    {
        StringBuilder sb = new StringBuilder();
        IEnumerable<string> columnNames = dataTable.Columns.Cast<DataColumn>()
            .Select(column => column.ColumnName);
        sb.AppendLine(string.Join(",", columnNames));
        foreach (DataRow row in dataTable.Rows)
        {
            IEnumerable<string> fields = row.ItemArray.Select(field => field.ToString());
            sb.AppendLine(string.Join(",", fields));
        }
        File.AppendAllText(filename_box.Text, sb.ToString());
    }
}
Since you mentioned it does not have to be Excel and could be a CSV, you can use your CSV code but change the File.WriteAllText line to File.AppendAllText, which appends the text rather than replacing the existing file. AppendAllText will create the file if it doesn't exist.
File.AppendAllText("test.csv", sb.ToString());
Are you sure you are using EPPlus? This CreateExcelFile looks like it is a copied code snippet.
With EPPlus, this would be as easy as
using (var package = new ExcelPackage(new FileInfo(@"a.xlsx")))
{
    if (!package.Workbook.Worksheets.Any())
        package.Workbook.Worksheets.Add("sheet");
    var sheet = package.Workbook.Worksheets.First();
    var appendRow = (sheet.Dimension?.Rows ?? 0) + 1;
    sheet.Cells[appendRow, 1].LoadFromDataTable(new DataTable(), false);
    package.SaveAs(new FileInfo(@"a.xlsx"));
}
It looks like you have some objects, then convert them to a DataTable and then write that to Excel/CSV. If you skip the DataTable conversion, you'll speed things up. EPPlus has LoadFromCollection, which may just work with your AnalogWaveform<double>.
Shameless advertisement: I got these snippets from my blog post about EPPlus.
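To illustrate the LoadFromCollection route, a minimal hedged sketch (the Reading type, file name, and append logic are assumptions for illustration, not code from the blog post):
using System.Collections.Generic;
using System.IO;
using System.Linq;
using OfficeOpenXml; // EPPlus

class Reading
{
    public int Sample { get; set; }
    public double Value { get; set; }
}

// ...
var readings = new List<Reading> { new Reading { Sample = 1, Value = 0.42 } };
using (var package = new ExcelPackage(new FileInfo("readings.xlsx")))
{
    var sheet = package.Workbook.Worksheets.FirstOrDefault()
                ?? package.Workbook.Worksheets.Add("data");
    // Append below whatever is already on the sheet; print headers only once.
    int appendRow = (sheet.Dimension?.Rows ?? 0) + 1;
    sheet.Cells[appendRow, 1].LoadFromCollection(readings, appendRow == 1);
    package.Save();
}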

Export data from CSV to DataTable in C#

I am using the code below to export data from a CSV file to a DataTable.
As the values are mixed text, i.e. both numbers and alphabetic characters, some of the columns are not getting exported to the DataTable.
I have done some research here and found that we need to set ImportMixedType = Text and TypeGuessRows = 0 in the registry, but even that did not solve the problem.
The code below works for some files, even with mixed text.
Could someone tell me what is wrong with the code below? Am I missing something here?
if (isFirstRowHeader)
{
    header = "Yes";
}
using (OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly +
    ";Extended Properties=\"text;HDR=" + header + ";FMT=Delimited\";"))
{
    using (OleDbCommand command = new OleDbCommand(sql, connection))
    {
        using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
        {
            adapter.Fill(table);
        }
        connection.Close();
    }
}
For a comma-delimited file, this worked for me:
public DataTable CSVtoDataTable(string inputpath)
{
    DataTable csvdt = new DataTable();
    string Fulltext;
    if (File.Exists(inputpath))
    {
        using (StreamReader sr = new StreamReader(inputpath))
        {
            while (!sr.EndOfStream)
            {
                Fulltext = sr.ReadToEnd().ToString(); // read full content
                string[] rows = Fulltext.Split('\n'); // split file content to get the rows
                for (int i = 0; i < rows.Count() - 1; i++)
                {
                    var regex = new Regex("\\\"(.*?)\\\"");
                    var output = regex.Replace(rows[i], m => m.Value.Replace(",", "\\c")); // escape commas inside quotes
                    string[] rowValues = output.Split(','); // split the row on ',' to get the column values
                    if (i == 0)
                    {
                        for (int j = 0; j < rowValues.Count(); j++)
                        {
                            csvdt.Columns.Add(rowValues[j].Replace("\\c", ",")); // headers
                        }
                    }
                    else
                    {
                        try
                        {
                            DataRow dr = csvdt.NewRow();
                            for (int k = 0; k < rowValues.Count(); k++)
                            {
                                if (k >= dr.Table.Columns.Count) // more columns may exist
                                {
                                    csvdt.Columns.Add("clmn" + k);
                                    dr = csvdt.NewRow();
                                }
                                dr[k] = rowValues[k].Replace("\\c", ",");
                            }
                            csvdt.Rows.Add(dr); // add other rows
                        }
                        catch
                        {
                            Console.WriteLine("error");
                        }
                    }
                }
            }
        }
    }
    return csvdt;
}
The main thing that would probably help is to stop using OleDb objects for reading a delimited file. I suggest using the TextFieldParser, which is what I have used successfully for over 2 years now for a client.
http://www.dotnetperls.com/textfieldparser
There may be other issues, but without seeing your .CSV file, I can't tell you where your problem may lie.
The TextFieldParser is specifically designed to parse comma delimited files. The OleDb objects are not. So, start there and then we can determine what the problem may be, if it persists.
If you look at an example on the link I provided, they are merely writing lines to the console. You can alter this code portion to add rows to a DataTable object, as I do, for sorting purposes.
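As a hedged sketch of that adaptation (reading the delimited file into a DataTable; the ragged-row column naming mirrors the question's code and is an assumption):
using System.Data;
using Microsoft.VisualBasic.FileIO; // add a reference to Microsoft.VisualBasic

public static DataTable CsvToDataTable(string path, bool isFirstRowHeader)
{
    var table = new DataTable();
    using (var parser = new TextFieldParser(path))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.HasFieldsEnclosedInQuotes = true; // commas inside quotes are handled for you

        bool header = isFirstRowHeader;
        while (!parser.EndOfData)
        {
            string[] fields = parser.ReadFields();
            if (header)
            {
                foreach (string name in fields)
                    table.Columns.Add(name);
                header = false;
                continue;
            }
            while (table.Columns.Count < fields.Length) // more columns may exist
                table.Columns.Add("clmn" + table.Columns.Count);
            table.Rows.Add(fields);
        }
    }
    return table;
}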

Export huge CSV file to database in C# [duplicate]

This question already has answers here:
Load very big CSV-Files into a SQL-Server database
(3 answers)
Closed 9 years ago.
I have a big CSV file with 10 million entries, and I need to export it to SQL Server using C#. I'm a newbie and I really don't know how to write this.
I have something like this so far:
private static void ExportToDB()
{
SqlConnection con = new SqlConnection(#"Data Source=SHAWHP\SQLEXPRESS;Initial Catalog=FOO;Persist Security Info=True;User ID=sa");
string filepath = #"E:\Temp.csv";
StreamReader sr = new StreamReader(filepath);
string line = sr.ReadLine();
string[] value = line.Split(',');
DataTable dt = new DataTable();
DataRow row;
foreach (string dc in value)
{
dt.Columns.Add(new DataColumn(dc));
}
while ( !sr.EndOfStream )
{
value = sr.ReadLine().Split(',');
if(value.Length == dt.Columns.Count)
{
row = dt.NewRow();
row.ItemArray = value;
dt.Rows.Add(row);
}
}
SqlBulkCopy bc = new SqlBulkCopy(con.ConnectionString, SqlBulkCopyOptions.TableLock);
bc.DestinationTableName = "tblparam_test";
bc.BatchSize = dt.Rows.Count;
con.Open();
bc.WriteToServer(dt);
bc.Close();
con.Close();
}
And it gives me an error, saying this:
An unhandled exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll
How can i fix it? Or is there another way?
You can't use such an approach because string.Split creates lots of arrays that multiply the amount of memory. Suppose you have 10 columns. After the split you will have an array of length 10 plus 10 strings = 11 objects. Each of them has 8 or 16 bytes of extra memory (object sync root etc.), so the memory overhead is 88 bytes for each line. 10 million lines will then consume at least 880 MB of memory; add the size of your file to that number and you reach a value of about 1 GB. This is not all: a DataRow is quite a heavy structure, so you should add the 10 million data rows, and a DataTable holding 10 million elements will itself take more than 40 MB.
So, the expected required size is more than 1 GB.
For an x32 process, .NET can't easily use more than 1 GB of memory. Theoretically it has 2 GB, but this is just theoretical, because everything consumes memory: assemblies, native DLLs and other objects, the UI, etc.
The solution is to use an x64 process, or to read and write in chunks like below:
private static void ExportToDB()
{
    string filepath = @"E:\Temp.csv";
    StreamReader sr = new StreamReader(filepath);
    string line = sr.ReadLine();
    string[] value = line.Split(',');
    DataTable dt = new DataTable();
    DataRow row;
    foreach (string dc in value)
    {
        dt.Columns.Add(new DataColumn(dc));
    }
    int i = 1000; // chunk size
    while (!sr.EndOfStream)
    {
        i--;
        value = sr.ReadLine().Split(',');
        if (value.Length == dt.Columns.Count)
        {
            row = dt.NewRow();
            row.ItemArray = value;
            dt.Rows.Add(row);
        }
        if (i > 0)
            continue;
        WriteChunk(dt);
        i = 1000;
    }
    WriteChunk(dt); // flush the final partial chunk
}

static void WriteChunk(DataTable dt)
{
    string conStr = @"Data Source=SHAWHP\SQLEXPRESS;Initial Catalog=FOO;Persist Security Info=True;User ID=sa";
    using (SqlBulkCopy bc = new SqlBulkCopy(conStr, SqlBulkCopyOptions.TableLock))
    {
        bc.DestinationTableName = "tblparam_test";
        bc.BatchSize = dt.Rows.Count;
        bc.WriteToServer(dt); // SqlBulkCopy opens its own connection from the string
    }
    dt.Rows.Clear(); // free the chunk's rows before reading the next one
}
If you can get the file to the server, I would use BULK INSERT server-side.
BULK Insert CSV
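For illustration, a hedged sketch of issuing the bulk insert from C# (the table, path and connection string reuse the question's values; the WITH options are typical assumptions, and the CSV path must be visible to the SQL Server machine, not the client):
using System.Data.SqlClient;

string sql = @"BULK INSERT tblparam_test
               FROM 'E:\Temp.csv'
               WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2)";
using (var con = new SqlConnection(@"Data Source=SHAWHP\SQLEXPRESS;Initial Catalog=FOO;Persist Security Info=True;User ID=sa"))
using (var cmd = new SqlCommand(sql, con))
{
    cmd.CommandTimeout = 0; // a 10-million-row load can exceed the 30-second default
    con.Open();
    cmd.ExecuteNonQuery();
}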
Regards.
Taken from MSDN:
In relation to .ReadLine()
If the current method throws an OutOfMemoryException, the reader's position in the underlying Stream object is advanced by the number of characters the method was able to read, but the characters already read into the internal ReadLine buffer are discarded. If you manipulate the position of the underlying stream after reading data into the buffer, the position of the underlying stream might not match the position of the internal buffer. To reset the internal buffer, call the DiscardBufferedData method; however, this method slows performance and should be called only when absolutely necessary.

Export data to an Excel file in an ASP.NET application

Can someone provide a link to a tutorial about exporting data to an Excel file using C# in an ASP.NET web application? I searched the internet but didn't find any tutorials that explain how to do it.
You can use Interop: http://www.c-sharpcorner.com/UploadFile/Globalking/datasettoexcel02272006232336PM/datasettoexcel.aspx
Or, if you don't want to install Microsoft Office on a web server,
I recommend using CarlosAg.ExcelXmlWriter, which can be found here: http://www.carlosag.net/tools/excelxmlwriter/
Code sample for ExcelXmlWriter:
using CarlosAg.ExcelXmlWriter;
class TestApp {
    static void Main(string[] args) {
        Workbook book = new Workbook();
        Worksheet sheet = book.Worksheets.Add("Sample");
        WorksheetRow row = sheet.Table.Rows.Add();
        row.Cells.Add("Hello World");
        book.Save(@"c:\test.xls");
    }
}
There is an easy way using npoi.mapper, with just the two lines below:
var mapper = new Mapper();
mapper.Save("test.xlsx", objects, "newSheet");
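In full, a minimal hedged sketch of what surrounds those two lines (the Npoi.Mapper package; the record type and values are invented for illustration):
using System.Collections.Generic;
using Npoi.Mapper;

class Attendence
{
    public string Name { get; set; }
    public int Days { get; set; }
}

// ...
var objects = new List<Attendence> { new Attendence { Name = "A", Days = 20 } };
var mapper = new Mapper();
mapper.Save("test.xlsx", objects, "newSheet"); // writes one sheet named "newSheet"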
Pass a List<T> to the method below; it will convert the list to a byte buffer and return it, and a file will be downloaded.
List<T> resultList = new List<T>();
byte[] buffer = Write(resultList, true, "AttendenceSummary");
return File(buffer, "application/excel", reportTitle + ".xlsx");

public static byte[] Write<T>(IEnumerable<T> list, bool xlsxExtension = true, string sheetName = "ExportData")
{
    if (list == null)
    {
        throw new ArgumentNullException("list");
    }
    XSSFWorkbook hssfworkbook = new XSSFWorkbook();
    int Rowspersheet = 15000;
    int TotalRows = list.Count();
    int TotalSheets = TotalRows / Rowspersheet;
    for (int i = 0; i <= TotalSheets; i++)
    {
        ISheet sheet1 = hssfworkbook.CreateSheet(sheetName + "_" + i);
        IRow row = sheet1.CreateRow(0);
        int index = 0;
        // Header row: one bold cell per public property of T
        foreach (PropertyInfo property in typeof(T).GetProperties())
        {
            ICellStyle cellStyle = hssfworkbook.CreateCellStyle();
            IFont cellFont = hssfworkbook.CreateFont();
            cellFont.Boldweight = (short)NPOI.SS.UserModel.FontBoldWeight.Bold;
            cellStyle.SetFont(cellFont);
            ICell cell = row.CreateCell(index++);
            cell.CellStyle = cellStyle;
            cell.SetCellValue(property.Name);
        }
        int rowIndex = 1;
        // Data rows: 15,000 per sheet
        foreach (T obj in list.Skip(Rowspersheet * i).Take(Rowspersheet))
        {
            row = sheet1.CreateRow(rowIndex++);
            index = 0;
            foreach (PropertyInfo property in typeof(T).GetProperties())
            {
                ICell cell = row.CreateCell(index++);
                cell.SetCellValue(Convert.ToString(property.GetValue(obj)));
            }
        }
    }
    MemoryStream file = new MemoryStream();
    hssfworkbook.Write(file);
    return file.ToArray();
}
You can try the following links:
http://www.codeproject.com/Articles/164582/8-Solutions-to-Export-Data-to-Excel-for-ASP-NET
Export data as Excel file from ASP.NET
http://codeissue.com/issues/i14e20993075634/how-to-export-gridview-control-data-to-excel-file-using-asp-net
I've written a C# class which lets you write your DataSet, DataTable or List<> data directly into an Excel .xlsx file using the OpenXML libraries.
http://mikesknowledgebase.com/pages/CSharp/ExportToExcel.htm
It's completely free to download, and very ASP.Net friendly.
Just pass my C# function the data to be written, the name of the file you want to create, and your page's "Response" variable, and it'll create the Excel file for you, and write it straight to the Page, ready for the user to Save/Open.
class Employee { /* ... */ }
List<Employee> listOfEmployees = new List<Employee>();

// The following ASP.Net code gets run when I click on my "Export to Excel" button.
protected void btnExportToExcel_Click(object sender, EventArgs e)
{
    // It doesn't get much easier than this...
    CreateExcelFile.CreateExcelDocument(listOfEmployees, "Employees.xlsx", Response);
}
(I work for a financial company, and we'd be lost without this functionality in every one of our apps!)

Export data from SQL and write to text file (No BCP or SP can be used)

So I'm looking for an easy way to export data from a SQL Server 2000 database and write it to a comma-delimited text file. It's one table and only about 1,000 rows. I'm new to C#, so please excuse me if this is a stupid question.
This is a very easy task, but you need to learn the SqlClient namespace and the different objects you have at your disposal. Note, though, that for SQL Server 2000 and lower, asynchronous methods are not supported, so all calls will be blocking.
Mind you this is a very sketchy example, and I did not test this, but this would be one general approach.
string connectionString = "<yourconnectionstringhere>";
using (SqlConnection connection = new SqlConnection(connectionString)) {
    try {
        connection.Open();
    }
    catch (System.Data.SqlClient.SqlException ex) {
        // handle
        return;
    }
    string selectCommandText = "SELECT * FROM <yourtable>";
    using (SqlDataAdapter adapter = new SqlDataAdapter(selectCommandText, connection)) {
        using (DataTable table = new DataTable("<yourtable>")) {
            adapter.Fill(table);
            StringBuilder commaDelimitedText = new StringBuilder();
            commaDelimitedText.AppendLine("col1,col2,col3"); // optional if you want column names in the first row
            foreach (DataRow row in table.Rows) {
                string value = string.Format("{0},{1},{2}", row[0], row[1], row[2]); // how you format is up to you (spaces, tabs, delimiter, etc.)
                commaDelimitedText.AppendLine(value);
            }
            File.WriteAllText("<pathhere>", commaDelimitedText.ToString());
        }
    }
}
Some resources you will want to look into:
SqlConnection Class; http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection(v=VS.90).aspx
SqlDataAdapter Class; http://msdn.microsoft.com/en-us/library/ds404w5w(v=VS.90).aspx
SqlDataAdapter.Fill Method; http://msdn.microsoft.com/en-us/library/905keexk(v=VS.90).aspx
DataTable Class; http://msdn.microsoft.com/en-us/library/system.data.datatable.aspx
I'm also not sure what your requirements are, or why you have to do this task, but there are also probably quite a lot of tools out there that can already do this for you (if this is a one time thing), because this is not an uncommon task.
This is no different in C# than in any other language with the tools you need:
1) Query the DB and store the data in a collection.
2) Write the collection to a file in CSV format (a sketch follows below).
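A hedged sketch of those two steps using a SqlDataReader (the query, column count and paths are placeholders):
using System.Data.SqlClient;
using System.IO;
using System.Text;

string connectionString = "<yourconnectionstringhere>";
var sb = new StringBuilder();
using (var con = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT col1, col2, col3 FROM <yourtable>", con))
{
    con.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // Step 1: read a row; step 2: format it as one CSV line.
            sb.AppendLine(string.Format("{0},{1},{2}", reader[0], reader[1], reader[2]));
        }
    }
}
File.WriteAllText("<pathhere>", sb.ToString());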
Are you doing this in a Windows Forms application? If you have bound data to a control such as a DataGridView, this is pretty easy to do. You can just loop through the control and write the file that way. I like this because if you implemented filtering mechanisms in your application, whatever a user filtered can be written to the file. This is how I have done it before. You should be able to tweak this without much trouble if you are using some sort of collection instead.
private void exportCsv()
{
    SaveFileDialog saveFile = new SaveFileDialog();
    createSaveDialog(saveFile, ".csv", "CSV (*csv)|*.csv)");
    TextWriter writer = new StreamWriter(saveFile.FileName);
    int row = dataGridView1.Rows.Count;
    int col = dataGridView1.Columns.Count;
    try
    {
        if (saveFile.FileName != "")
        {
            // Header row
            for (int i = 0; i < dataGridView1.Columns.Count; i++)
            {
                writer.Write(dataGridView1.Columns[i].HeaderText + ",");
            }
            writer.Write("\r\n");
            // Data rows; "row - 1" skips the DataGridView's new-row placeholder
            for (int j = 0; j < row - 1; j++)
            {
                for (int k = 0; k < col; k++) // was "col - 1", which skipped the last column
                {
                    writer.Write(dataGridView1.Rows[j].Cells[k].Value.ToString() + ",");
                }
                writer.Write("\r\n");
            }
        }
        MessageBox.Show("File Successfully Created!", "File Saved", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
    }
    catch
    {
        MessageBox.Show("File could not be created.", "Save Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
    }
    finally
    {
        writer.Close();
        saveFile.Dispose();
    }
}
