I need a way to read an Excel file from a stream. It doesn't seem to work with the ADO.NET way of doing things.
The scenario is that a user uploads a file through a FileUpload control and I need to read some values from the file and import them into a database.
For several reasons I can't save the file to disk, and there is no reason to do so either.
So, does anyone know of a way to read an Excel file from a FileUpload stream?
It seems I found a solution to the problem myself.
http://www.codeplex.com/ExcelDataReader
This library seems to work nicely and it takes a stream to read the excel file.
ExcelDataReader reader = new ExcelDataReader(ExcelFileUpload.PostedFile.InputStream);
This can be done easily with EPPlus.
// the Excel sheet as a byte array (for example from a FileUpload control)
byte[] bin = FileUpload1.FileBytes;
// load the byte array into a MemoryStream
using (MemoryStream ms = new MemoryStream(bin))
using (ExcelPackage package = new ExcelPackage(ms))
{
    // get the first sheet from the Excel file
    ExcelWorksheet sheet = package.Workbook.Worksheets[1];
    // loop over all rows in the sheet
    for (int i = sheet.Dimension.Start.Row; i <= sheet.Dimension.End.Row; i++)
    {
        // loop over all columns in the row
        for (int j = sheet.Dimension.Start.Column; j <= sheet.Dimension.End.Column; j++)
        {
            // do something with the current cell value;
            // Value is null for empty cells, so avoid calling ToString() on it directly
            string currentCellValue = sheet.Cells[i, j].Value?.ToString();
        }
    }
}
SpreadsheetGear can do it:
SpreadsheetGear.IWorkbook workbook = SpreadsheetGear.Factory.GetWorkbookSet().Workbooks.OpenFromStream(stream);
You can try it for yourself with the free evaluation.
Disclaimer: I own SpreadsheetGear LLC
Infragistics has an excel component that can read an excel file from a stream.
I'm using it in a project here and it works well.
Also, the open source myXls component could easily be modified to support this. The XlsDocument constructor only supports loading from a file given by a file name, but it works by creating a FileStream and then reading that stream, so changing it to support loading from streams should be trivial.
Edit:
I see that you found a solution but I just wanted to note that I updated the source code for the component so that it now can read an excel file directly from a stream. :-)
I use the ClosedXML NuGet package to read Excel content from a stream. The XLWorkbook class has a constructor overload that takes a stream pointing to an Excel file (aka workbook).
Import the namespace at the top of your code file:
using ClosedXML.Excel;
Source code:
var stream = /*obtain the stream from your source*/;
if (stream.Length != 0)
{
    //handle the stream here
    using (XLWorkbook excelWorkbook = new XLWorkbook(stream))
    {
        var name = excelWorkbook.Worksheet(1).Name;
        //do more things whatever you like as you now have a handle to the entire workbook.
        var firstRow = excelWorkbook.Worksheet(1).Row(1);
    }
}
I have a project where my goal is to produce an .xlsm Excel spreadsheet using .NET and the EPPlus 5.8.14 Excel spreadsheet library. I can do this using EPPlus's documented techniques (though some of these I cannot get to work). As I was working on this, I realized that what my code needed to do was relatively small, and it made sense to use an existing .xlsm file as a template and just change what I needed to change using EPPlus.
So now I am including the .xlsm file as a resource compiled into the assembly. This works great, and I can read the file from the resources and produce it from my controller. But once read, this data inside EPPlus seems to be read-only. So while this produces an Excel file:
public ActionResult ExcelFile(){
    const string ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    Byte[] bytes = Properties.Resources.AssetsEntry;
    string fstem = Path.GetRandomFileName();
    int unique = 0;
    string filePath = String.Format("{0}#AutoGen_{1}_{2}.{3}", Path.GetTempPath(), fstem, ++unique, "xlsm");
    var outStream = System.IO.File.OpenWrite(filePath);
    var writer = new BinaryWriter(outStream);
    writer.Write(bytes);
    outStream.Close();
    ExcelPackage excelPackage = new ExcelPackage(filePath);
    var sheet = excelPackage.Workbook.Worksheets[1];
    //place where I might want to change data
    //sheet.Cells["B3"].Value = "testing";
    var excelData = excelPackage.GetAsByteArray();
    var fileName = "ExcelFile.xlsm";
    return File(excelData, ContentType, fileName);
}
If I uncomment the second commented-out line, that code fails to change the resulting Excel spreadsheet (though there is no error). How do I go about reading in an Excel spreadsheet and making changes using EPPlus?
UPDATE: I can add new worksheets to an uploaded spreadsheet, and I can alter those added sheets. But I cannot alter data on uploaded worksheets. Fortunately, for this particular project, that is acceptable. But it would be frustrating if I wanted to be able to set up a worksheet in Excel and then populate it programmatically.
When I use C# to read an Excel file that I also have open in Excel, I get the error: The process cannot access the file 'xxxx' because it is being used by another process.
Is there any way to read the file while it is open? I don't want to have to close it in Excel every time.
I'm not sure how you're trying to read the xlsx file, so it might not be possible using the library or tool that you are currently using.
It is possible to open the file stream for reading while Excel has the file open. However, you should be aware that it is also inherently dangerous. The process reading the excel file expects to be reading a consistent view of the file. If Excel decides to write to that same file while it is being read, then it is almost assured that the reading process will fail in some catastrophic way. Since an .xlsx file is just a zip file, the most likely result will be a failure in accessing or decompressing one of the .xlsx entries.
Here is an example of how you can do this using a library I maintain: Sylvan.Data.Excel.
var file = "myfile.xlsx";
// DANGEROUS: open it for reading, but allow other processes to write to it at the same time.
var stream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
// gets the workbook type from the filename extension
var type = ExcelDataReader.GetWorkbookType(file);
// create the data reader
var reader = ExcelDataReader.Create(stream, type);
// loop over rows
while (reader.Read())
{
    // write out the data in the row
    Console.WriteLine("Row: " + reader.RowNumber);
    for (int i = 0; i < reader.RowFieldCount; i++)
    {
        Console.WriteLine(reader.GetString(i));
    }
    Console.WriteLine();
}
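One way to narrow the window in which Excel can change the file under the reader is to copy the file into memory in a single pass and parse the copy instead. The following is a minimal sketch using only the .NET base class library; `SharedFileSnapshot` and `ReadSnapshot` are hypothetical names, and it assumes the workbook is small enough to hold in memory. It does not remove the risk: if Excel writes during the copy, the snapshot can still be inconsistent.

```csharp
using System.IO;

// Hypothetical helper, not part of any library.
public static class SharedFileSnapshot
{
    // Copies the whole file into memory while still allowing other
    // processes (for example Excel) to keep it open for read and write.
    public static MemoryStream ReadSnapshot(string path)
    {
        var snapshot = new MemoryStream();
        using (var source = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            source.CopyTo(snapshot);
        }
        snapshot.Position = 0; // rewind so a reader starts at the beginning
        return snapshot;
    }
}
```

The returned stream can then be passed to the reader in place of the FileStream, for example `ExcelDataReader.Create(SharedFileSnapshot.ReadSnapshot(file), type)`.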
I want to generate a big xlsx file, but I don't want to keep it in the server's memory and get an OutOfMemoryException.
So I read the data from the database page by page, generate rows with OpenXmlWriter, and send it to the client part by part:
// I use `MemoryStream` as the buffer
OpenXmlWriter = OpenXmlWriter.Create(OutputStream);
OpenXmlWriter.WriteStartElement(new Worksheet());
OpenXmlWriter.WriteStartElement(new SheetData());
foreach(var row in rows)
{
    //...write cells with OpenXmlWriter and then
    OutputStream.Position = 0;
    var buffer = new byte[OutputStream.Length];
    OutputStream.Read(buffer, 0, (int)OutputStream.Length);
    FlushCalback(buffer);
    OutputStream.SetLength(0);
    System.Web.HttpContext.Current.Response.BinaryWrite(dataBuffer);
}
But after downloading I found that it generates only the XML markup, while an xlsx file is actually a "package". I can't find any example of how to do that. Is there a solution, or another library?
UPDATE:
SpreadsheetDocument could help, but it writes ALL the data to the stream when Save() is called, and it rewrites everything on each call.
SpreadsheetDocument = SpreadsheetDocument.Create(OutputStream, SpreadsheetDocumentType.Workbook);
//... generate rows
SpreadsheetDocument.Save();
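Since the "package" is an ordinary zip archive with a few fixed XML parts, one possible approach is to write the zip container directly with the base class library's `ZipArchive` in `Create` mode, which writes forward-only and so can target a response stream. The following is a minimal sketch under stated assumptions, not a replacement for the Open XML SDK: `StreamingXlsx`, `Write` and `AddPart` are hypothetical names, every cell is written as an inline string, there is a single sheet and no styles, and the optional cell reference attributes are omitted. I have not verified the output against every Excel version.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Security;

// Hypothetical helper: streams a single-sheet workbook without buffering all rows.
public static class StreamingXlsx
{
    const string Main = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    const string DocRel = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    const string PkgRel = "http://schemas.openxmlformats.org/package/2006/relationships";
    const string SheetMl = "application/vnd.openxmlformats-officedocument.spreadsheetml";

    public static void Write(Stream output, IEnumerable<string[]> rows)
    {
        // Create mode writes entries sequentially and never seeks backwards.
        using (var zip = new ZipArchive(output, ZipArchiveMode.Create, true))
        {
            AddPart(zip, "[Content_Types].xml",
                "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">" +
                "<Default Extension=\"rels\" ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>" +
                "<Default Extension=\"xml\" ContentType=\"application/xml\"/>" +
                "<Override PartName=\"/xl/workbook.xml\" ContentType=\"" + SheetMl + ".sheet.main+xml\"/>" +
                "<Override PartName=\"/xl/worksheets/sheet1.xml\" ContentType=\"" + SheetMl + ".worksheet+xml\"/>" +
                "</Types>");
            AddPart(zip, "_rels/.rels",
                "<Relationships xmlns=\"" + PkgRel + "\">" +
                "<Relationship Id=\"rId1\" Type=\"" + DocRel + "/officeDocument\" Target=\"xl/workbook.xml\"/>" +
                "</Relationships>");
            AddPart(zip, "xl/workbook.xml",
                "<workbook xmlns=\"" + Main + "\" xmlns:r=\"" + DocRel + "\">" +
                "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets></workbook>");
            AddPart(zip, "xl/_rels/workbook.xml.rels",
                "<Relationships xmlns=\"" + PkgRel + "\">" +
                "<Relationship Id=\"rId1\" Type=\"" + DocRel + "/worksheet\" Target=\"worksheets/sheet1.xml\"/>" +
                "</Relationships>");

            // The worksheet part is written row by row as the rows are enumerated.
            var entry = zip.CreateEntry("xl/worksheets/sheet1.xml");
            using (var w = new StreamWriter(entry.Open()))
            {
                w.Write("<worksheet xmlns=\"" + Main + "\"><sheetData>");
                foreach (var row in rows)
                {
                    w.Write("<row>");
                    foreach (var cell in row)
                        w.Write("<c t=\"inlineStr\"><is><t>" + SecurityElement.Escape(cell) + "</t></is></c>");
                    w.Write("</row>");
                }
                w.Write("</sheetData></worksheet>");
            }
        }
    }

    // Writes one small fixed part; only one entry may be open at a time in Create mode.
    static void AddPart(ZipArchive zip, string name, string xml)
    {
        using (var w = new StreamWriter(zip.CreateEntry(name).Open()))
            w.Write(xml);
    }
}
```

In an ASP.NET handler this could be called as `StreamingXlsx.Write(Response.OutputStream, rows)` with `Response.BufferOutput = false`, where `rows` is a lazy enumerable that pages through the database (both are assumptions about the surrounding code). The worksheet is written last so the small fixed parts are already on the wire before the large one starts.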
So I am able to open and dig through an xls (Excel 97-2003) file. However, the problem is when I try to save it. After I save it, and it succeeds, I then go and manually open the Excel file and get an error saying that it cannot open the Excel file and that it is corrupt. This happens regardless of whether I make any changes or not.
I am still able to, within the program, open the excel file and read through the data.
I am using NPOI 2.2.1 and 2.3.0 (which I installed via Nuget). Both versions have the same results.
string excelLocation = settings.GetExcelDirectory() + week.ExcelLocation;
HSSFWorkbook wbXLS;
// Try to open and read existing workbook
using (FileStream stream = new FileStream(excelLocation, FileMode.Open, FileAccess.Read))
{
    wbXLS = new HSSFWorkbook(stream);
}
ISheet sheet = wbXLS.GetSheet("Schedule");
using (FileStream stream = new FileStream(excelLocation, FileMode.Create, FileAccess.Write))
{
    wbXLS.Write(stream);
}
Do you have multi-line cell contents (with a line break)?
I just came across such a problem with my column headers in row 0.
Excel encodes a line break by replacing it with _x000a_ (i.e. line1_x000a_line2), NPOI doesn't do that.
Tried with nuget 2.3.0 and xlsx file, but might also be helpful in your case.
I worked around (and verified the cause) by replacing line feeds before saving:
// Encode LineFeeds in column headers (row 0)
IRow rowColHeaders = sheet.GetRow(0);
foreach (ICell cell in rowColHeaders.Cells)
{
    string content = cell.StringCellValue;
    if (content.Contains("\n"))
        cell.SetCellValue(content.Replace("\n", "_x000a_"));
}
Try something like:
FileStream sw = File.Create(excelLocation);
wbXLS.Write(sw);
sw.Close();
I want to read an Excel file, but the way I am doing it is too slow. What pattern should I use to read an Excel file faster? Should I try CSV?
I am using the following code:
ApplicationClass excelApp = new ApplicationClass();
Workbook myWorkBook = excelApp.Workbooks.Open(@"C:\Users\OWNER\Desktop\Employees.xlsx");
Worksheet mySheet = (Worksheet)myWorkBook.Sheets["Sheet1"];
for (int row = 1; row <= mySheet.UsedRange.Rows.Count; row++)
{
    for (int col = 1; col <= mySheet.UsedRange.Columns.Count; col++)
    {
        Range dataRange = (Range)mySheet.Cells[row, col];
        Console.Write(String.Format(dataRange.Value2.ToString() + " "));
    }
    Console.WriteLine();
}
excelApp.Quit();
Your program is slow because you are using Excel itself to open the file. Every operation on the workbook goes through COM interop, which is extremely slow because each call has to be marshaled across the boundary between your process and the Excel process.
Microsoft does not recommend or support automating Excel from server-side or unattended code. The Open XML SDK reads .xlsx files directly, without Excel installed.
I suggest you use a wrapper library around OpenXML, since the raw API is pretty hairy. This SO question shows how to use it correctly:
open xml reading from excel file
You're accessing the Excel file through Excel interop. Reading cell by cell makes a cross-process COM call for every cell, which performs poorly.
You can read data in ranges, not cell by cell. This loads the data into memory and you could iterate it much faster. (Eg. try to load column by column.)
BTW: You could use some library instead like http://epplus.codeplex.com which reads excel files directly.
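To illustrate the range-based suggestion: with interop, reading `Value2` on a multi-cell range returns the whole block as a two-dimensional `object[,]` in one call, and that array is 1-based rather than 0-based. The following is a minimal sketch of walking such an array; `RangeValues` and `ToRows` are hypothetical names, and the interop call itself appears only in a comment because it needs Excel installed. Note that a range containing a single cell returns a scalar rather than an array, so the cast in the comment would fail in that case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the interop assemblies.
public static class RangeValues
{
    // Typical call site (assumes the sheet's used range has more than one cell):
    //   object[,] values = (object[,])mySheet.UsedRange.Value2;
    //   List<string[]> rows = RangeValues.ToRows(values);
    public static List<string[]> ToRows(object[,] values)
    {
        // Use the array's own bounds instead of assuming 0-based or 1-based indexing.
        int rowLo = values.GetLowerBound(0), rowHi = values.GetUpperBound(0);
        int colLo = values.GetLowerBound(1), colHi = values.GetUpperBound(1);

        var rows = new List<string[]>();
        for (int r = rowLo; r <= rowHi; r++)
        {
            var row = new string[colHi - colLo + 1];
            for (int c = colLo; c <= colHi; c++)
            {
                // Empty cells are null; Convert.ToString maps null to an empty string.
                row[c - colLo] = Convert.ToString(values[r, c]);
            }
            rows.Add(row);
        }
        return rows;
    }
}
```

After the single `Value2` call, all iteration happens on an in-process array, so the per-cell COM round trip from the original loop is avoided.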
Excel Data Reader
Lightweight and very fast if reading is your only concern.