I have looked at most of the topics on this forum regarding similar questions but haven't found exactly what I am looking for.
I am trying to write a pipeline component for BizTalk 2013 R2 using C# to simply convert an incoming Excel 2010 .xlsx file to its bare/base XML representation.
I do not want to run any templates against it, apply an XSLT transform, or anything like that. I just want to return the underlying XML representation of the spreadsheet as is.
It seems like this should be a very easy task but I can't figure out how to do it at all.
Everything I've found requires working with DataTables and looping through rows and cells (via OpenXML) to output a specific XML representation that is more human readable but that isn't what I want.
I want the actual Microsoft XML representation of that spreadsheet.
Any help would be greatly appreciated.
OK, figured it out without having to do any unzipping of the file.
If you use the SAX approach to loading the worksheet into an OpenXmlReader found here:
https://msdn.microsoft.com/en-us/library/office/gg575571(v=office.15).aspx
You can then use the reader to get the OuterXml like so:
using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(filepath, false))
{
    WorkbookPart wbPart = spreadSheetDocument.WorkbookPart;
    OpenXmlReader reader = OpenXmlReader.Create(wbPart);
    while (reader.Read())
    {
        if (reader.ElementType == typeof(Sheet))
        {
            Sheet sheet = (Sheet)reader.LoadCurrentElement();
            WorksheetPart wsPart = (WorksheetPart)(wbPart.GetPartById(sheet.Id));
            OpenXmlReader wsReader = OpenXmlReader.Create(wsPart);
            while (wsReader.Read())
            {
                if (wsReader.ElementType == typeof(Worksheet))
                {
                    Worksheet wsPartXml = (Worksheet)wsReader.LoadCurrentElement();
                    Console.WriteLine(wsPartXml.OuterXml + "\n");
                }
            }
        }
    }
    Console.ReadKey();
}
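In a pipeline component the spreadsheet usually arrives as a stream rather than a file path, and the XML needs to be returned rather than printed. The following is only a sketch of that variation: it assumes a seekable stream and a reference to the Open XML SDK (DocumentFormat.OpenXml), and the class and method names are made up for illustration. It reads the DOM property Worksheet.OuterXml directly instead of going through OpenXmlReader.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper, not part of the Open XML SDK.
public static class WorksheetXml
{
    // Returns the raw SpreadsheetML of every worksheet, keyed by sheet name.
    // Assumes the stream is seekable (copy it to a MemoryStream first if not).
    public static Dictionary<string, string> ReadAll(Stream xlsx)
    {
        var result = new Dictionary<string, string>();
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(xlsx, false))
        {
            WorkbookPart wbPart = doc.WorkbookPart;
            foreach (Sheet sheet in wbPart.Workbook.Descendants<Sheet>())
            {
                // Chart sheets and dialog sheets resolve to other part types; skip them.
                WorksheetPart wsPart = wbPart.GetPartById(sheet.Id.Value) as WorksheetPart;
                if (wsPart != null)
                {
                    result[sheet.Name.Value] = wsPart.Worksheet.OuterXml;
                }
            }
        }
        return result;
    }
}
```

Note that this loads each worksheet fully into memory, so for very large sheets the SAX-style reader above is probably the safer choice.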
I mostly write number-crunching programs using Visual Studio C# (2019) where I am simply taking input data, calculating results and displaying them. No complicated network or Internet programming. Think first- or second-level college programming course from the early 1990s.
For inputs I was reading in data from an excel file using the following directive:
using Excel = Microsoft.Office.Interop.Excel;
This proved to be very slow when executing the program. I then learned this way of accessing an Excel file is no longer supported and has been superseded by Open XML SDK. Please see the following link to the Microsoft Dev Center page:
https://learn.microsoft.com/en-us/office/open-xml/how-to-parse-and-read-a-large-spreadsheet
For what I want to do the Document Object Model(DOM) approach seems most appropriate for the thousands of individual excel cells I want to read as input data. However, the Microsoft Dev Center is certainly not the most user-friendly resource and the code example provided for reading an Excel file using this DOM approach is writing to a console which I'm not using. I never did get my code to work.
Long and short of it is, I got my code working using the GetCellValue Method:
https://learn.microsoft.com/en-us/office/open-xml/how-to-retrieve-the-values-of-cells-in-a-spreadsheet
However, this 'GetCellValue' method is still taking way too long. I need to read in thousands or tens of thousands of Excel input data cells in seconds or fractions of seconds not 20 seconds to a minute.
I think if I had an example of the DOM method reading in Excel data to an Array Variable (instead of writing to the console) it would help. Can anyone provide an example of such code?
Below I have included my code example where I modified the DOM approach code copied from the Microsoft Office Dev Center to write values from a source Excel File to a DataGrid instead of the Console used by the Dev Center code:
C#
// The DOM approach.
// Note that the code below works only for cells that contain numeric values.
//
public void ReadExcelFileDOM(string fileName)
{
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
        SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
        DataGridView_Vessel.Rows.Clear();
        DataGridView_Vessel.Refresh();
        int File_Row = 0;
        foreach (Row r in sheetData.Elements<Row>())
        {
            DataGridView_Vessel.Rows.Add();
            // Restart the column counter for each row; without this only the first row is filled.
            int File_Cell = 0;
            foreach (Cell c in r.Elements<Cell>())
            {
                // A cell without a value still occupies a column, so always advance the counter.
                if (c.CellValue != null && File_Cell < 12)
                {
                    DataGridView_Vessel.Rows[File_Row].Cells[File_Cell].Value = c.CellValue.Text;
                }
                File_Cell++;
            }
            File_Row++;
        }
    }
}
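For reference, here is a rough sketch of the same DOM loop filling an array instead of a grid. It is not taken from the Microsoft page: the class and method names are invented, and it assumes the Open XML SDK is referenced. As far as I can tell, the GetCellValue sample reopens the document on every call, which would explain much of the slowness; this version opens the file once and caches the shared string table.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper, not part of the Open XML SDK.
public static class SheetReader
{
    // Reads every stored cell of the first worksheet part into a jagged array of strings.
    public static string[][] ReadFirstSheet(string fileName)
    {
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fileName, false))
        {
            WorkbookPart wbPart = doc.WorkbookPart;

            // Text cells store an index into the shared string table; read the table once.
            SharedStringTablePart sstPart = wbPart.SharedStringTablePart;
            string[] shared = sstPart == null
                ? new string[0]
                : sstPart.SharedStringTable.Elements<SharedStringItem>()
                         .Select(item => item.InnerText).ToArray();

            SheetData sheetData = wbPart.WorksheetParts.First()
                                        .Worksheet.Elements<SheetData>().First();
            var rows = new List<string[]>();
            foreach (Row r in sheetData.Elements<Row>())
            {
                var cells = new List<string>();
                foreach (Cell c in r.Elements<Cell>())
                {
                    string text = c.CellValue == null ? "" : c.CellValue.Text;
                    if (c.DataType != null && c.DataType.Value == CellValues.SharedString)
                    {
                        text = shared[int.Parse(text)];
                    }
                    cells.Add(text);
                }
                rows.Add(cells.ToArray());
            }
            return rows.ToArray();
        }
    }
}
```

One caveat: completely empty cells are usually not stored in the file at all, so the position in the inner array is not always the column number. If that matters, the column can be derived from c.CellReference. Numeric input would then be parsed with double.Parse(text, CultureInfo.InvariantCulture).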
I am trying to automate a Powerpoint presentation. I am using OpenXML to navigate the powerpoint presentation up to the point that I find the Excel linked to a chart. Now I want to use EPPlus to load a datatable into one of the worksheets (because EPPlus has a simple LoadFromDataTable function whereas I think I would have to write lots of code to use OpenXML).
So my problem is this.
I have a PresentationDocument in memory. And I have navigated to the particular chart that I want to manipulate via:
doc.PresentationPart.SlideParts.ElementAt(0).ChartParts.ElementAt(0)
I get the Excel part using:
var stream = chartpart.EmbeddedPackagePart.GetStream()
Then I tried:
using (var pck = new ExcelPackage(stream))
{
    // do stuff
    pck.Save();
}
and then at the end I do a doc.PresentationPart.Presentation.Save but this hasn't changed the Presentation. I can change it using OpenXML instead with:
using (var xl = SpreadsheetDocument.Open(stream, true))
{
    // do stuff
    xl.Close();
}
With everything else the same. So I guess either xl.Close() is doing stuff that pck.Save() isn't or I am using the stream incorrectly - can anyone advise?
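One possible explanation, offered as a guess rather than a confirmed diagnosis: ExcelPackage(stream) appears to copy the input into its own internal buffer, so pck.Save() never writes back to the part's stream. A sketch of a workaround under that assumption is to save to a separate MemoryStream and push it into the part with FeedData. The class and method names below are hypothetical, and the first worksheet is assumed to be the one holding the chart data.

```csharp
using System.Data;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using OfficeOpenXml;

// Hypothetical helper combining the Open XML SDK and EPPlus.
public static class ChartWorkbookEditor
{
    public static void Update(ChartPart chartPart, DataTable table)
    {
        // Note: EPPlus 5 and later require a license context to be set before use.
        EmbeddedPackagePart embedded = chartPart.EmbeddedPackagePart;
        using (var input = new MemoryStream())
        {
            // Copy the embedded workbook out of the presentation.
            using (Stream partStream = embedded.GetStream())
            {
                partStream.CopyTo(input);
            }
            input.Position = 0;

            using (var pck = new ExcelPackage(input))
            using (var output = new MemoryStream())
            {
                // Assumption: the chart data lives in the first worksheet.
                ExcelWorksheet ws = pck.Workbook.Worksheets.First();
                ws.Cells["A1"].LoadFromDataTable(table, true);

                // Save to a separate stream, then replace the part's content with it.
                pck.SaveAs(output);
                output.Position = 0;
                embedded.FeedData(output);
            }
        }
    }
}
```

The chart's cached values in the chart XML are separate from the embedded workbook, so they may still need updating through the Open XML SDK for the slide to show the new numbers before the data is edited in PowerPoint.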
I need some way to read worksheets from an Excel file, select what is important, and put all the data in a database.
It's a one-time job. It's an old Excel file with information that has to be moved to a database, which will be used by an application that I have developed.
I saw many examples, but I don't understand well how they work or which way is best for this type of work.
My idea is to develop a small C# application that does this.
LINQ may work and would be preferable for a relatively simple table.
A traditional approach would be to use COM Interop. Here's a Microsoft page about how to work with Excel using COM interop (it is for spreadsheet creation, but the principles are the same; just use different API methods to open and read data):
http://msdn.microsoft.com/en-us/library/ms173186(v=vs.80).aspx
I strongly suggest using a linq provider to connect to excel. This should make it very easy to query for the information you are looking for. Once you have it, inserting into the database should be easy.
http://code.google.com/p/linqtoexcel/
I've used this before and it works well.
http://exceldatareader.codeplex.com/
had an example handy...
using (FileStream fileStream = File.Open(inputFilenames[0], FileMode.Open, FileAccess.Read))
{
    IExcelDataReader excelReader;
    if (Path.GetExtension(inputFilenames[0]) == ".xls")
        excelReader = Factory.CreateReader(fileStream, ExcelFileType.Binary);
    else
        excelReader = Factory.CreateReader(fileStream, ExcelFileType.OpenXml);
    excelReader.NextResult();
    while (excelReader.Name != this.Worksheet)
        excelReader.NextResult();
    while (excelReader.Read())
    {
        if (FirstRowHasColumnNames)
        {
            FirstRowHasColumnNames = false;
        }
        else
        {
            //do stuff
            var test = GetColumnData(excelReader, 1);
        }
    }
    this.Save(outputFilename);
}
Reading data from an Excel worksheet in C#? I found an exact solution here:
http://microsoftdotnetsolutions.blogspot.in/2012/12/get-excel-sheet-data.html
In .NET C# I'm trying to open an Excel template, add some data and save it as a new document. I'm trying to use the OpenXML document format. I can't seem to find any guidance on how to do this. Seems like all the documentation talks about how to write various parts to the Package but I can't find anything on what to do when you're done and want to save it.
Anyone know where I can find this information? I must be thinking about this incorrectly because I'm not finding anything useful on what seems to be very basic.
Thanks
ExcelPackage works pretty well for that. I don't think the primary author has worked on it for a little while, but it has a good following of people on its forum who work out any issues.
// strFileName is the path of the new workbook; dt and WorkSheetName are assumed to be defined elsewhere.
FileInfo newFile = new FileInfo(strFileName);
FileInfo template = new FileInfo(Path.Combine(Path.GetDirectoryName(Application.ExecutablePath), "Template.xlsx"));
// Assumption: the data starts at row 2, below a header row in the template.
int startRow = 2;
int r = startRow;
using (ExcelPackage xlPackage = new ExcelPackage(newFile, template))
{
    //Enable DEBUG mode to create the xl folder (equivalent to expanding a xlsx.zip file)
    //xlPackage.DebugMode = true;
    ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets["Sheet1"];
    worksheet.Name = WorkSheetName;
    foreach (DataRow row in dt.Rows)
    {
        int c = 1;
        if (r > startRow) worksheet.InsertRow(r);
        // our query has the columns in the right order, so simply
        // iterate through the columns
        foreach (DataColumn col in dt.Columns)
        {
            // skip database NULLs; ToString() itself never returns null here
            if (row[col] != DBNull.Value)
            {
                worksheet.Cell(r, c).Value = row[col].ToString();
                worksheet.Column(c).Width = 10;
            }
            c++;
        }
        r++;
    }
    // keep the sheet in normal view rather than page layout view
    worksheet.View.PageLayoutView = false;
    // save our new workbook and we are done! (the using block disposes the package)
    xlPackage.Save();
}
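If you would rather stay with the Open XML SDK itself, as the question asks, one common pattern is to copy the template file, open the copy for editing, and save. The sketch below is only an illustration under a few assumptions: the template is a regular .xlsx (not .xltx), the first worksheet part is the target, and only numeric cells are written. The class and method names are invented.

```csharp
using System.Globalization;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper, not part of the Open XML SDK.
public static class TemplateFiller
{
    // Copies the template to outputPath and appends one row of numbers to the first worksheet.
    public static void AppendNumbers(string templatePath, string outputPath, double[] values)
    {
        // "Save as new document" is simply: copy first, then edit the copy in place.
        File.Copy(templatePath, outputPath, true);

        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(outputPath, true))
        {
            WorksheetPart wsPart = doc.WorkbookPart.WorksheetParts.First();
            SheetData sheetData = wsPart.Worksheet.GetFirstChild<SheetData>();

            // Place the new row after the last row already present in the template.
            uint nextRow = sheetData.Elements<Row>()
                .Select(r => r.RowIndex == null ? 0u : r.RowIndex.Value)
                .DefaultIfEmpty(0u)
                .Max() + 1;

            var row = new Row { RowIndex = nextRow };
            foreach (double v in values)
            {
                row.Append(new Cell
                {
                    DataType = CellValues.Number,
                    CellValue = new CellValue(v.ToString(CultureInfo.InvariantCulture))
                });
            }
            sheetData.Append(row);

            // Writes the modified worksheet part; disposing the document closes the package.
            wsPart.Worksheet.Save();
        }
    }
}
```

A call would look like TemplateFiller.AppendNumbers("Template.xlsx", "Report.xlsx", new[] { 1.5, 2.0 }); with both file names being placeholders.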
Accessing Open XML / SpreadsheetML documents is far from a trivial exercise. The specification is large and complex. The "Open XML SDK" (google it) definitely helps, but still requires some knowledge of the Open XML standard to get much done.
SpreadsheetGear for .NET has an API similar to Excel and can read and write Excel Open XML (xlsx) documents as well as Excel 97-2003 (xls) documents.
You can see some SpreadsheetGear samples here and download a free trial here.
Disclaimer: I own SpreadsheetGear LLC
I need a way to read an Excel file from a stream. It doesn't seem to work with the ADO.NET way of doing things.
The scenario is that a user uploads a file through a FileUpload and I need to read some values from the file and import them to a database.
For several reasons I can't save the file to disk, and there is no reason to do so either.
So, does anyone know of a way to read an Excel file from a FileUpload stream?
It seems I found a solution to the problem myself.
http://www.codeplex.com/ExcelDataReader
This library seems to work nicely and it takes a stream to read the excel file.
ExcelDataReader reader = new ExcelDataReader(ExcelFileUpload.PostedFile.InputStream);
This can be done easily with EPPlus.
//the excel sheet as byte array (as example from a FileUpload Control)
byte[] bin = FileUpload1.FileBytes;
//load the byte array into the memorystream
using (MemoryStream ms = new MemoryStream(bin))
using (ExcelPackage package = new ExcelPackage(ms))
{
    //get the first sheet from the excel file
    ExcelWorksheet sheet = package.Workbook.Worksheets[1];
    //Dimension is null when the sheet contains no cells
    if (sheet.Dimension != null)
    {
        //loop all rows in the sheet
        for (int i = sheet.Dimension.Start.Row; i <= sheet.Dimension.End.Row; i++)
        {
            //loop all columns in a row
            for (int j = sheet.Dimension.Start.Column; j <= sheet.Dimension.End.Column; j++)
            {
                //do something with the current cell value (Value is null for empty cells)
                object value = sheet.Cells[i, j].Value;
                string currentCellValue = value == null ? "" : value.ToString();
            }
        }
    }
}
SpreadsheetGear can do it:
SpreadsheetGear.IWorkbook workbook = SpreadsheetGear.Factory.GetWorkbookSet().Workbooks.OpenFromStream(stream);
You can try it for yourself with the free evaluation.
Disclaimer: I own SpreadsheetGear LLC
Infragistics has an excel component that can read an excel file from a stream.
I'm using it in a project here and it works well.
Also, the open source myXls component could easily be modified to support this. The XlsDocument constructor only supports loading from a file given by a file name, but it works by creating a FileStream and then reading the Stream, so changing it to support loading from streams should be trivial.
Edit:
I see that you found a solution but I just wanted to note that I updated the source code for the component so that it now can read an excel file directly from a stream. :-)
I use the ClosedXML NuGet package to read Excel content from a stream. The XLWorkbook class has a constructor overload that takes a stream pointing to an Excel file (aka workbook).
Import the namespace at the top of your code file:
using ClosedXML.Excel;
Source code:
var stream = /*obtain the stream from your source*/;
if (stream.Length != 0)
{
    //handle the stream here
    using (XLWorkbook excelWorkbook = new XLWorkbook(stream))
    {
        var name = excelWorkbook.Worksheet(1).Name;
        //do whatever else you like, as you now have a handle to the entire workbook.
        var firstRow = excelWorkbook.Worksheet(1).Row(1);
    }
}