Saving HTML from MemoryStream into an Excel file - C#

I have XSLT-transformed HTML data in a MemoryStream (in C#). I am trying to convert it to an Excel format before emailing it, preferably entirely in memory, without saving to the local disk. I can worry about the email attachment part later. Can anyone point me to a sample showing how to convert HTML to an Excel format, either through OpenXML or with Office.Interop.Excel?
The HTML data is well formed, and I can do the conversion manually by opening the HTML in Excel and using Save As to save it in xlsx format (Office 2010), no problem. I also tried simply changing the .html extension to .xlsx, but then Excel complains when opening it.
What's the best way to automate the manual Save As action so that I can use the same HTML data in Excel format? I understand that I could create a separate .xslt to convert my XML directly into Excel format, but that would be too many .xslt files to maintain. I'm trying to find a hack that lets Excel do the work for me.
Thank you in advance for any and all pointers!
EDIT:
I figured I have no choice but to store the HTML on disk, read it back, and use Excel Interop to call the SaveAs method. When I tried that, though, I got an exception with HRESULT: 0x800A03EC on the SaveAs method. Here's how to reproduce it.
Steps to reproduce the behavior:
Save this text
<html><head></head><body><center><h1>Test Header</h1></center></body></html>
as C:\Test.html
After adding a reference to Excel interop like this,
using Excel = Microsoft.Office.Interop.Excel;
try this code:
`
var app = new Excel.Application();
Excel.Workbook wb = null;
try
{
wb = app.Workbooks.Open(@"c:\test.html");
wb.SaveAs(@"c:\test.xlsx", Excel.XlFileFormat.xlOpenDocumentSpreadsheet);
//wb.SaveCopyAs(@"c:\test.xlsx");
wb.Close();
}
catch (Exception ex)
{
//_logger.Error(ex);
}
finally
{
app.Quit();
}
`
I always get the mentioned exception on SaveAs no matter which file format I choose, or even when I don't specify a file format at all.
Any ideas?

This code works. It turns out the exception I was getting was related only to the file format I was saving in (xlOpenDocumentSpreadsheet is the OpenDocument .ods format, not xlsx). When I changed it to xlOpenXMLWorkbook, it saved fine.
using Excel = Microsoft.Office.Interop.Excel;
.
.
.
var app = new Excel.Application();
Excel.Workbook wb = null;
try
{
wb = app.Workbooks.Open(@"c:\test.html");
wb.SaveAs(@"c:\test.xlsx", Excel.XlFileFormat.xlOpenXMLWorkbook);
//wb.SaveCopyAs(@"c:\test.xlsx");
wb.Close();
}
catch (Exception ex)
{
//_logger.Error(ex);
}
finally
{
app.Quit();
}

Here's the updated code that takes the HTML as a byte[] and returns the XLSX as a byte[]:
public static byte[] DoConvertXlDataToOpenXml(byte[] data, FileInfo fileInfo)
{
ExcelInterop.Application excelApp = null;
ExcelInterop.Workbooks workBooks = null;
ExcelInterop.Workbook workBook = null;
FileInfo tempFile = null;
FileInfo convertedTempFile = null;
try
{
//Stream the file to temporary location, overwrite if exists
tempFile = new FileInfo(Path.ChangeExtension(Path.Combine(Path.GetTempFileName()), fileInfo.Extension));
using (var destStream = new FileStream(tempFile.FullName, FileMode.Create, FileAccess.Write))
{
destStream.Write(data, 0, data.Length);
}
//open original
excelApp = new ExcelInterop.Application();
excelApp.Visible = false;
excelApp.DisplayAlerts = false;
workBooks = excelApp.Workbooks;
workBook = workBooks.Open(tempFile.FullName);
convertedTempFile = new FileInfo(Path.ChangeExtension(Path.GetTempFileName(), "XLSX"));
//Save as XLSX
// Use the workbook reference we already hold instead of fetching ActiveWorkbook again
workBook.SaveAs(
convertedTempFile.FullName
, ExcelInterop.XlFileFormat.xlOpenXMLWorkbook
, ConflictResolution: ExcelInterop.XlSaveConflictResolution.xlLocalSessionChanges);
workBook.Close();
return File.ReadAllBytes(convertedTempFile.FullName);
}
finally
{
// Release COM references from the innermost object outwards
if (workBook != null)
Marshal.ReleaseComObject(workBook);
if (workBooks != null)
Marshal.ReleaseComObject(workBooks);
if (excelApp != null)
{
// Quit before releasing; without Quit the Excel process can keep running
excelApp.Quit();
Marshal.ReleaseComObject(excelApp);
}
if (tempFile != null && tempFile.Exists)
tempFile.Delete();
if (convertedTempFile != null && convertedTempFile.Exists)
{
convertedTempFile.Delete();
}
}
}
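One detail about temporary paths is worth knowing: Path.GetTempFileName() creates a zero-byte file on disk as a side effect, so deriving a differently named path from its result (for example with Path.ChangeExtension) leaves that empty file behind. A minimal sketch of a helper that builds a unique temp path without creating anything is shown below; the class and method names are hypothetical and not part of the original answer.

```csharp
using System;
using System.IO;

public static class TempPaths
{
    // Hypothetical helper: returns a unique path inside the user's temp
    // directory with the requested extension. No file is created on disk,
    // so there is nothing extra to clean up afterwards.
    public static string NewTempPath(string extension)
    {
        // "N" formats the GUID as 32 hex digits with no dots or dashes,
        // so ChangeExtension appends the extension instead of replacing one.
        string name = Guid.NewGuid().ToString("N");
        return Path.ChangeExtension(Path.Combine(Path.GetTempPath(), name), extension);
    }
}
```

Usage would look like `var tempFile = new FileInfo(TempPaths.NewTempPath(fileInfo.Extension));`, and the extension may be given with or without the leading dot.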

Related

How to remove read-only in a new copy using Open XML SDK

I have come across two spreadsheets that give me errors when I use the Open XML SDK to convert them.
The cases are:
Read-only password protection (I don't have the password)
File sharing enabled (another user on the network has the spreadsheet open, and it is read-only until that user closes it)
With Excel Interop, I can pass parameters that open a copy of the spreadsheet with write permissions, so any programmatic conversion can continue. This code enables that behaviour by using IgnoreReadOnlyRecommended:
// Convert legacy Excel files to .xlsx Transitional using Microsoft Office Interop Excel
public bool Convert_Legacy_ExcelInterop(string input_filepath, string output_filepath)
{
bool convert_success = false;
// Open Excel
Excel.Application app = new Excel.Application(); // Create Excel object instance
app.DisplayAlerts = false; // Don't display any Excel prompts
Excel.Workbook wb = app.Workbooks.Open(input_filepath, ReadOnly: false, Password: "'", WriteResPassword: "'", IgnoreReadOnlyRecommended: true, Notify: false); // Create workbook instance
// Save workbook as .xlsx Transitional (51 = XlFileFormat.xlOpenXMLWorkbook) and close Excel
wb.SaveAs(output_filepath, 51);
wb.Close();
app.Quit();
return convert_success = true;
}
How can I imitate the same behaviour using Open XML SDK?
Here's my code:
// Convert to .xlsx Transitional
public bool Convert_to_OOXML_Transitional(string input_filepath, string output_filepath)
{
bool convert_success = false;
// If write-protected or reserved by another user
using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(input_filepath, false))
{
if (spreadsheet.WorkbookPart.Workbook.WorkbookProtection != null || spreadsheet.WorkbookPart.Workbook.FileSharing != null)
{
// Use Excel Interop to convert the spreadsheet
Convert_Legacy_ExcelInterop(input_filepath, output_filepath);
return convert_success = true;
// REPLACE ABOVE CODE WITH SOMETHING NATIVE TO OPEN XML SDK
}
}
// Convert spreadsheet
byte[] byteArray = File.ReadAllBytes(input_filepath);
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(stream, true))
{
spreadsheet.ChangeDocumentType(SpreadsheetDocumentType.Workbook);
}
File.WriteAllBytes(output_filepath, stream.ToArray());
}
// Repair spreadsheet
Repair rep = new Repair();
rep.Repair_OOXML(output_filepath);
// Return success
convert_success = true;
return convert_success;
}
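One possible direction, offered as a sketch rather than a confirmed answer: the two conditions tested in the code above correspond to the fileSharing and workbookProtection elements in the workbook XML, and the Open XML SDK can remove those elements from an in-memory copy. This assumes the DocumentFormat.OpenXml NuGet package; the class and method names are hypothetical. It only affects markers stored inside the workbook part, so it would not help with a file that is encrypted with an open password or locked at the file-system level by another process.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class ReadOnlyMarkers
{
    // Hypothetical helper: takes the bytes of an .xlsx file and returns a copy
    // with the read-only recommendation (fileSharing) and workbook protection
    // elements removed. The input array itself is not modified.
    public static byte[] StripReadOnlyMarkers(byte[] xlsxBytes)
    {
        using (var stream = new MemoryStream())
        {
            // Copy into an expandable stream so the package can be edited.
            stream.Write(xlsxBytes, 0, xlsxBytes.Length);

            using (var doc = SpreadsheetDocument.Open(stream, true))
            {
                var workbook = doc.WorkbookPart.Workbook;

                // Both properties are null when the element is absent.
                workbook.FileSharing?.Remove();
                workbook.WorkbookProtection?.Remove();
                workbook.Save();
            }

            // Read the bytes only after the document has been disposed,
            // so that all changes are flushed to the stream.
            return stream.ToArray();
        }
    }
}
```

The returned bytes could then be fed into the existing MemoryStream-based conversion in place of File.ReadAllBytes(input_filepath).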

Export information to an xlsm file, C# VS asp.net

Is there a way to export information to an xlsm file? These are the steps I follow:
In a button handler, I use a file input to select the file and upload it to the server
I look for the sheet, whose name is already specified in the code
I modify the file according to the information to be exported
I send the file to be saved locally.
The error is as follows:
{"The 'br' start tag on line 59 position 30 does not match the end tag of 'font'. Line 60, position 9."}
It occurs when I select the sheet to work with.
Here is my code. Any suggestions?
public void ExportFile(string FileName, string UserID)
{
FileInfo fi = new FileInfo(FileName);
Master.MSGError = string.Empty;
string SheetName = "test";
using (MemoryStream file = new MemoryStream())
{
try
{
using (ExcelPackage xlPackage = new ExcelPackage(fi))
{
ExcelWorksheet worksheet;
worksheet = xlPackage.Workbook.Worksheets[SheetName]; //here is the error exception
worksheet.Cells[1, 1].Value = "TEST";
//save file
xlPackage.SaveAs(file);
Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
Response.BinaryWrite(file.ToArray());
Response.Flush();
Response.End();
}
}
catch (Exception ex)
{
Master.fc.MSGError = ex.Message;
}
}
}
I have now solved my problem. I thought the issue was with the macro, but after running different tests I found the real error: it seems that both EPPlus and ClosedXML have problems reading certain information in the Excel file. I ended up using ClosedXML and applying the solution from:
OpenXml Excel: throw error in any word after mail address
I'm sorry for the confusion.

Export Excel To PDF excluding hidden tabs

I am trying to export Excel files to PDF, and I am having success with this using the Microsoft.Office.Interop namespace. I am now trying to find out how to exclude tabs that are marked hidden, so that they do not appear in the PDF. Has anyone done this, or does anyone know how to do it? The code I am currently using is shown below.
string inFile = @"C:\Users\casey.pharr\Desktop\testPDF\3364850336.xls";
string outFile = @"C:\Users\casey.pharr\Desktop\testPDF\3364850336_noHidden_out.pdf";
string tempFile = @"C:\Users\casey.pharr\Desktop\testPDF\temp.xls";
try
{
//first copy original file to temp file to work with
File.Copy(inFile,tempFile, true);
Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
app.Visible = false;
app.DisplayAlerts = false;
Microsoft.Office.Interop.Excel.Workbook wkb = app.Workbooks.Open(tempFile);
// Sheets are 1-indexed; iterate backwards so deletions do not shift the remaining indexes
for(int x = app.Sheets.Count; x >= 1; x--)
{
Excel._Worksheet sheet = (Excel._Worksheet)app.Sheets[x];
//now delete hidden worksheets from work book. This is why we are using tempFile
if (sheet != null && (sheet.Visible == Microsoft.Office.Interop.Excel.XlSheetVisibility.xlSheetHidden || sheet.Visible == Microsoft.Office.Interop.Excel.XlSheetVisibility.xlSheetVeryHidden))
{
//is sheet hidden. If so remove it so not part of converted file
sheet.Delete();
}
}
wkb.ExportAsFixedFormat(Microsoft.Office.Interop.Excel.XlFixedFormatType.xlTypePDF, outFile);
wkb.Close(false);
app.Quit();
//return outputLocation;
The error that occurs on calling .Delete() is below:
Exception from HRESULT: 0x800A03EC
So we can convert the files to PDF fine, but we cannot remove or exclude hidden worksheets. I tried deleting them and then converting the entire file, but that is not working.

Updating existing excel file while it is open

I have read a lot about how to communicate with Excel from C# and saw some good references.
I'm looking for an easy way to update an existing Excel file while it is still open, using
the most modern approach available (LINQ, for example) rather than OLEDB.
This should take a few lines of code showing how to read a cell and update its value, taking into account that the file might not exist. If the file exists and is open, the code should just update it without raising a notification that the file already exists. If the file doesn't exist, the code should create a new one.
So:
1. Connect to an Excel file; check whether it exists and, if not, create one
2. Read from a cell
3. Update the cell
4. Do this while the Excel sheet may still be open.
I already visited the following places:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/ef11a193-54f3-407b-9374-9f5770fd9fd7/writing-to-excel-using-c
Updating an excel document programmatically
Update specific cell of excel file using oledb
I used the following code:
if (File.Exists(@"C:\temp\test.xls"))
{
Microsoft.Office.Interop.Excel.Application excelApp = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbooks workBooks = excelApp.Workbooks;
Microsoft.Office.Interop.Excel.Workbook workBook = workBooks.Open(@"C:\temp\test.xls");
Microsoft.Office.Interop.Excel.Worksheet workSheet = workBook.Worksheets.get_Item(1);
int nColumns = workSheet.UsedRange.Columns.Count;
int nRows = workSheet.UsedRange.Rows.Count;
for (int i = 2; i < nRows; i++)
{
// Cells is indexed [row, column]; write to column A of row i
workSheet.Cells[i, 1] = "test";
}
workBook.Save();
workBook.Close();
excelApp.Quit();
}
So I use VSTO Contrib to help out with COM Interop and memory management and that's why you see .WithComCleanup().
To open up a spreadsheet:
try
{
using (var xlApp = new Microsoft.Office.Interop.Excel.Application().WithComCleanup())
using (var wrkbooks = xlApp.Resource.Workbooks.WithComCleanup())
using (var wrkbook = wrkbooks.Resource.Open(filePath, false, true).WithComCleanup())
{
If the Excel file is already open, pass false for the third (ReadOnly) argument to get around the read-only mode:
using (var wrkbook = wrkbooks.Resource.Open(filePath, false, false).WithComCleanup())
Here's how I iterate though the sheets (note that some Excel sheets are ChartSheets):
foreach (object possibleSheet in xlApp.Resource.Sheets)
{
Microsoft.Office.Interop.Excel.Worksheet aSheet = possibleSheet as Microsoft.Office.Interop.Excel.Worksheet;
if (aSheet == null)
continue;
Here is a quick way to get a reference to the sheet you're interested in:
activeSheet = wrkbook.Resource.Sheets[sheetToImport];
You read and write cells through the sheet's Cells collection, indexed by [row, column]:
for (int i = 2; i < nRows; i++)
{
activeSheet.Cells[i, 1] = "test";
}
Here is how I close Excel:
MathematicaAPI.XlHelper.CloseExcel((Worksheet)activeSheet, (Workbook)wrkbook.Resource , (Workbooks)wrkbooks.Resource);
public static void CloseExcel(Worksheet activeSheet, Workbook wrkbook, Workbooks wrkbooks)
{
//http://support.microsoft.com/kb/317109 -> excel just wont close for some reason
if (activeSheet != null)
{
Marshal.FinalReleaseComObject(activeSheet);
activeSheet = null;
}
if (wrkbook != null)
{
wrkbook.Saved = true;
wrkbook.Close(Microsoft.Office.Interop.Excel.XlSaveAction.xlDoNotSaveChanges);
}
if (wrkbooks != null)
{
wrkbooks.Close();
}
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();
}
Sometimes Excel just won't close and you have to kill it (after trying to close it properly, of course). I don't recommend this, but if you can't track down the undisposed objects and all else fails, then...
if (xlApp != null)
{
ExcelDataSourceHelper.GetWindowThreadProcessId(new IntPtr(xlApp.Resource.Hwnd), ref excelProcessId);
}
if (excelProcessId > 0)
{
XlHelper.KillProcess(excelProcessId);
}
public static void KillProcess(int excelProcessId)
{
if (excelProcessId > 0)
{
System.Diagnostics.Process ExcelProc = null;
try
{
ExcelProc = System.Diagnostics.Process.GetProcessById(excelProcessId);
if (ExcelProc != null)
{
ExcelProc.Kill();
}
}
catch
{ }
}
}
Note: I reduce the chances of needing to kill Excel by using VSTO Contrib with Using's.
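The snippet above calls GetWindowThreadProcessId through a helper class that is not shown. A minimal sketch of what such a helper might look like follows, assuming the standard user32.dll export of that name (Windows only); the class and method names here are hypothetical and are not the ones from the answer.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ExcelProcessLookup
{
    // P/Invoke declaration for the Win32 function in user32.dll.
    // It returns the id of the thread that created the window (0 on failure)
    // and writes the owning process id to the out parameter.
    [DllImport("user32.dll", SetLastError = true)]
    private static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint processId);

    // Hypothetical wrapper: maps a window handle (such as Application.Hwnd)
    // to its process id, or returns 0 when the handle is not a valid window.
    public static int GetProcessIdFromHwnd(int hwnd)
    {
        uint processId;
        uint threadId = GetWindowThreadProcessId(new IntPtr(hwnd), out processId);
        return threadId == 0 ? 0 : (int)processId;
    }
}
```

With this in place, the process id could be obtained with `ExcelProcessLookup.GetProcessIdFromHwnd(xlApp.Resource.Hwnd)` before Excel is shut down, and passed to KillProcess only if Excel is still running afterwards.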
OK, thank you all for trying to solve the issue.
The solution was to use an Excel 2011/2013 add-in, which can communicate with Excel as a plugin:
Create an application-level add-in for Microsoft Office Excel. The features that you create in this kind of solution are available to the application itself, regardless of which workbooks are open.
You can find more details on MSDN.

Codeplex Excel Data Reader give empty Data set for Excel 2010

I'm using the Codeplex Excel Data Reader to read an Excel file. The problem I face is that it reads Excel 97-2003 documents without any difficulty, but when reading Excel 2007-2010 documents using ExcelReaderFactory.CreateOpenXmlReader(stream), it outputs an empty data set. Has anyone faced this problem, and does anyone have a solution for it?
The read method is as follows:
private DataSet ReadExcel(string fileName, string extention)
{
DataSet dsData = null;
FileStream stream = File.Open(fileName, FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = null;
try
{
if (extention.Equals("xls"))
{
//1. Reading from a binary Excel file ('97-2003 format; *.xls)
excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
}
else
{
//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
// excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
}
excelReader.IsFirstRowAsColumnNames = false;
dsData = excelReader.AsDataSet();
}
catch (Exception)
{
// Rethrow with "throw;" so the original stack trace is preserved
throw;
}
finally
{
if (excelReader != null)
{
excelReader.Close();
}
}
return dsData;
}
8000401a indicates that it had something to do with a Run As logon failure.
Steer clear of server-side automation of office. Or use XML to work with Excel spreadsheets on the server.
According to the support issues with the Excel Data Reader:
Design and usage are great. So far only issue I've had is with certain
XLSX file not parsing correctly (reading in wrong sheets, missing cell
values, etc). To resolve these issues, I had to rebuild Excel.dll
using latest SharpZipLib from
http://www.icsharpcode.net/OpenSource/SharpZipLib/Download.aspx. As
others have said, project needs an update, but is still good.
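Or just use the standard Microsoft way:
var xlApp = new Microsoft.Office.Interop.Excel.Application();
// Arguments: Filename, UpdateLinks: false, ReadOnly: true, Format: 5, Password: null, WriteResPassword
Workbook wb = xlApp.Workbooks.Open(filePath, false, true, 5, null, "WrongPAssword");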
foreach (object possibleSheet in wb.Sheets)
{
var aSheet = possibleSheet as Worksheet;
if (aSheet != null)
{
....
