I'm using the EPPlus library to read data from Excel in order to create another file. Unfortunately, it does not support files with the .XLSM extension. Is there a nice way to convert .XLSM files to .XLSX so that I can read them with EPPlus?
(Using EPPlus for reading would be nice because all my code is already written with it :) )
In order to do this you will need to use the Open XML SDK 2.0. Below is a snippet of code that worked for me when I tried it:
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

byte[] byteArray = File.ReadAllBytes("C:\\temp\\test.xlsm");
using (MemoryStream stream = new MemoryStream())
{
    // Copy into an expandable stream so the package can be edited in memory
    stream.Write(byteArray, 0, byteArray.Length);
    using (SpreadsheetDocument spreadsheetDoc = SpreadsheetDocument.Open(stream, true))
    {
        // Change from macro-enabled workbook type to plain workbook type
        spreadsheetDoc.ChangeDocumentType(SpreadsheetDocumentType.Workbook);
    }
    File.WriteAllBytes("C:\\temp\\test.xlsx", stream.ToArray());
}
What this code does is it takes your macro enabled workbook file and opens it into a SpreadsheetDocument object. The type of this object is MacroEnabledWorkbook, but since you want it as a Workbook you call the ChangeDocumentType method to change it from a MacroEnabledWorkbook to a Workbook. This will work since the underlying XML is the same between a .xlsm and a .xlsx file.
Use the Open XML SDK, as in amurra's answer, but in addition to changing the document type, remove the VbaDataPart and VbaProjectPart; otherwise Excel will report that the file is corrupted.
using (var inputStream = File.OpenRead("C:\\temp\\test.xlsm"))
using (var outStream = new MemoryStream()) {
inputStream.CopyTo(outStream);
using (var doc = SpreadsheetDocument.Open(outStream, true)) {
doc.DeletePartsRecursivelyOfType<VbaDataPart>();
doc.DeletePartsRecursivelyOfType<VbaProjectPart>();
doc.ChangeDocumentType(DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook);
}
File.WriteAllBytes("C:\\temp\\test.xlsx", outStream.ToArray());
}
package xlsbtoxlsx;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.openxml4j.opc.PackageRelationship;
import org.apache.poi.openxml4j.opc.PackageRelationshipCollection;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbookType;
public class XlsbToXlsxConvertor {
public static void main(String[] args) throws Exception {
String inputpath="C:\\Excel Data Files\\XLSB\\CSD_TDR_20200823";
String outputpath="C:\\Excel Data Files\\XLSB\\output";
new XlsbToXlsxConvertor().xlsmToxlsxFileConvertor(inputpath, outputpath);
}
public void xlsmToxlsxFileConvertor(String inputpath, String outputpath) throws Exception {
XSSFWorkbook workbook;
FileOutputStream out;
System.out.println("inputpath " + inputpath);
File directoryPath = new File(inputpath);
// List of all files and directories
String contents[] = directoryPath.list();
System.out.println("List of files and directories in the specified directory:");
for (int i = 0; i < contents.length; i++) {
System.out.println(contents[i]);
// create workbook from XLSM template
workbook = (XSSFWorkbook) WorkbookFactory
.create(new FileInputStream(inputpath + File.separator + contents[i]));
// save copy as XLSX ----------------START
OPCPackage opcpackage = workbook.getPackage();
// get and remove the vbaProject.bin part from the package, if there is one
// (iterating avoids an IndexOutOfBoundsException for files without macros)
for (PackagePart vbapart : opcpackage.getPartsByName(Pattern.compile("/xl/vbaProject\\.bin"))) {
opcpackage.removePart(vbapart);
}
// get and remove the relationship to the removed vbaProject.bin part from the
// package
PackagePart wbpart = workbook.getPackagePart();
PackageRelationshipCollection wbrelcollection = wbpart
.getRelationshipsByType("http://schemas.microsoft.com/office/2006/relationships/vbaProject");
for (PackageRelationship relship : wbrelcollection) {
wbpart.removeRelationship(relship.getId());
}
// set content type to XLSX
workbook.setWorkbookType(XSSFWorkbookType.XLSX);
// write out the XLSX
out = new FileOutputStream(outputpath + File.separator + contents[i].replace(".xlsm", "") + ".xlsx");
workbook.write(out);
out.close();
System.out.println("done");
workbook.close();
}
}
}
Related
I'm trying to read an xls file from a ZipArchive, but I can't get it to work with EPPlus.
foreach (ZipArchiveEntry entry in archive.Entries)
{
if (entry != null)
{
string filepath = entry.FullName;
FileInfo fileInfo = new FileInfo(filepath);
// here I get an ExcelPackage that appears to contain the xls file
using (ExcelPackage excelPackage = new ExcelPackage(fileInfo))
{
// but here it is impossible to get the worksheet, the workbook, or anything else inside it
ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets.FirstOrDefault();
int totalColomn = worksheet.Dimension.End.Column;
int nbrsheet = excelPackage.Workbook.Worksheets.Count();
}
}
}
[Screenshot: the ExcelPackage as seen in the debugger]
In the debugger I can see the xls file inside the ExcelPackage, but as soon as I try to get a worksheet, it exits without an exception or error code.
The same happens when I try with an entry stream:
using (var entryStream = entry.Open())
{
// Can't even create the ExcelPackage; it crashes here without an exception
using (ExcelPackage excelPackage = new ExcelPackage(entryStream))
{
ExcelWorksheet worksheetest = excelPackage.Workbook.Worksheets.FirstOrDefault();
}
}
The stream here also looks strange...
[Screenshot: entryStream in the debugger]
I'm working with .NET Core Blazor Server-Side and EPPlus 4.5.
Thanks for helping
entry.FullName refers to the full path to the file inside the zip archive, while FileInfo describes a file in the filesystem of the OS, which is a completely different thing. You haven't extracted anything to the OS filesystem yet, so the FileInfo won't refer to a file that actually exists.
Try the ExcelPackage constructor that takes a Stream, which you can get directly from a ZipArchiveEntry:
using (var entryStream = entry.Open())
{
using (ExcelPackage excelPackage = new ExcelPackage(entryStream))
{
// ...
}
}
I found the problem.
I was trying to open an .xls file, and the EPPlus library doesn't work with that format.
Be careful: EPPlus doesn't work with .xls files.
So your solution works, Jeff. It was my fault; I didn't specify the extension of my Excel file, sorry.
-> EPPlus with .xlsx: OK; with .xls: not OK.
My bad.
Thanks anyway :-)
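For anyone hitting the same thing: the two formats can be told apart from the first bytes of the file, without trusting the extension. A legacy .xls workbook is an OLE2 compound file, while .xlsx and .xlsm are ZIP packages. Below is a minimal sketch in Java (the class and method names are made up for illustration, and it assumes Java 11+). It only identifies the container, so "zip" does not prove the file is a valid workbook.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper: guesses the container format from the leading bytes. */
final class ExcelFormatSniffer {
    // ZIP local file header signature ("PK" 03 04): used by .xlsx and .xlsm packages.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // OLE2 compound file signature: used by legacy binary .xls workbooks.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};

    /** Returns "zip", "ole2" or "unknown" for the given leading bytes. */
    static String sniff(byte[] header) {
        if (startsWith(header, ZIP_MAGIC)) return "zip";
        if (startsWith(header, OLE2_MAGIC)) return "ole2";
        return "unknown";
    }

    /** Reads at most 8 bytes from the file and classifies them. */
    static String sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ExcelFormatSniffer file1 file2 ...
        for (String arg : args) {
            System.out.println(arg + ": " + sniff(Path.of(arg)));
        }
    }
}
```

The same check applies to a stream taken from a zip entry: read the first 8 bytes and refuse anything that is not "zip" before handing it to a library that only understands the Open XML formats.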
After exporting data into an Excel workbook with macros (xlsm), I run the macro and then remove it so that I can save the workbook as xlsx. To remove the macros, I open the xlsm as a zip archive (via the C# ZipFile class), remove the entry "xl/vbaProject.bin" and remove a relationship within "xl/_rels/workbook.xml.rels". Then I rename the file from xlsm to xlsx. That works fine so far, but when I open the xlsx file in Excel, I get "Excel cannot open the file because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file", so something still seems to be missing in order to completely remove the VBA code from the workbook. Can anyone help me here?
const string vbaProjectEntryName = "xl/vbaProject.bin"; // Contains the VBA code
const string relationsEntryName = "xl/_rels/workbook.xml.rels"; // Relation/Link to the vba project
using (var zip = ZipFile.Open(fileName, ZipArchiveMode.Update))
{
var entry = zip.GetEntry(vbaProjectEntryName);
if (entry != null)
{
entry.Delete();
entry = zip.GetEntry(relationsEntryName);
if (entry != null)
{
var contents = string.Empty;
using (var streamReader = new StreamReader(entry.Open()))
{
contents = streamReader.ReadToEnd();
}
var relationText = "<Relationship Id=\"rId6\" Type=\"http://schemas.microsoft.com/office/2006/relationships/vbaProject\" Target=\"vbaProject.bin\"/>";
contents = contents.Replace(relationText, string.Empty);
entry.Delete();
entry = zip.CreateEntry(relationsEntryName);
using (var streamWriter = new StreamWriter(entry.Open()))
{
streamWriter.Write(contents);
}
}
}
}
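One thing the code above does not touch is "[Content_Types].xml". In a macro-enabled workbook that file declares the main workbook part with the macro-enabled content type, so a likely cause of the error (not confirmed against the poster's file) is that this declaration no longer matches the .xlsx extension. Below is a rough sketch of the whole clean-up in Java, using only java.util.zip. The class and method names are hypothetical. It assumes UTF-8 XML with self-closing elements, as Excel normally writes them, and it only handles "xl/vbaProject.bin", not signatures or other macro-related parts. It also matches the relationship by its Type instead of a hard-coded "rId6".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: rewrites an .xlsm package as a macro-free .xlsx package. */
final class XlsmZipStripper {
    static final String MACRO_MAIN = "application/vnd.ms-excel.sheet.macroEnabled.main+xml";
    static final String PLAIN_MAIN =
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml";
    private static final Pattern VBA_CONTENT_TYPE = Pattern.compile(
        "<(?:Default|Override)\\b[^>]*ContentType=\"application/vnd\\.ms-office\\.vbaProject\"[^>]*/>");
    private static final Pattern VBA_RELATIONSHIP = Pattern.compile(
        "<Relationship\\b[^>]*Type=\"http://schemas\\.microsoft\\.com/office/2006/relationships/vbaProject\"[^>]*/>");

    /** Drops the vbaProject declaration and switches the workbook part to the plain type. */
    static String patchContentTypes(String xml) {
        return VBA_CONTENT_TYPE.matcher(xml).replaceAll("").replace(MACRO_MAIN, PLAIN_MAIN);
    }

    /** Drops the relationship that points at the VBA project, whatever its Id is. */
    static String patchWorkbookRels(String xml) {
        return VBA_RELATIONSHIP.matcher(xml).replaceAll("");
    }

    /** Copies every zip entry, skipping the VBA binary and patching the two XML parts. */
    static void strip(InputStream xlsm, OutputStream xlsx) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(xlsm);
             ZipOutputStream zout = new ZipOutputStream(xlsx)) {
            for (ZipEntry entry; (entry = zin.getNextEntry()) != null; ) {
                String name = entry.getName();
                if (name.equals("xl/vbaProject.bin")) {
                    continue; // leave the VBA code out of the new package
                }
                byte[] data = zin.readAllBytes(); // reads to the end of the current entry
                if (name.equals("[Content_Types].xml")) {
                    String xml = new String(data, StandardCharsets.UTF_8);
                    data = patchContentTypes(xml).getBytes(StandardCharsets.UTF_8);
                } else if (name.equals("xl/_rels/workbook.xml.rels")) {
                    String xml = new String(data, StandardCharsets.UTF_8);
                    data = patchWorkbookRels(xml).getBytes(StandardCharsets.UTF_8);
                }
                zout.putNextEntry(new ZipEntry(name));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: XlsmZipStripper input.xlsm output.xlsx");
            return;
        }
        try (InputStream in = Files.newInputStream(Path.of(args[0]));
             OutputStream out = Files.newOutputStream(Path.of(args[1]))) {
            strip(in, out);
        }
    }
}
```

Writing a new archive instead of editing the existing one in place keeps the original .xlsm untouched, which makes it easy to compare the two packages if Excel still complains.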
I am working with the ClosedXML utility for Excel operations. It only supports files created with Office 2007 onwards. I have an xls file and need to convert it to xlsm (macro-enabled). A simple copy, as shown below, does not work.
string oldFile = @"C:\Files\xls.xls";
string newFile = @"C:\Files\xlsnew.xlsm";
File.Copy(oldFile, newFile);
I have also tried the code below:
byte[] byteArray = File.ReadAllBytes(oldFile);
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
using (SpreadsheetDocument spreadsheetDoc = SpreadsheetDocument.Open(stream, true))
{
// Change the document type to macro-enabled workbook
spreadsheetDoc.ChangeDocumentType(SpreadsheetDocumentType.MacroEnabledWorkbook);
}
File.WriteAllBytes(newFile, stream.ToArray());
}
Please provide a helpful solution.
Thanks in advance.
I am using OpenXML SDK.
The OpenXML SDK creates a method called CreatePackage as such:
public void CreatePackage(string filePath)
{
using (SpreadsheetDocument package = SpreadsheetDocument.Create(filePath, SpreadsheetDocumentType.Workbook))
{
CreateParts(package);
}
}
I call it from my program as follows which will create the Excel file to a given path:
gc.CreatePackage(excelFilePath);
Process.Start(_excelFilePath);
I am not sure how to tweak the code so that it returns a Stream containing the Excel file instead of creating the file on disk.
According to the documentation for SpreadsheetDocument.Create, there are multiple overloads, one of which takes a Stream.
So change your code to:
public void CreatePackage(Stream stream)
{
using (SpreadsheetDocument package = SpreadsheetDocument.Create(stream, SpreadsheetDocumentType.Workbook))
{
CreateParts(package);
}
}
And then call it with any valid Stream, for example:
using(var memoryStream = new MemoryStream())
{
CreatePackage(memoryStream);
// do something with memoryStream
}
I can successfully inject a piece of VBA code into a generated excel workbook, but what I am trying to do is use the Workbook_Open() event so the VBA code executes when the file opens. I am adding the sub to the "ThisWorkbook" object in my xlsm template file. I then use the openxml productivity tool to reflect the code and get the encoded VBA data.
When the file is generated and I view the VBA, I see "ThisWorkbook" and "ThisWorkbook1" objects. My VBA is in "ThisWorkbook" object but the code never executes on open. If I move my VBA code to "ThisWorkbook1" and re-open the file, it works fine. Why is an extra "ThisWorkbook" created? Is it not possible to inject an excel spreadsheet with a Workbook_Open() sub? Here is a snippet of the C# code I am using:
private string partData = "..."; //base 64 encoded data from reflection code
//open workbook, myWorkbook
VbaProjectPart newPart = myWorkbook.WorkbookPart.AddNewPart<VbaProjectPart>("rId1");
System.IO.Stream data = GetBinaryDataStream(partData);
newPart.FeedData(data);
data.Close();
//save and close workbook
Anyone have ideas?
Based on my research there isn't a way to insert the project part data in a format that you can manipulate in C#. In the OpenXML format, the VBA project is still stored in a binary format. However, copying the VbaProjectPart from one Excel document into another should work. As a result, you'd have to determine what you wanted the project part to say in advance.
If you are OK with this, then you can add the following code to a template Excel file in the 'ThisWorkbook' Microsoft Excel Object, along with the appropriate Macro code:
Private Sub Workbook_Open()
Run "Module1.SomeMacroName()"
End Sub
To copy the VbaProjectPart object from one file to the other, you would use code like this:
public static void InsertVbaPart()
{
    using (SpreadsheetDocument ssDoc = SpreadsheetDocument.Open("file1.xlsm", false))
    using (MemoryStream ms = new MemoryStream())
    {
        // Buffer the source VBA project (still binary) in memory
        using (Stream source = ssDoc.WorkbookPart.VbaProjectPart.GetStream())
        {
            CopyStream(source, ms);
        }
        using (SpreadsheetDocument ssDoc2 = SpreadsheetDocument.Open("file2.xlsm", true))
        {
            // Assumes file2.xlsm already has a VbaProjectPart to overwrite
            using (Stream stream = ssDoc2.WorkbookPart.VbaProjectPart.GetStream())
            {
                ms.WriteTo(stream);
            }
        }
    }
}
public static void CopyStream(Stream input, Stream output)
{
byte[] buffer = new byte[short.MaxValue + 1];
while (true)
{
int read = input.Read(buffer, 0, buffer.Length);
if (read <= 0)
return;
output.Write(buffer, 0, read);
}
}
Hope that helps.
I found that the other answers still resulted in the duplicate "Worksheet" object. I used a similar solution to what @ZlotaMoneta said, but with a different syntax found here:
List<VbaProjectPart> newParts = new List<VbaProjectPart>();
using (var originalDocument = SpreadsheetDocument.Open("file1.xlsm", false))
{
    newParts = originalDocument.WorkbookPart.GetPartsOfType<VbaProjectPart>().ToList();
    using (var document = SpreadsheetDocument.Open("file2.xlsm", true))
    {
        // Remove any VBA project already in the target, then copy the source parts in
        document.WorkbookPart.DeleteParts(document.WorkbookPart.GetPartsOfType<VbaProjectPart>());
        foreach (var part in newParts)
        {
            VbaProjectPart vbaProjectPart = document.WorkbookPart.AddNewPart<VbaProjectPart>();
            using (Stream data = part.GetStream())
            {
                vbaProjectPart.FeedData(data);
            }
        }
        // Note: this prevents the duplicate worksheet issue
        document.WorkbookPart.Workbook.WorkbookProperties.CodeName = "ThisWorkbook";
    }
}
You need to specify the "codeName" attribute in the "xl/workbook.xml" part.
After feeding the macro into the VbaProjectPart, add this code:
var workbookPr = spreadsheetDocument.WorkbookPart.Workbook.Descendants<WorkbookProperties>().FirstOrDefault();
workbookPr.CodeName = "ThisWorkBook";
After that, everything should work when you open the file.
So, to add a macro you need to:
1. Change the document type to macro-enabled.
2. Add a VbaProjectPart and feed it with the previously created macro.
3. Add the workbookPr codeName attribute in xl/workbook.xml with the value "ThisWorkBook".
4. Save the file with the .xlsm extension.