OpenXML SDK Inject VBA into excel workbook

I can successfully inject a piece of VBA code into a generated Excel workbook, but what I am trying to do is use the Workbook_Open() event so that the VBA code executes when the file opens. I add the sub to the "ThisWorkbook" object in my .xlsm template file, then use the Open XML SDK Productivity Tool to reflect the code and get the encoded VBA data.
When the file is generated and I view the VBA, I see both a "ThisWorkbook" and a "ThisWorkbook1" object. My VBA is in the "ThisWorkbook" object, but the code never executes on open. If I move my VBA code to "ThisWorkbook1" and re-open the file, it works fine. Why is an extra "ThisWorkbook" created? Is it not possible to inject a Workbook_Open() sub into an Excel spreadsheet? Here is a snippet of the C# code I am using:
private string partData = "..."; //base 64 encoded data from reflection code
//open workbook, myWorkbook
VbaProjectPart newPart = myWorkbook.WorkbookPart.AddNewPart<VbaProjectPart>("rId1");
System.IO.Stream data = GetBinaryDataStream(partData);
newPart.FeedData(data);
data.Close();
//save and close workbook
Anyone have ideas?

Based on my research there isn't a way to insert the project part data in a format that you can manipulate in C#. In the OpenXML format, the VBA project is still stored in a binary format. However, copying the VbaProjectPart from one Excel document into another should work. As a result, you'd have to determine what you wanted the project part to say in advance.
If you are OK with this, then you can add the following code to a template Excel file in the 'ThisWorkbook' Microsoft Excel Object, along with the appropriate Macro code:
Private Sub Workbook_Open()
Run "Module1.SomeMacroName()"
End Sub
To copy the VbaProjectPart object from one file to the other, you would use code like this:
public static void InsertVbaPart()
{
    using (SpreadsheetDocument ssDoc = SpreadsheetDocument.Open("file1.xlsm", false))
    using (MemoryStream ms = new MemoryStream())
    {
        // Buffer the source VBA project (binary data) in memory.
        using (Stream source = ssDoc.WorkbookPart.VbaProjectPart.GetStream())
        {
            CopyStream(source, ms);
        }
        using (SpreadsheetDocument ssDoc2 = SpreadsheetDocument.Open("file2.xlsm", true))
        {
            // Assumes file2.xlsm already contains a VbaProjectPart.
            // FileMode.Create truncates the old project before the new bytes are written.
            using (Stream stream = ssDoc2.WorkbookPart.VbaProjectPart.GetStream(FileMode.Create, FileAccess.Write))
            {
                ms.WriteTo(stream);
            }
        }
    }
}
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[short.MaxValue + 1];
    while (true)
    {
        int read = input.Read(buffer, 0, buffer.Length);
        if (read <= 0)
            return;
        output.Write(buffer, 0, read);
    }
}
Hope that helps.

I found that the other answers still resulted in the duplicate "ThisWorkbook" object. I used a solution similar to what @ZlotaMoneta said, but with a different syntax found here:
List<VbaProjectPart> newParts = new List<VbaProjectPart>();
using (var originalDocument = SpreadsheetDocument.Open("file1.xlsm", false))
{
    newParts = originalDocument.WorkbookPart.GetPartsOfType<VbaProjectPart>().ToList();
    using (var document = SpreadsheetDocument.Open("file2.xlsm", true))
    {
        // Remove any VBA project already present in the target workbook.
        document.WorkbookPart.DeleteParts(document.WorkbookPart.GetPartsOfType<VbaProjectPart>().ToList());
        foreach (var part in newParts)
        {
            VbaProjectPart vbaProjectPart = document.WorkbookPart.AddNewPart<VbaProjectPart>();
            using (Stream data = part.GetStream())
            {
                vbaProjectPart.FeedData(data);
            }
        }
        // Note: this prevents the duplicate "ThisWorkbook" issue.
        document.WorkbookPart.Workbook.WorkbookProperties.CodeName = "ThisWorkbook";
    }
}

You need to specify the "codeName" attribute on the workbookPr element in "xl/workbook.xml".
After feeding the VbaProjectPart with the macro, add this code:
var workbookPr = spreadsheetDocument.WorkbookPart.Workbook.Descendants<WorkbookProperties>().FirstOrDefault();
workbookPr.CodeName = "ThisWorkBook";
After opening the file everything should work now.
So, to add a macro you need to:
Change the document type to macro-enabled.
Add a VbaProjectPart and feed it with the macro created earlier.
Add the workbookPr codeName attribute in xl/workbook.xml with the value "ThisWorkBook".
Save the file with the .xlsm extension.
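One way to check whether a generated file actually carries the attribute is to read xl/workbook.xml straight out of the package. The following is only a minimal sketch using System.IO.Compression and System.Xml.Linq; the names WorkbookInspector and ReadCodeName are hypothetical, and it assumes the workbook part lives at the conventional xl/workbook.xml path.

```csharp
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

public static class WorkbookInspector
{
    // Hypothetical helper: returns the codeName on <workbookPr>, or null when it is absent.
    public static string ReadCodeName(Stream xlsx)
    {
        XNamespace main = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
        using (var zip = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the workbook part uses the conventional path.
            ZipArchiveEntry entry = zip.GetEntry("xl/workbook.xml");
            if (entry == null)
                return null;
            using (Stream part = entry.Open())
            {
                XElement workbookPr = XDocument.Load(part).Root.Element(main + "workbookPr");
                return (string)workbookPr?.Attribute("codeName");
            }
        }
    }
}
```

A null result for a file that shows "ThisWorkbook1" in the VBA editor would be consistent with the missing-codeName explanation given above.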

Related

Is there a way to format an Aspose.Cells generated excel sheet without calling Workbook.Save()?

My customer has a use case for exporting search results to a spreadsheet. I would like to return a formatted spreadsheet to them, but the only way I can get the formatting changes to "stick" is by calling
workbook.Save(memoryStream, SaveFormat.Xlsx);
The problem with calling the method above, is that a spreadsheet will actually be saved to my local project folder, which is not desired behavior. How can I return the spreadsheet without calling workbook.Save()?
public byte[] ExportSpreadsheet(List<Result> results)
{
var workbook = MakeWorkbook(results);
var memoryStream = new MemoryStream();
workbook.Save(memoryStream, SaveFormat.Xlsx); // this saves the spreadsheet in the project
memoryStream.Seek(0, SeekOrigin.Begin);
var byteArray = memoryStream.ToArray();
return byteArray;
}
private Workbook MakeWorkbook(List<Result> results)
{
var workbook = new Workbook();
AddDataToWorkbook(workbook);
ApplyFormattingAfterData(workbook);
return workbook;
}

workbook.Save(memoryStream, SaveFormat.Xlsx);
You are doing ok. This line will save the workbook to stream and not on physical filepath. It won't save to your project's folder or path.
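The in-memory behaviour can be illustrated with plain BCL types. In this sketch the delegate parameter stands in for a call such as workbook.Save(stream, SaveFormat.Xlsx); the names InMemoryExport and ToBytes are hypothetical. Note that MemoryStream.ToArray returns the whole buffer regardless of Position, so the Seek call in the question is not needed.

```csharp
using System;
using System.IO;

public static class InMemoryExport
{
    // 'save' stands in for a call such as workbook.Save(stream, SaveFormat.Xlsx).
    public static byte[] ToBytes(Action<Stream> save)
    {
        using (var memoryStream = new MemoryStream())
        {
            save(memoryStream);
            // ToArray copies the full contents regardless of the current Position.
            return memoryStream.ToArray();
        }
    }
}
```

With the question's code this would be called as InMemoryExport.ToBytes(s => workbook.Save(s, SaveFormat.Xlsx)); nothing in it touches the file system.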
PS. I am working as Support developer/ Evangelist at Aspose.

Replacing Invalid XML characters from an excel file and writing it back to disk causes file is corrupted error on opening in MS Excel

A little background on problem:
We have an ASP.NET MVC5 Application where we use FlexMonster to show the data in grid. The data source is a stored procedure that brings all the data into the UI grid, and once user clicks on export button, it exports the report to Excel. However, in some cases export to excel is failing.
Some of the data has some invalid characters, and it is not possible/feasible to fix the source as suggested here
My approach so far:
The EPPlus library fails when initializing the workbook because the input Excel file contains some invalid XML characters. I found that the file is dumped with an invalid character in it, so I looked into the possible approaches.
Firstly, I identified the problematic character in the excel file. I first tried to replace the invalid character with blank space manually using Notepad++ and the EPPlus could successfully read the file.
Now using the approaches given in other SO thread here and here, I replaced all possible occurrences of invalid chars. I am using at the moment
XmlConvert.IsXmlChar
method to find out the problematic XML character and replacing with blank space.
I created a sample program where I am trying to work on the problematic excel sheet.
//in main method
String readFile = File.ReadAllText(filePath);
string content = RemoveInvalidXmlChars(readFile);
File.WriteAllText(filePath, content);
//removal of invalid characters
static string RemoveInvalidXmlChars(string inputText)
{
StringBuilder withoutInvalidXmlCharsBuilder = new StringBuilder();
int firstOccurenceOfRealData = inputText.IndexOf("<t>");
int lastOccurenceOfRealData = inputText.LastIndexOf("</t>");
if (firstOccurenceOfRealData < 0 ||
lastOccurenceOfRealData < 0 ||
firstOccurenceOfRealData > lastOccurenceOfRealData)
return inputText;
withoutInvalidXmlCharsBuilder.Append(inputText.Substring(0, firstOccurenceOfRealData));
int remaining = lastOccurenceOfRealData - firstOccurenceOfRealData;
string textToCheckFor = inputText.Substring(firstOccurenceOfRealData, remaining);
foreach (char c in textToCheckFor)
{
withoutInvalidXmlCharsBuilder.Append((XmlConvert.IsXmlChar(c)) ? c : ' ');
}
withoutInvalidXmlCharsBuilder.Append(inputText.Substring(lastOccurenceOfRealData));
return withoutInvalidXmlCharsBuilder.ToString();
}
If I replace the problematic character manually using Notepad++, the file opens fine in MS Excel. The above code successfully replaces the same invalid character and writes the content back to the file. However, when I try to open the resulting file in MS Excel, it throws an error saying that the file may have been corrupted, and no content is displayed (snapshots below). Moreover, the following code
var excelPackage = new ExcelPackage(new FileInfo(filePath));
on the file that I updated via Notepad++ throws the following exception:
"CRC error: the file being extracted appears to be corrupted. Expected 0x7478AABE, Actual 0xE9191E00"
My Questions:
Is my approach to modify content this way correct?
If yes, How can I write updated string to an Excel file?
If my approach is wrong then, How can I proceed to get rid of invalid XML chars?
Errors shown on opening file (without invalid XML char):
First Pop up
When I click on yes
Thanks in advance !

It does sound like a binary (presumably XLSX) file based on your last comment. To confirm, open the file created by FlexMonster with 7zip. If it opens properly and you see a bunch of XML files in folders, it's an XLSX.
In that case, a search/replace on a binary file sounds like a very bad idea. It might work on the XML parts but might also replace legitimate characters in other parts. I think the better approach would be to do as @PanagiotisKanavos suggests and use ZipArchive. But you have to rebuild it in the right order, otherwise Excel complains. Similar to how it was done here https://stackoverflow.com/a/33312038/1324284, you could do something like this:
public static void ReplaceXmlString(this ZipArchive xlsxZip, FileInfo outFile, string oldString, string newstring)
{
using (var outStream = outFile.Open(FileMode.Create, FileAccess.ReadWrite))
using (var copiedzip = new ZipArchive(outStream, ZipArchiveMode.Update))
{
//Go through each file in the zip one by one and copy over to the new file - entries need to be in order
foreach (var entry in xlsxZip.Entries)
{
var newentry = copiedzip.CreateEntry(entry.FullName);
var newstream = newentry.Open();
var orgstream = entry.Open();
//Copy non-xml files over
if (!entry.Name.EndsWith(".xml"))
{
orgstream.CopyTo(newstream);
}
else
{
//Load the xml document to manipulate
var xdoc = new XmlDocument();
xdoc.Load(orgstream);
var xml = xdoc.OuterXml.Replace(oldString, newstring);
xdoc = new XmlDocument();
xdoc.LoadXml(xml);
xdoc.Save(newstream);
}
orgstream.Close();
newstream.Flush();
newstream.Close();
}
}
}
When it is used like this:
[TestMethod]
public void ReplaceXmlTest()
{
var datatable = new DataTable("tblData");
datatable.Columns.AddRange(new[]
{
new DataColumn("Col1", typeof (int)),
new DataColumn("Col2", typeof (int)),
new DataColumn("Col3", typeof (string))
});
for (var i = 0; i < 10; i++)
{
var row = datatable.NewRow();
row[0] = i;
row[1] = i * 10;
row[2] = i % 2 == 0 ? "ABCD" : "AXCD";
datatable.Rows.Add(row);
}
using (var pck = new ExcelPackage())
{
var workbook = pck.Workbook;
var worksheet = workbook.Worksheets.Add("source");
worksheet.Cells.LoadFromDataTable(datatable, true);
worksheet.Tables.Add(worksheet.Cells["A1:C11"], "Table1");
//Now simulate the copy/open of the excel file into a zip archive
using (var orginalzip = new ZipArchive(new MemoryStream(pck.GetAsByteArray()), ZipArchiveMode.Read))
{
var fi = new FileInfo(@"c:\temp\ReplaceXmlTest.xlsx");
if (fi.Exists)
fi.Delete();
orginalzip.ReplaceXmlString(fi, "AXCD", "REPLACED!!");
}
}
}
Gives this:
Just keep in mind that this is completely brute force. Anything you can do to make the file filter smarter, rather than simply processing ALL XML files, would be a very good thing. Maybe limit it to the sharedStrings.xml file if that is where the problem lies, or to the XML files in the worksheets folder. Hard to say without knowing more about the data.
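For the original problem (invalid XML characters rather than a known string), XmlDocument.Load would reject the bad parts before any replacement could happen, so the text has to be cleaned before it is parsed. The following is a sketch of that variation, not a tested fix for FlexMonster output: it copies every entry in order, and for .xml entries it replaces characters rejected by XmlConvert.IsXmlChar with a space. The names XlsxSanitizer, ReplaceInvalidXmlChars and SanitizeXmlEntries are hypothetical, and it assumes the XML parts are UTF-8 encoded.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml;

public static class XlsxSanitizer
{
    // Replaces characters that XML 1.0 does not allow with a space.
    public static string ReplaceInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < text.Length && XmlConvert.IsXmlSurrogatePair(text[i + 1], c))
            {
                // Valid surrogate pair (high, low): keep both halves.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            else
            {
                sb.Append(' ');
            }
        }
        return sb.ToString();
    }

    // Copies every zip entry in its original order, cleaning the .xml entries on the way.
    public static void SanitizeXmlEntries(Stream source, Stream destination)
    {
        using (var input = new ZipArchive(source, ZipArchiveMode.Read, leaveOpen: true))
        using (var output = new ZipArchive(destination, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in input.Entries)
            {
                ZipArchiveEntry copy = output.CreateEntry(entry.FullName);
                using (Stream from = entry.Open())
                using (Stream to = copy.Open())
                {
                    if (!entry.Name.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
                    {
                        from.CopyTo(to); // non-XML parts are copied byte for byte
                        continue;
                    }
                    // Assumption: XML parts are UTF-8 encoded.
                    string xml = new StreamReader(from, Encoding.UTF8).ReadToEnd();
                    byte[] cleaned = new UTF8Encoding(false).GetBytes(ReplaceInvalidXmlChars(xml));
                    to.Write(cleaned, 0, cleaned.Length);
                }
            }
        }
    }
}
```

This only handles raw characters; numeric character references to invalid code points would still need separate treatment.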

VSTO: how to save an interop doc (with custom parts and metadata) to memory

In order to create some custom metadata and back up what a user is doing in a word document to a server/database, I have created a VSTO application-level add-in, and used the DocumentBeforeSave event to hijack Word's default save functionality.
I would like to convert the current document into a binary blob or a complete openXML representation that contains the document, custom xml part, and with all the data that would be necessary to seamlessly open the same document from the server copy. I therefore need not just any custom XML parts I add, but information on Change Tracking and other metadata that is saved inside the document. My idea, accordingly, was to simply grab the saved blob that is created:
private void ThisAddIn_Startup(object sender, EventArgs e)
{
Application.DocumentBeforeSave += application_DocumentBeforeSave;
}
private void application_DocumentBeforeSave(Document doc, ref bool saveAsUI, ref bool cancel)
{
// generate some xml
string customPart = @"<foo>some xml here</foo>";
Office.CustomXMLPart rangeListXmlPart = doc.CustomXMLParts.Add(customPart, missing);
// suppress default save functionality
saveAsUI = false;
cancel = true;
// manually generate save dialog
Dialog dlg = Application.Dialogs[WdWordDialog.wdDialogFileSaveAs];
object oDlg = dlg;
object[] oArgs = new object[1];
oArgs[0] = @"C:\";
oDlg.GetType().InvokeMember("Name", BindingFlags.SetProperty, null, dlg, oArgs);
dlg.Show(ref missing);
// read in file blob
byte[] data = null;
FileInfo fileDetails = new FileInfo(doc.FullName);
long fileSize = fileDetails.Length;
FileStream fStream = new FileStream(doc.FullName, FileMode.Open, FileAccess.Read);
BinaryReader bReader = new BinaryReader(fStream);
data = bReader.ReadBytes((int) fileSize);
// send data up to the server, along with the file type
}
... but there has to be a more elegant solution to the problem, that doesn't require saving the document to disk, and then reading it back into memory, as this approach is inherently flawed: the document saving may happen many times, and it is not desirable to keep reading from the hard drive multiple times. It also would be helpful to implement this functionality at other times without saving the document to disk at all! Any thoughts would be greatly appreciated.

Get the WordOpenXML property from the document or range (it contains the flat OPC format of the document), then convert it to a DocX package as shown in http://blogs.msdn.com/b/ericwhite/archive/2008/09/29/transforming-flat-opc-format-to-open-xml-documents.aspx.
The result should be equivalent to saving as DocX, but can be done entirely in memory.
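To get a feel for what the property returns before converting it, the flat OPC XML can be inspected with LINQ to XML. This is only a sketch: it lists the name and content type of every pkg:part element, each of which carries its payload in either a pkg:xmlData or a pkg:binaryData (base64) child. The names FlatOpc and ListParts are hypothetical; the actual conversion into a package is what the linked article covers.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FlatOpc
{
    static readonly XNamespace Pkg = "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Returns (part name, content type) for each pkg:part in a Flat OPC string such as doc.WordOpenXML.
    public static List<KeyValuePair<string, string>> ListParts(string flatOpcXml)
    {
        return XDocument.Parse(flatOpcXml).Root
            .Elements(Pkg + "part")
            .Select(p => new KeyValuePair<string, string>(
                (string)p.Attribute(Pkg + "name"),
                (string)p.Attribute(Pkg + "contentType")))
            .ToList();
    }
}
```

Custom XML parts added through doc.CustomXMLParts should show up in this list alongside the main document part, which is a quick way to confirm the in-memory representation is complete.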

Using FileHelpers to import Excel data using MVC 5

I'm trying to write an application in MVC 5 that will accept a file specified by a user and upload that file information into the database. The file itself has multiple worksheets, which I think FileHelpers handles gracefully, but I can't find any good documentation about working with a byte array. I can get the file just fine, and get to my controller, but don't know where to go from there. I am currently doing this in the controller:
public ActionResult UploadFile(string filepath)
{
//we want to check here that the first file in the request is not null
if (Request.Files[0] != null)
{
var file = Request.Files[0];
byte[] data = new byte[file.ContentLength];
ParseInputFile(data);
//file.InputStream.Read(data, 0, data.Length);
}
ViewBag.Message = "Success!";
return View("Index");
}
private void ParseInputFile(byte[] data)
{
ExcelStorage provider = new ExcelStorage(typeof(OccupationalGroup));
provider.StartRow = 3;
provider.StartColumn = 2;
provider.FileName = "test.xlsx";
}
Am I able to use the Request like that in conjunction with FileHelpers? I just need to read the Excel file into the database. If not, should I be looking into a different way to handle the upload?

So, I decided instead to use ExcelDataReader to do my reading from Excel. It puts the stream (in the below code, test) into a DataSet that I can just manipulate manually. I'm sure it might not be the cleanest way to do it, but it made sense for me, and allows me to work with multiple worksheets fairly easily as well. Here is the snippet of regular code that I ended up using:
//test is a stream here that I get using reflection
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(test);
DataSet result = excelReader.AsDataSet();
while(excelReader.Read())
{
//process the file
}
excelReader.Close();
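If a byte array is still wanted, as in the question, the upload stream has to be read before it is parsed; in the question's code the array is allocated but never filled. A minimal sketch, with the hypothetical names UploadReader and ReadAllBytes:

```csharp
using System.IO;

public static class UploadReader
{
    // Copies the whole stream into memory; a single Stream.Read call may return
    // fewer bytes than requested, so CopyTo is used instead.
    public static byte[] ReadAllBytes(Stream input)
    {
        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

In the controller this could be called as UploadReader.ReadAllBytes(Request.Files[0].InputStream), and a new MemoryStream(data) could then be handed to a stream-based reader.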

Third-party tool to convert XML Excel spreadsheets into PDF

I'm using the CarlosAg ExcelXmlWriter library, which generates Excel 2003 XML spreadsheets (*.xml).
I need to find a commercial or free tool that converts this generated XML spreadsheet into PDF. I tried the SautinSoft library, but it didn't work with my extension (.xml); it only works with the .xlsx or .xls extensions.
Thanks in advance.

Try Aspose.
http://www.aspose.com/categories/.net-components/aspose.cells-for-.net/default.aspx
You might also need the PDF component, not sure how they do it now.

Can you simply use some pdf printer to do it?

Try a free solution (EPPlus): https://github.com/EPPlusSoftware/EPPlus
Or SpreadsheetLight: https://spreadsheetlight.com/
Another way:
static void ConvertFromStream()
{
    // The conversion process will be done completely in memory.
    string inpFile = @"..\..\..\example.xml";
    string outFile = @"ResultStream.pdf";
    byte[] inpData = File.ReadAllBytes(inpFile);
    byte[] outData = null;
    using (MemoryStream msInp = new MemoryStream(inpData))
    {
        // Load a document.
        DocumentCore dc = DocumentCore.Load(msInp, new XMLLoadOptions());
        // Save the document to PDF format.
        using (MemoryStream outMs = new MemoryStream())
        {
            dc.Save(outMs, new PdfSaveOptions());
            outData = outMs.ToArray();
        }
        // Show the result for demonstration purposes.
        if (outData != null)
        {
            File.WriteAllBytes(outFile, outData);
            System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(outFile) { UseShellExecute = true });
        }
    }
}
