C# OpenXML Word - How to create a VBA macro?

I'm trying to create a macro in a new macro-enabled Word file created with OpenXML. I guess I have to add a VbaProjectPart, but I cannot get any further.
The macro is stored in a string variable, for example:
string tmpMacro = "Private Sub Add_Pages()\nDim tmpPages As Integer\ntmpPages = Selection.Information(wdNumberOfPagesInDocument)\nSelection.EndKey Unit:= wdStory\nDo While Selection.Information(wdNumberOfPagesInDocument) < 10\nSelection.InsertBreak(wdPageBreak)\nLoop\nEnd Sub";
WordprocessingDocument tmpWD = WordprocessingDocument.Create("myDoc.docm", DocumentFormat.OpenXml.WordprocessingDocumentType.MacroEnabledDocument);
MainDocumentPart tmpMDP = tmpWD.AddMainDocumentPart();
tmpMDP.Document = new Document(new Body());
tmpWD.Close();

In OpenXML, a macro is stored as a binary part plus a few XML files that describe and reference it.
To verify this for yourself, create a new Word/Excel file, create a new macro, and save it as a macro-enabled document/workbook. Close the file and rename it to end with .zip.
In the main directory, you will find the file [Content_Types].xml, which contains two content-type entries for the VBA parts:
<Default Extension="bin"
ContentType="application/vnd.ms-office.vbaProject"/>
<Override PartName="/word/vbaData.xml"
ContentType="application/vnd.ms-word.vbaData+xml"/>
To follow these files, locate word/vbaData.xml, inside of which there will be something like:
<wne:vbaSuppData ...namespaces omitted... >
<wne:mcds>
<wne:mcd wne:macroName="PROJECT.NEWMACROS.MACRO1"
wne:name="Project.NewMacros.Macro1"
wne:bEncrypt="00"
wne:cmg="56"/>
</wne:mcds>
</wne:vbaSuppData>
This shows that there is a macro named Project.NewMacros.Macro1, but little else. So let's look inside word/_rels/document.xml.rels:
<Relationships ...namespaces omitted...>
<Relationship Id="rId1"
Type="http://schemas.microsoft.com/office/2006/relationships/vbaProject"
Target="vbaProject.bin"/>
...other relationships omitted...
</Relationships>
This points to word/vbaProject.bin, which is a binary file format.
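If you would rather not rename and unzip by hand, the same check can be sketched in code. This is only a minimal sketch using System.IO.Compression from the base class library; the class and method names (MacroPackageInspector, FindVbaEntries) are made up for illustration and are not part of any SDK.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class MacroPackageInspector
{
    // Hypothetical helper: returns the names of the ZIP entries that belong to
    // the VBA project (vbaProject.bin and vbaData.xml) in an Office package.
    public static string[] FindVbaEntries(Stream package)
    {
        // leaveOpen: true so that the caller keeps ownership of the stream.
        using (var zip = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            return zip.Entries
                .Select(e => e.FullName)
                .Where(n => n.EndsWith("vbaProject.bin", StringComparison.OrdinalIgnoreCase)
                         || n.EndsWith("vbaData.xml", StringComparison.OrdinalIgnoreCase))
                .ToArray();
        }
    }
}
```

Pass it a FileStream opened on the .docm file; an empty result means the package contains no VBA project parts.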
If you need to add this macro programmatically (e.g. you cannot set everything else up and then add the macro manually), you could create the macro manually in one document, and then programmatically copy the binary stream from that document's vbaProject.bin file into a new vbaProject.bin file.
If you decide to follow the stream copy approach, the answer to this question includes a snippet demonstrating one way to do so.
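For reference, here is a rough sketch of what the stream copy could look like with the Open XML SDK (the DocumentFormat.OpenXml NuGet package is assumed). The class and method names (VbaProjectCopier, CopyVbaProject) are hypothetical, and both files are assumed to be macro-enabled Word packages that already have a main document part.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class VbaProjectCopier
{
    // Hypothetical helper: copies the binary vbaProject.bin part from the
    // document at sourcePath into the document at targetPath.
    public static void CopyVbaProject(string sourcePath, string targetPath)
    {
        using (WordprocessingDocument source = WordprocessingDocument.Open(sourcePath, false))
        using (WordprocessingDocument target = WordprocessingDocument.Open(targetPath, true))
        {
            VbaProjectPart sourceVba = source.MainDocumentPart.VbaProjectPart;
            if (sourceVba == null)
                throw new InvalidOperationException("The source document contains no VBA project.");

            MainDocumentPart targetMain = target.MainDocumentPart;

            // Replace any VBA project that is already present in the target.
            if (targetMain.VbaProjectPart != null)
                targetMain.DeletePart(targetMain.VbaProjectPart);

            // AddNewPart also creates the relationship and content-type entry.
            VbaProjectPart targetVba = targetMain.AddNewPart<VbaProjectPart>();
            using (Stream data = sourceVba.GetStream(FileMode.Open, FileAccess.Read))
            {
                targetVba.FeedData(data);
            }
        }
    }
}
```

Note that this copies only the binary part. The word/vbaData.xml part described above is not copied here, so treat it as a starting point rather than a complete solution.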

Related

OpenXml. How to add creator using C# in docx?

I am trying to add some core properties to a docx document. I have found only one example, repeated in different places, of how it can be done.
For instance here. But there is a problem.
If we compare the structure of a docx created by the Word application with one created using OpenXml, there is a difference between them.
Structure of the docx created using OpenXml and document.PackageProperties.Creator = "vso"
Moreover, the file fails validation when I check it with the Productivity Tool from Microsoft. Of course, Word can read this file, but from my point of view it is not a proper way to generate a Word file.
Here you can see the structure of the docx created by the Word application itself
One more aspect: if I write the following:
CoreFilePropertiesPart corePackageProperties = document.CoreFilePropertiesPart;
if (corePackageProperties == null)
{
corePackageProperties = document.AddCoreFilePropertiesPart();
}
then the core.xml file is created in the proper place in the structure, but it is empty.
So, the question is: does the OpenXML SDK have a way to produce the same docx structure as the Word application itself?
The Microsoft documentation suggests:
using (XmlTextWriter writer = new XmlTextWriter(coreFilePropPart.GetStream(FileMode.Create), System.Text.Encoding.UTF8))
{
writer.WriteRaw("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<cp:coreProperties xmlns:cp=\"http://schemas.openxmlformats.org/package/2006/metadata/core-properties\"></cp:coreProperties>");
writer.Flush();
}
I had the same issue when creating an Excel file, and this sorted it out.
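The snippet above only writes an empty cp:coreProperties element. If you also want the creator in there, one possible way is to build the XML with System.Xml.Linq instead of a raw string. This is just a sketch: the class name CorePropertiesXml and its Build method are made up for illustration, while the two namespaces are the standard core-properties and Dublin Core ones.

```csharp
using System;
using System.Xml.Linq;

public static class CorePropertiesXml
{
    static readonly XNamespace Cp =
        "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";
    static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";

    // Hypothetical helper: builds a core.xml document containing only dc:creator.
    public static XDocument Build(string creator)
    {
        return new XDocument(
            new XDeclaration("1.0", "UTF-8", "yes"),
            new XElement(Cp + "coreProperties",
                // Declare the prefixes explicitly so the output uses cp: and dc:.
                new XAttribute(XNamespace.Xmlns + "cp", Cp.NamespaceName),
                new XAttribute(XNamespace.Xmlns + "dc", Dc.NamespaceName),
                new XElement(Dc + "creator", creator)));
    }
}
```

To use it with the part from the snippet above, save the result into the part's stream, for example with CorePropertiesXml.Build("vso").Save(coreFilePropPart.GetStream(FileMode.Create)).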

Copy entire content from one Word document to another using C#

I am trying to copy entire content (including page numbers, and page layout) from a word document to another, using Microsoft.Office.Interop.Word.
I cannot use SaveAs method because the document in which I want to paste the contents is already created and it contains VBA code.
Also, I cannot use XML related code because the document in which I am copying the content is in the older format. This document is part of an old way of uploading a document to a server database, using VBA code.
Using VBA code, I can copy the entire content without any issue.
Selection.WholeStory
Selection.Copy
Windows("document.doc").Activate
Selection.WholeStory
Selection.PasteAndFormat (wdFormatOriginalFormatting)
For C#, I used Microsoft.Office.Interop.Word to replicate the VBA code.
Word.Application objWordOpen = new Word.Application();
objWordOpen.Visible = false;
Word.Document doclocal = objWordOpen.Documents.Open(filepath);
doclocal.ActiveWindow.Selection.WholeStory();
doclocal.ActiveWindow.Selection.Copy();
Word.Document d1 = objWordOpen.Documents.Open(filepath2);
d1.Activate();
d1.ActiveWindow.Selection.WholeStory();
d1.ActiveWindow.Selection.PasteAndFormat(Word.WdRecoveryType.wdFormatOriginalFormatting);
I have also tried using range
Word.Range oRange = doclocal.Content;
oRange.Copy();
The content is copied into the document, but without headers and footers. Also, when using the Selection.WholeStory() approach, the page margin settings are not copied.
What changes should I make to the C# code in order to achieve my result?
MS Office applications have a complicated relationship with the clipboard. Between various optimisations that may lead to cryptic prompts and the numerous formats they support, it is best not to do anything remotely funny between a Copy and a Paste.
The VBA code follows this advice; the C# code does not, because it opens a document between copying and pasting.
Make sure you open both documents in advance and not in the middle of a copy and paste.
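As a rough illustration of that ordering, something along these lines should keep the Copy and the Paste back to back. It is an untested sketch that needs Word installed and a reference to Microsoft.Office.Interop.Word; the class and method names are made up. It only addresses the clipboard ordering, so it does not by itself bring over headers, footers or page margins.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class WordContentCopier
{
    // Hypothetical helper: replaces the main text of the target document
    // with the main text of the source document.
    public static void CopyContent(string sourcePath, string targetPath)
    {
        Word.Application app = new Word.Application();
        app.Visible = false;
        try
        {
            // Open BOTH documents before touching the clipboard.
            Word.Document source = app.Documents.Open(sourcePath);
            Word.Document target = app.Documents.Open(targetPath);

            // Copy and paste back to back, with nothing in between.
            source.Content.Copy();
            target.Content.PasteAndFormat(Word.WdRecoveryType.wdFormatOriginalFormatting);

            target.Save();
        }
        finally
        {
            // Quit without saving anything that is still unsaved (the source).
            app.Quit(false);
        }
    }
}
```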

How can I use OpenXML SDK 2.5 to copy formulas from a word document?

I have to use OpenXML SDK 2.5 with C# to copy formulas from one word document then append them to another word document. I tried the below code, it ran successfully but when I tried to open the file, it said there's something wrong with the content. I opened it ignoring the warning but those formulas were not displayed. They are just blank blocks.
My code:
private void CreateNewWordDocument(string document, Exercise[] exercices)
{
using (WordprocessingDocument wordDoc = WordprocessingDocument.Create(document, WordprocessingDocumentType.Document))
{
// Set the content of the document so that Word can open it.
MainDocumentPart mainPart = wordDoc.AddMainDocumentPart();
SetMainDocumentContent(mainPart);
foreach (Exercise ex in exercices)
{
wordDoc.MainDocumentPart.Document.Body.AppendChild(ex.toParagraph().CloneNode(true));
}
wordDoc.MainDocumentPart.Document.Save();
}
}
// Set content of MainDocumentPart.
private void SetMainDocumentContent(MainDocumentPart part)
{
string docXml =
@"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
<w:document xmlns:w=""http://schemas.openxmlformats.org/wordprocessingml/2006/main"">
<w:body><w:p><w:r><w:t>Exercise list!</w:t></w:r></w:p></w:body>
</w:document>";
using (Stream stream = part.GetStream())
{
byte[] buf = (new UTF8Encoding()).GetBytes(docXml);
stream.Write(buf, 0, buf.Length);
}
}
This happens because not everything that can be referenced in the paragraph is copied when you clone the paragraph. The Word XML format consists of multiple files some of which reference each other. If you copy the paragraph from one document to another you need to also copy any relationships that may exist.
The OpenXML Productivity Tool is useful for diagnosing errors like these. You can open a document with the tool and ask it to validate the document.
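If you prefer to run the check from code, the SDK also ships an OpenXmlValidator class. A minimal sketch, assuming the DocumentFormat.OpenXml package is referenced (DocxValidation and Validate are illustrative names, not SDK members):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Validation;

public static class DocxValidation
{
    // Hypothetical helper: returns one line per validation error,
    // in the form "<XPath of the offending element>: <description>".
    public static List<string> Validate(string path)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            var validator = new OpenXmlValidator();
            return validator.Validate(doc)
                .Select(e => e.Path?.XPath + ": " + e.Description)
                .ToList();
        }
    }
}
```

An empty list means the validator found nothing to complain about, which is not quite the same thing as a guarantee that Word will open the file.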
I created a test document that just contained a hyperlink and ran your code to copy the contents to another document. I too got an error when I attempted to load it using Word so I opened it in the Productivity Tool and saw the following output:
This shows that the hyperlink is stored as a relationship rather than inline in the paragraph and my new file references a relationship that doesn't exist. Unzipping the original file and the new file and comparing the two shows what is going on:
(Screenshots not preserved: document.xml and .rels of the original file, followed by document.xml and .rels of the generated file.)
Note that in the generated file the hyperlink references relationship rId5, but that relationship doesn't exist in the generated document's relationship file.
It's worth noting that for simple source documents the code worked without issue as there are no relationships that require copying.
There are two ways that you can solve this. The easiest way is to only copy the text of the paragraph (you'll lose all styles, images, hyperlinks etc) but it is very simple. All you need to do is change
wordDoc.MainDocumentPart.Document.Body.AppendChild(ex.toParagraph().CloneNode(true));
for
Paragraph para = wordDoc.MainDocumentPart.Document.Body.AppendChild(new Paragraph());
Run run = para.AppendChild(new Run());
run.AppendChild(new Text(ex.toParagraph().InnerText));
The more complex (and perhaps proper) way of achieving it is to find the relationships and copy them to the new document as well. The code for doing that is probably beyond the scope of what I can write here but there is an interesting article on the subject here http://blogs.msdn.com/b/ericwhite/archive/2009/02/05/move-insert-delete-paragraphs-in-word-processing-documents-using-the-open-xml-sdk.aspx.
Essentially, the author of that blog post uses the PowerTools for Open XML to find relationships and copy them from one document to another.
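To give an idea of what copying a relationship involves, here is a sketch for the hyperlink case only, using the plain Open XML SDK rather than the PowerTools. The names ParagraphCopier and CopyWithHyperlinks are made up, and other relationship types (images, embedded objects and so on) would each need similar handling that is not shown here.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class ParagraphCopier
{
    // Hypothetical helper: clones a paragraph from sourcePart into targetPart
    // and recreates the hyperlink relationships that the clone refers to.
    public static Paragraph CopyWithHyperlinks(
        Paragraph sourceParagraph, MainDocumentPart sourcePart, MainDocumentPart targetPart)
    {
        var clone = (Paragraph)sourceParagraph.CloneNode(true);

        foreach (Hyperlink link in clone.Descendants<Hyperlink>())
        {
            // Links to bookmarks inside the document carry no relationship id.
            if (link.Id == null) continue;

            HyperlinkRelationship sourceRel = sourcePart.HyperlinkRelationships
                .FirstOrDefault(r => r.Id == link.Id.Value);
            if (sourceRel == null) continue;

            // Create an equivalent relationship in the target and repoint the clone at it.
            HyperlinkRelationship targetRel =
                targetPart.AddHyperlinkRelationship(sourceRel.Uri, sourceRel.IsExternal);
            link.Id = targetRel.Id;
        }

        targetPart.Document.Body.AppendChild(clone);
        return clone;
    }
}
```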

Read file details (not language dependent)

In C#, I would like to read file details from a specific file.
I've found an interesting thread: Read/Write 'Extended' file properties (C#).
It uses a call to the GetDetailsOf() method on the folder shell object included in shell32.dll.
It works fine, but I have an issue: the header string depends on the operating system language ('Name' for the filename property on an English Windows, 'Nom' on a French Windows).
So, it's not easy to retrieve specific values with the name of the property as it changes according to the language...
Is there a way to handle this easily?
Some properties are available through the FileInfo object. For example, if you want the creation time of the file you can do:
FileInfo myFileInfo = new FileInfo(@"C:\path\to\file");
DateTime ftime = myFileInfo.CreationTime;
Is the FileInfo class not enough for your needs ?
FileInfo info = new FileInfo("fileName");
var name = info.Name;
var creationTime = info.CreationTime;
// etc ...
If not, tell us more about which properties you'd like to read from your file.
Update to my answer:
I don't know of a library that can read the properties of any type of document.
But here are a few ways for the formats you mentioned:
PDF :
Extracting Additional Metadata from a PDF using iTextSharp
Read/Modify PDF Metadata using iTextSharp
So, iText ® is a library that allows you to create and manipulate PDF documents (from their website)
Office: (the first link, from MS, states that it applies to Word as well as Excel documents)
How to: Read from and Write to Document Properties
Listing properties of a word document in C#

Build Word Document from template

I have a request to create a word document on the fly based on a template provided to me. I have done some research and everything seems to point at OpenXML. I have looked into that, but the cs file that gets created is over 15k lines and is breaking my VS 2010 (causing it to not respond every time I make a change).
I have been looking at this tutorial series on Open XML
http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2011/10/13/getting-started-with-open-xml-development.aspx
I have done things in the past with text files and regular expressions, but since Word encrypts everything, that does not work. Are there any other fairly lightweight options for creating Word documents from templates?
//Hi, It is quite simple.
//First, you should copy your Template file into another location.
string SourcePath = "C:\\MyTemplate.dotx";
string DestPath = "C:\\MyDocument.docx";
System.IO.File.Copy(SourcePath, DestPath);
//After copying the file, you can open a WordprocessingDocument using your Destination Path.
WordprocessingDocument mydoc = WordprocessingDocument.Open(DestPath, true);
//After opening your document, you can change the type of your document after adding additional parts into your document.
mydoc.ChangeDocumentType(WordprocessingDocumentType.Document);
//If you wish, you can edit your document
AttachedTemplate attachedTemplate1 = new AttachedTemplate() { Id = "MyRelationID" };
MainDocumentPart mainPart = mydoc.MainDocumentPart;
DocumentSettingsPart MySettingsPart = mainPart.DocumentSettingsPart;
MySettingsPart.Settings.Append(attachedTemplate1);
//Point the attached-template relationship at the template file.
MySettingsPart.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate", new Uri(SourcePath, UriKind.Absolute), "MyRelationID");
//Finally you can save your document.
mainPart.Document.Save();
//Dispose the document so that the package is written to disk.
mydoc.Dispose();
I am currently working on something along these lines, and I have been making use of the Open XML SDK and the OpenXmlPowerTools. The approach is to take the actual template file, open it up and put text into various placeholders within the template document. I have been using content controls as the placeholders.
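To make the content-control idea a bit more concrete, here is a minimal sketch of filling one placeholder with the Open XML SDK. It assumes each content control in the template has been given a tag (Properties > Tag in Word's Developer tab); TemplateFiller and FillContentControl are illustrative names, not SDK or PowerTools members.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TemplateFiller
{
    // Hypothetical helper: replaces the text of every content control whose
    // tag equals tagName and returns how many controls were updated.
    public static int FillContentControl(MainDocumentPart mainPart, string tagName, string value)
    {
        int updated = 0;
        foreach (SdtElement sdt in mainPart.Document.Descendants<SdtElement>())
        {
            SdtProperties props = sdt.GetFirstChild<SdtProperties>();
            Tag tag = props == null ? null : props.GetFirstChild<Tag>();
            if (tag == null || tag.Val == null || tag.Val.Value != tagName) continue;

            var texts = sdt.Descendants<Text>().ToList();
            if (texts.Count == 0) continue; // nothing to write into

            // Put the value into the first text node and blank out the others.
            texts[0].Text = value;
            foreach (Text extra in texts.Skip(1)) extra.Text = string.Empty;
            updated++;
        }
        mainPart.Document.Save();
        return updated;
    }
}
```

Typical use would be to copy the template, open the copy with WordprocessingDocument.Open(path, true), and call FillContentControl(doc.MainDocumentPart, "CustomerName", "Alice") once per placeholder ("CustomerName" being whatever tag your template uses).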
The SDK tool for opening up a document has been invaluable for comparing documents and seeing how they are constructed. However, I have been heavily refactoring the code generated by the tool and removing sections that are not used at all.
I can't talk about doc files, but docx files are not encrypted; they are just zip files that contain XML files.
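You can see this for yourself without any Office library at all. The sketch below uses only System.IO.Compression to pull the main document XML out of a docx; DocxXmlReader and ReadMainDocumentXml are made-up names for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DocxXmlReader
{
    // Hypothetical helper: returns the raw XML of word/document.xml,
    // or null if the package has no such entry.
    public static string ReadMainDocumentXml(Stream docx)
    {
        // leaveOpen: true so that the caller keeps ownership of the stream.
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null) return null;

            using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

The XML you get back is plain WordprocessingML, but Word tends to split text across several runs, which is why simple search-and-replace on it is fragile.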
Eric White's blog has a large number of examples and code samples, which have been very useful.
