I am trying to add some core properties to a Docx document. I have found only one example, repeated in several places, of how this can be done.
For instance here. But there is a problem.
If we compare the structure of a Docx created by the Word application with one created using OpenXml, there is a difference between them.
Structure of the docx created using OpenXml and document.PackageProperties.Creator = "vso"
Moreover, the file fails validation when I check it with the Productivity Tool from Microsoft. Of course, Word can read this file, but from my point of view it is not a proper way to generate a Word file.
Here you can see the structure of the docx created by the Word application itself
One more aspect: if I write the following:
CoreFilePropertiesPart corePackageProperties = document.CoreFilePropertiesPart;
if (corePackageProperties == null)
{
    corePackageProperties = document.AddCoreFilePropertiesPart();
}
then the core.xml file is created in the proper place in the structure, but it is empty.
So, the question is: does the OpenXML SDK have a way to produce the same docx structure as the Word application itself?
Microsoft documentation suggests:
using (XmlTextWriter writer = new XmlTextWriter(coreFilePropPart.GetStream(FileMode.Create), System.Text.Encoding.UTF8))
{
    // The core-properties namespace URI uses the http scheme, not https.
    writer.WriteRaw("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<cp:coreProperties xmlns:cp=\"http://schemas.openxmlformats.org/package/2006/metadata/core-properties\"></cp:coreProperties>");
    writer.Flush();
}
I had the same issue when creating an Excel file and this sorted it out.
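As a follow-up to the empty core.xml described above, here is a minimal sketch of filling the part with a creator value instead of an empty root element. It assumes the Dublin Core `dc:creator` element is the property you want; the class and method names (`CorePropertiesSketch`, `BuildCoreXml`, `WriteCreator`) are hypothetical, and I have not checked the result against the Productivity Tool validator.

```csharp
using System.IO;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper, not part of the Open XML SDK.
static class CorePropertiesSketch
{
    static readonly XNamespace Cp =
        "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";
    static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";

    // Builds a core.xml document that contains only a dc:creator element.
    public static XDocument BuildCoreXml(string creator)
    {
        return new XDocument(
            new XDeclaration("1.0", "UTF-8", "yes"),
            new XElement(Cp + "coreProperties",
                new XAttribute(XNamespace.Xmlns + "cp", Cp),
                new XAttribute(XNamespace.Xmlns + "dc", Dc),
                new XElement(Dc + "creator", creator)));
    }

    // Creates the core properties part if it is missing, then overwrites its content.
    public static void WriteCreator(WordprocessingDocument document, string creator)
    {
        CoreFilePropertiesPart part =
            document.CoreFilePropertiesPart ?? document.AddCoreFilePropertiesPart();
        using (Stream stream = part.GetStream(FileMode.Create, FileAccess.Write))
        {
            BuildCoreXml(creator).Save(stream);
        }
    }
}
```

Usage would be `CorePropertiesSketch.WriteCreator(document, "vso");` on a document opened for editing. It is probably safest not to combine this with `document.PackageProperties` in the same file, since both describe the same metadata.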
I am using WordprocessingDocument to read and write content to a Word document, but when I open the document using a MemoryStream, it does not show the images and header/footer that are already in the Word document. Below is the code.
private void AddReport(MainDocumentPart parent, MemoryStream report)
{
    using (MemoryStream editingMemoryStream = new MemoryStream())
    {
        report.Position = 0;
        report.CopyTo(editingMemoryStream);
        editingMemoryStream.Position = 0;
        using (WordprocessingDocument newDoc = WordprocessingDocument.Open(editingMemoryStream, true))
        {
            WP.Body Template = newDoc.MainDocumentPart.Document.Body;
            var Main = newDoc.MainDocumentPart;
            var cloneTemplate = Template.CloneNode(true);
            parent.Document.Body.PrependChild(new WP.Paragraph(new WP.Run(cloneTemplate)));
            parent.Document.Save();
        }
    }
}
[Screenshot of the Word document]
Here, the Parent document is the document to which I am prepending the above document. Any help will be appreciated. Thanks in advance.
The headers, footers and images are not part of the document body, so won't be carried over to another document in the described scenario.
All this information is stored in separate "xml parts" contained within the Word file's "zip package". The Body part contains only references (relationship IDs listed in a "rel" part that point/link to the relevant xml part contained in the package).
This can be seen by opening the document in the Open XML SDK Productivity Tool and inspecting the underlying Word Open XML.
In order to copy such content to another document it's necessary to clone not only the body, but also each and every relevant xml part with content you want to have, while dynamically generating the necessary relationships. This is not a trivial undertaking. There are posts here and elsewhere on the Internet (including my blog, the WordMeister) that demonstrate the basics of how this is done, which you could use as starting points for understanding the required approach.
Or, depending on what the Parent document is, it might make more sense to start with a copy of the "new" document and edit it with the other content.
FWIW, and mentioned here for the sake of completeness: the COM object model will do what is described. Copying the body of the document and pasting it into another document does carry over all this information. But the Word application is doing all the "heavy lifting" that the developer needs to code when using the Open XML SDK.
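To see from code which separate parts the main document points to (headers, footers, images and so on), the relationships can be listed with the SDK. This is only a sketch for inspection; the class and method names are made up for illustration, and it does not copy anything.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper for inspecting a document; not part of the Open XML SDK.
static class PartListingSketch
{
    // Returns one line per part related to the main document part:
    // relationship id, part URI inside the package, and the SDK part type.
    public static List<string> ListMainDocumentParts(string path)
    {
        var lines = new List<string>();
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            foreach (IdPartPair pair in doc.MainDocumentPart.Parts)
            {
                lines.Add(pair.RelationshipId + " -> " + pair.OpenXmlPart.Uri +
                          " (" + pair.OpenXmlPart.GetType().Name + ")");
            }
        }
        return lines;
    }
}
```

Every id in that list that is referenced from the body you clone would need a matching part and relationship in the target document.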
I have to use OpenXML SDK 2.5 with C# to copy formulas from one Word document and append them to another Word document. I tried the code below; it ran successfully, but when I tried to open the file, Word said there's something wrong with the content. I opened it ignoring the warning, but the formulas were not displayed. They are just blank blocks.
My code:
private void CreateNewWordDocument(string document, Exercise[] exercices)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Create(document, WordprocessingDocumentType.Document))
    {
        // Set the content of the document so that Word can open it.
        MainDocumentPart mainPart = wordDoc.AddMainDocumentPart();
        SetMainDocumentContent(mainPart);
        foreach (Exercise ex in exercices)
        {
            wordDoc.MainDocumentPart.Document.Body.AppendChild(ex.toParagraph().CloneNode(true));
        }
        wordDoc.MainDocumentPart.Document.Save();
    }
}
// Set content of MainDocumentPart.
private void SetMainDocumentContent(MainDocumentPart part)
{
    string docXml =
@"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
<w:document xmlns:w=""http://schemas.openxmlformats.org/wordprocessingml/2006/main"">
<w:body><w:p><w:r><w:t>Exercise list!</w:t></w:r></w:p></w:body>
</w:document>";
    using (Stream stream = part.GetStream())
    {
        byte[] buf = (new UTF8Encoding()).GetBytes(docXml);
        stream.Write(buf, 0, buf.Length);
    }
}
This happens because not everything that can be referenced in the paragraph is copied when you clone the paragraph. The Word XML format consists of multiple files some of which reference each other. If you copy the paragraph from one document to another you need to also copy any relationships that may exist.
The OpenXML Productivity Tool is useful for diagnosing errors like these. You can open a document with the tool and ask it to validate the document.
I created a test document that just contained a hyperlink and ran your code to copy the contents to another document. I too got an error when I attempted to load it using Word so I opened it in the Productivity Tool and saw the following output:
This shows that the hyperlink is stored as a relationship rather than inline in the paragraph and my new file references a relationship that doesn't exist. Unzipping the original file and the new file and comparing the two shows what is going on:
document.xml from original:
.rels of original
document.xml of generated file
.rels of generated file
Note that in the generated file the hyperlink references relationship rId5, but that doesn't exist in the generated document's relationship file.
It's worth noting that for simple source documents the code worked without issue as there are no relationships that require copying.
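The same comparison can be scripted without the Productivity Tool. The sketch below uses only the .NET base class library and rests on two assumptions: the main part lives at word/document.xml (the usual location in files produced by Word), and every attribute in the `r:` relationships namespace holds a relationship id. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical diagnostic helper; uses only the .NET base class library.
static class DanglingRelationshipCheck
{
    static readonly XNamespace R =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static readonly XNamespace Pkg =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Returns ids used in document.xml (r:id, r:embed, ...) that the .rels file does not define.
    public static List<string> FindMissingIds(XDocument documentXml, XDocument relsXml)
    {
        var defined = new HashSet<string>(
            relsXml.Root.Elements(Pkg + "Relationship").Select(e => (string)e.Attribute("Id")));
        return documentXml.Descendants()
            .SelectMany(e => e.Attributes())
            .Where(a => a.Name.Namespace == R)   // assumption: all r: attributes are ids
            .Select(a => a.Value)
            .Where(id => !defined.Contains(id))
            .Distinct()
            .ToList();
    }

    // Assumes the main part is stored at word/document.xml.
    public static List<string> FindMissingIds(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            XDocument doc = Load(zip.GetEntry("word/document.xml"));
            ZipArchiveEntry relsEntry = zip.GetEntry("word/_rels/document.xml.rels");
            XDocument rels = relsEntry != null
                ? Load(relsEntry)
                : new XDocument(new XElement(Pkg + "Relationships"));
            return FindMissingIds(doc, rels);
        }
    }

    static XDocument Load(ZipArchiveEntry entry)
    {
        using (Stream s = entry.Open())
        {
            return XDocument.Load(s);
        }
    }
}
```

For the generated file described above, `FindMissingIds` should report the hyperlink's rId5, because the paragraph was cloned but its relationship was not.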
There are two ways that you can solve this. The easiest way is to copy only the text of the paragraph; you'll lose all styles, images, hyperlinks, etc., but it is very simple. All you need to do is change
wordDoc.MainDocumentPart.Document.Body.AppendChild(ex.toParagraph().CloneNode(true));
for
Paragraph para = wordDoc.MainDocumentPart.Document.Body.AppendChild(new Paragraph());
Run run = para.AppendChild(new Run());
run.AppendChild(new Text(ex.toParagraph().InnerText));
The more complex (and perhaps proper) way of achieving it is to find the relationships and copy them to the new document as well. The code for doing that is probably beyond the scope of what I can write here, but there is an interesting article on the subject here: http://blogs.msdn.com/b/ericwhite/archive/2009/02/05/move-insert-delete-paragraphs-in-word-processing-documents-using-the-open-xml-sdk.aspx
Essentially, the author of that blog post is using the PowerTools for Open XML to find relationships and copy them from one document to another.
I am opening existing .docx files from a SharePoint Document Library over the SharePoint web services, and am attempting to attach a new Template to them. The current code for this piece seems to not be doing anything at all.
XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
OpenXmlPart documentSettingsPart = document.MainDocumentPart.DocumentSettingsPart;
XDocument documentSettingsXDoc = documentSettingsPart.GetXDocument();
documentSettingsPart.AddExternalRelationship("http://schemas.openxmlformats/org/officeDocument/2006/relationships/attachedTemplate", new Uri(_outLibraryTemplate, UriKind.Absolute));
using (XmlWriter xw = XmlWriter.Create(documentSettingsPart.GetStream(FileMode.Create, FileAccess.Write)))
documentSettingsXDoc.Save(xw);
Does anyone have any thoughts as to why this isn't working - and what I need to do to get this going?
This may help. It creates a new docx file from a dotx file.
I modified it a little for my own use: I added the external relationship (a dotm) to an existing file. Unfortunately, I can't work out yet whether I can easily update the styles programmatically without having to actually open the file.
https://web.archive.org/web/20150716111136/http://blogs.msdn.com/b/vsod/archive/2012/02/18/how-to-create-a-document-from-a-template-dotx-dotm-and-attach-to-it-using-open-xml-sdk.aspx
I am just starting out with the OpenXML SDK 2.0 in Visual Studio 2010 (C#). I have automated office programs before using COM automation, which was painful.
I have a template made by one of our graphic designers, which will provide the foundation for my reports. In order to automate the simple things (plaintext items) I have added content controls to the template and bound a custom XML part to the doc. The content controls are as follows:
DayCount
AlternateJobTitle
Date
SignatureName
After making a copy of the template, I then edit the content controls and save the file with the following code:
//stand up object that reads the Word doc package
using (WordprocessingDocument doc = WordprocessingDocument.Open(docOutputPath, true))
{
    //create XML string matching custom XML part
    string newXml = "<root>" +
        "<DayCount>42</DayCount>" +
        "<AlternateJobTitle>Supervisor</AlternateJobTitle>" +
        "<Date>9/24/2012</Date>" +
        "<SignatureName>John Doe</SignatureName>" +
        "</root>";
    MainDocumentPart main = doc.MainDocumentPart;
    main.DeleteParts<CustomXmlPart>(main.CustomXmlParts);
    //add and write new XML part
    CustomXmlPart customXml = main.AddCustomXmlPart(CustomXmlPartType.CustomXml);
    using (StreamWriter ts = new StreamWriter(customXml.GetStream()))
    {
        ts.Write(newXml);
    }
}
This all works well. However, my document is not made up solely of standard text and plaintext updates. The real meat of the report is in a number of tables that need to be added to each report as well. I have been searching like crazy for a good description on how this is done, but have really not found anything. Is there some way to delineate where to place a table using the same content control logic used for plaintext controls? Any code samples I have found of creating a table using OpenXML have just assumed that you want to append it to the end of the main document part. I would like to specify where the tables need to go in the template, generate the tables and place them in the specified regions of the template. Is this possible?
Any help is greatly appreciated.
There are a lot of OpenXml creation questions. If you decide to take this path, the general answer is: examine the Open XML SDK Productivity Tool. On my PC it can be found at "C:\Program Files (x86)\Open XML SDK\V2.0\tool\OpenXmlSdkTool.exe". Just create in MS Word the document you want to generate with OpenXml, then reflect the document's code using this tool. Good luck!
If you need to display tabular data, the best thing I have found so far is Word Document Generator at http://worddocgenerator.codeplex.com/.
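On the placement part of the question, one possible approach is to put a block-level content control in the template where each table should go, and swap it for a generated table. The following is a rough sketch under that assumption, not a tested recipe for any particular template: the helper name and the tag value are hypothetical, and the table has no borders or styling.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper; assumes the template marks the spot with a block-level
// content control (w:sdt) whose Tag equals the value passed in.
static class TablePlaceholderSketch
{
    public static bool ReplaceWithTable(Body body, string tag, string[][] rows)
    {
        // Find the first block-level content control carrying the requested tag.
        SdtBlock target = body.Descendants<SdtBlock>().FirstOrDefault(sdt =>
        {
            Tag t = sdt.SdtProperties == null ? null : sdt.SdtProperties.GetFirstChild<Tag>();
            return t != null && t.Val != null && t.Val.Value == tag;
        });
        if (target == null)
        {
            return false;
        }

        // Build a plain table; every cell needs at least one paragraph.
        Table table = new Table();
        foreach (string[] row in rows)
        {
            TableRow tableRow = new TableRow();
            foreach (string cellText in row)
            {
                tableRow.Append(new TableCell(new Paragraph(new Run(new Text(cellText)))));
            }
            table.Append(tableRow);
        }

        // Put the table where the content control was, then drop the control.
        target.InsertAfterSelf(table);
        target.Remove();
        return true;
    }
}
```

It would be called as `TablePlaceholderSketch.ReplaceWithTable(doc.MainDocumentPart.Document.Body, "ReportTable", data);` followed by `doc.MainDocumentPart.Document.Save();`, where "ReportTable" stands for whatever tag the designer gave the control.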
I have a request to create a Word document on the fly based on a template provided to me. I have done some research and everything seems to point at OpenXML. I have looked into that, but the .cs file that gets generated is over 15k lines and is breaking my VS 2010 (causing it to stop responding every time I make a change).
I have been looking at this tutorial series on Open XML
http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2011/10/13/getting-started-with-open-xml-development.aspx
I have done things in the past with text files and regular expressions, but since Word encrypts everything, that does not work. Are there any other fairly lightweight options for creating Word documents from templates?
// Hi, it is quite simple.
using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// First, copy your template file to another location.
string sourcePath = "C:\\MyTemplate.dotx";
string destPath = "C:\\MyDocument.docx";
System.IO.File.Copy(sourcePath, destPath);
// After copying the file, open a WordprocessingDocument using the destination path.
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(destPath, true))
{
    // After opening the document, change its type from template to document.
    myDoc.ChangeDocumentType(WordprocessingDocumentType.Document);
    // If you wish, you can edit your document, e.g. attach the template to it.
    AttachedTemplate attachedTemplate1 = new AttachedTemplate() { Id = "MyRelationID" };
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    DocumentSettingsPart settingsPart = mainPart.DocumentSettingsPart;
    settingsPart.Settings.Append(attachedTemplate1);
    // The original passed an undeclared CopyPath here; the template path is assumed.
    settingsPart.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate", new Uri(sourcePath, UriKind.Absolute), "MyRelationID");
    // Finally, save your settings and your document.
    settingsPart.Settings.Save();
    mainPart.Document.Save();
}
I am currently working on something along these lines, and I have been making use of the Open XML SDK and the OpenXmlPowerTools. The approach being taken is to open the actual template file and put text into various placeholders within the template document. I have been using content controls as the place markers.
The SDK tool for opening a document has been invaluable for comparing documents and seeing how they are constructed. However, I have been heavily refactoring the code generated by the tool and removing sections that are not being used at all.
I can't speak for doc files, but docx files are not encrypted; they are just zip files that contain xml files.
Eric White's blog has a large number of examples and code samples, which have been very useful.
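The point that a docx is a plain zip archive of xml files is easy to check from code. A small sketch using only the .NET base class library (the class and method names are made up for illustration):

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper; uses only the .NET base class library.
static class DocxZipSketch
{
    // Returns the names of all entries in the package, e.g. "word/document.xml".
    public static List<string> ListEntries(Stream docxStream)
    {
        using (var zip = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            return zip.Entries.Select(e => e.FullName).ToList();
        }
    }
}
```

Calling it on a stream from `File.OpenRead("C:\\MyDocument.docx")` lists the xml parts in the package, any of which can then be opened and read as ordinary text.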