C# Populate word template with OpenXml - c#

I would like to create a new Word document based on a template that contains content controls.
I need to fill those content controls with text.
I have only found how to generate a new Word document from scratch, not one based on a template.
Do you have a link or tutorial?
Or am I wrong to use OpenXml to fill a template?
// Create a Wordprocessing document.
using (WordprocessingDocument myDoc =
    WordprocessingDocument.Create("d:/dev/test.docx",
        WordprocessingDocumentType.Document))
{
    // Add a new main document part.
    MainDocumentPart mainPart = myDoc.AddMainDocumentPart();
    // Create the Document tree for a simple document.
    mainPart.Document = new Document();
    // Create the Body (this element contains
    // the other elements that we want to include).
    Body body = new Body();
    // Create a paragraph.
    Paragraph paragraph = new Paragraph();
    Run run_paragraph = new Run();
    // The text we want to put into the output document.
    Text text_paragraph = new Text("Hello World!");
    // Append the elements in order: text -> run -> paragraph -> body.
    run_paragraph.Append(text_paragraph);
    paragraph.Append(run_paragraph);
    body.Append(paragraph);
    mainPart.Document.Append(body);
    // Save changes to the main document part.
    mainPart.Document.Save();
}

Going with OpenXML is the right approach, because you can manipulate documents without needing MS Word installed (think server scenarios). However, its learning curve is steep and tedious. In my humble opinion, the best way is to ask for a budget to purchase a third-party toolkit and focus on the business domain rather than on OpenXML tweaks.
In your example, you have a template in Word and want to update (merge) it with data. I have done this with Docentric Toolkit for some of our scenarios and it works very well. Once you understand the basics of how it works, you can create solutions quickly. If you get stuck, the guys at DT won't let you down. See this link (http://www.codeproject.com/Articles/759408/Creating-Word-documents-in-Net-using-Docentric-Too) to get an idea of how you can use it. By default it creates .docx, but you can get fixed final documents (pdf or xps) as well.
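If you do stay with the Open XML SDK, filling a template can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes each content control in the template has a Tag set in Word's Developer tab, and the names TemplateFiller, Fill and FillControls are hypothetical helpers invented for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TemplateFiller
{
    // Copies the template to outputPath, then fills the copy so the
    // template file itself is never modified.
    public static void Fill(string templatePath, string outputPath,
                            IDictionary<string, string> values)
    {
        File.Copy(templatePath, outputPath, true);
        using (WordprocessingDocument doc = WordprocessingDocument.Open(outputPath, true))
        {
            FillControls(doc.MainDocumentPart, values);
        }
    }

    // Replaces the text of every content control whose Tag (assumption:
    // tags are used as keys) matches an entry in 'values'.
    public static void FillControls(MainDocumentPart mainPart,
                                    IDictionary<string, string> values)
    {
        foreach (SdtElement sdt in mainPart.Document.Descendants<SdtElement>().ToList())
        {
            string tag = sdt.SdtProperties?.GetFirstChild<Tag>()?.Val?.Value;
            if (tag == null || !values.TryGetValue(tag, out string value))
                continue;

            List<Text> texts = sdt.Descendants<Text>().ToList();
            if (texts.Count == 0)
                continue; // no existing run to reuse; left untouched in this sketch

            // Keep the first text node (and its run formatting), drop the rest.
            texts[0].Text = value;
            foreach (Text extra in texts.Skip(1))
                extra.Remove();

            // Tell Word the control no longer shows its placeholder prompt.
            sdt.SdtProperties.GetFirstChild<ShowingPlaceholder>()?.Remove();
        }
        mainPart.Document.Save();
    }
}
```

A call would look like TemplateFiller.Fill("template.docx", "out.docx", new Dictionary<string, string> { ["CustomerName"] = "Alice" }), where the file names and the tag are placeholders. Controls in headers and footers live in separate parts and are not touched by this sketch.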

Related

Word document not displaying header, footer and images when using WordProcessingDocument

I am using WordprocessingDocument to read and write content in a Word document, but when I open the document from a MemoryStream, the images and the header/footer that are already in the Word document are not shown. Below is the code.
private void AddReport(MainDocumentPart parent, MemoryStream report)
{
    using (MemoryStream editingMemoryStream = new MemoryStream())
    {
        report.Position = 0;
        report.CopyTo(editingMemoryStream);
        editingMemoryStream.Position = 0;
        using (WordprocessingDocument newDoc = WordprocessingDocument.Open(editingMemoryStream, true))
        {
            WP.Body Template = newDoc.MainDocumentPart.Document.Body;
            var Main = newDoc.MainDocumentPart;
            var cloneTemplate = Template.CloneNode(true);
            parent.Document.Body.PrependChild(new WP.Paragraph(new WP.Run(cloneTemplate)));
            parent.Document.Save();
        }
    }
}
Here, the parent document is the document to which I am prepending the document above. Any help will be appreciated. Thanks in advance.
The headers, footers and images are not part of the document body, so they won't be carried over to another document in the described scenario.
All this information is stored in separate "xml parts" contained within the Word file's "zip package". The Body part contains only references (relationship IDs listed in a "rel" part that point to the relevant xml part contained in the package).
This can be seen by opening the document in the Open XML SDK Productivity Tool and inspecting the underlying Word Open XML.
In order to copy such content to another document, it's necessary to clone not only the body but also every relevant xml part with content you want to keep, while dynamically generating the necessary relationships. This is not a trivial undertaking. There are posts here and elsewhere on the Internet (including my blog, the WordMeister) that demonstrate the basics of how this is done, which you could use as starting points for understanding the required approach.
Or, depending on what the parent document is, it might make more sense to start with a copy of the "new" document and edit it with the other content.
For the sake of completeness: the COM object model will do what is described. Copying the body of one document and pasting it into another does carry over all this information, because the Word application does the "heavy lifting" that the developer has to code by hand when using the Open XML SDK.
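For inline images only, the part-cloning idea described above can be sketched as follows. This is a rough sketch under stated assumptions, not a complete solution: BodyCopier and PrependBodyWithImages are hypothetical names, only DrawingML pictures (a:blip elements) are handled, and headers and footers are deliberately skipped by dropping the body-level section properties that reference them.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using A = DocumentFormat.OpenXml.Drawing;
using W = DocumentFormat.OpenXml.Wordprocessing;

public static class BodyCopier
{
    public static void PrependBodyWithImages(MainDocumentPart source, MainDocumentPart target)
    {
        // Clone the block-level children of the source body. The body-level
        // sectPr is skipped because it references header/footer parts that
        // this sketch does not copy.
        List<OpenXmlElement> clones = source.Document.Body.ChildElements
            .Where(e => !(e is W.SectionProperties))
            .Select(e => e.CloneNode(true))
            .ToList();

        foreach (OpenXmlElement clone in clones)
        {
            foreach (A.Blip blip in clone.Descendants<A.Blip>())
            {
                string oldId = blip.Embed?.Value;
                if (oldId == null) continue; // linked image, nothing embedded

                // Resolve the relationship ID in the SOURCE package.
                ImagePart sourceImage = source.Parts
                    .Where(p => p.RelationshipId == oldId)
                    .Select(p => p.OpenXmlPart)
                    .OfType<ImagePart>()
                    .FirstOrDefault();
                if (sourceImage == null) continue;

                // Create a matching part in the TARGET package and copy the bytes.
                ImagePart targetImage = target.AddImagePart(sourceImage.ContentType);
                using (Stream data = sourceImage.GetStream())
                {
                    targetImage.FeedData(data);
                }

                // Re-point the cloned markup at the new relationship ID.
                blip.Embed = target.GetIdOfPart(targetImage);
            }
        }

        // Prepend in reverse so the clones keep their original order.
        for (int i = clones.Count - 1; i >= 0; i--)
        {
            target.Document.Body.PrependChild(clones[i]);
        }
        target.Document.Save();
    }
}
```

Note that the clones are inserted directly into the target body rather than being wrapped in a new paragraph and run, since a body element is not valid inside a run. Styles, numbering, VML pictures and header/footer parts would each need the same copy-and-remap treatment.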

How can I use OpenXML SDK 2.5 to copy formulas from a word document?

I have to use OpenXML SDK 2.5 with C# to copy formulas from one Word document and append them to another Word document. I tried the code below. It ran successfully, but when I tried to open the file, Word said there was something wrong with the content. I opened it anyway, ignoring the warning, but the formulas were not displayed. They are just blank blocks.
My code:
private void CreateNewWordDocument(string document, Exercise[] exercices)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Create(document, WordprocessingDocumentType.Document))
    {
        // Set the content of the document so that Word can open it.
        MainDocumentPart mainPart = wordDoc.AddMainDocumentPart();
        SetMainDocumentContent(mainPart);
        foreach (Exercise ex in exercices)
        {
            wordDoc.MainDocumentPart.Document.Body.AppendChild(ex.toParagraph().CloneNode(true));
        }
        wordDoc.MainDocumentPart.Document.Save();
    }
}
// Set content of MainDocumentPart.
private void SetMainDocumentContent(MainDocumentPart part)
{
    string docXml =
@"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
<w:document xmlns:w=""http://schemas.openxmlformats.org/wordprocessingml/2006/main"">
<w:body><w:p><w:r><w:t>Exercise list!</w:t></w:r></w:p></w:body>
</w:document>";
    using (Stream stream = part.GetStream())
    {
        byte[] buf = (new UTF8Encoding()).GetBytes(docXml);
        stream.Write(buf, 0, buf.Length);
    }
}
This happens because not everything that can be referenced in the paragraph is copied when you clone the paragraph. The Word XML format consists of multiple files some of which reference each other. If you copy the paragraph from one document to another you need to also copy any relationships that may exist.
The OpenXML Productivity Tool is useful for diagnosing errors like these. You can open a document with the tool and ask it to validate the document.
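The same kind of check can also be run from code with the SDK's OpenXmlValidator class. The following is a minimal sketch; DocxValidation and PrintErrors are hypothetical names, and the validator checks the markup against the schema and a set of semantic rules, so it is not guaranteed to report every problem that Word itself complains about.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Validation;

public static class DocxValidation
{
    // Prints every validation error found in the document and returns the count.
    public static int PrintErrors(string path)
    {
        int count = 0;
        // Open read-only: validation does not need to modify the file.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            var validator = new OpenXmlValidator();
            foreach (ValidationErrorInfo error in validator.Validate(doc))
            {
                count++;
                // Part.Uri identifies the file inside the package,
                // Path.XPath the offending element within that part.
                Console.WriteLine("{0}: {1}", error.Part?.Uri, error.Description);
                Console.WriteLine("  at {0}", error.Path?.XPath);
            }
        }
        return count;
    }
}
```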
I created a test document that just contained a hyperlink and ran your code to copy the contents to another document. I too got an error when I attempted to load it using Word so I opened it in the Productivity Tool and saw the following output:
This shows that the hyperlink is stored as a relationship rather than inline in the paragraph and my new file references a relationship that doesn't exist. Unzipping the original file and the new file and comparing the two shows what is going on:
(Screenshots not reproduced: document.xml and .rels of the original file, then document.xml and .rels of the generated file.)
Note that in the generated file the hyperlink references relationship rId5, but that relationship doesn't exist in the generated document's relationship file.
It's worth noting that for simple source documents the code worked without issue as there are no relationships that require copying.
There are two ways that you can solve this. The easiest way is to copy only the text of the paragraph; you'll lose all styles, images, hyperlinks etc., but it is very simple. All you need to do is replace
    wordDoc.MainDocumentPart.Document.Body.AppendChild(ex.toParagraph().CloneNode(true));
with
    Paragraph para = wordDoc.MainDocumentPart.Document.Body.AppendChild(new Paragraph());
    Run run = para.AppendChild(new Run());
    run.AppendChild(new Text(ex.toParagraph().InnerText));
The more complex (and perhaps proper) way of achieving it is to find the relationships and copy them to the new document as well. The code for doing that is probably beyond the scope of what I can write here but there is an interesting article on the subject here http://blogs.msdn.com/b/ericwhite/archive/2009/02/05/move-insert-delete-paragraphs-in-word-processing-documents-using-the-open-xml-sdk.aspx.
Essentially, the author of that blog post uses the PowerTools for Open XML to find relationships and copy them from one document to another.
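For the hyperlink case from the test document above, copying the relationship by hand can be sketched as follows. This is a minimal sketch covering hyperlinks only; HyperlinkCopier and CloneWithHyperlinks are hypothetical names, and images, footnotes and other referenced parts would each need similar handling.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class HyperlinkCopier
{
    // Clones 'element' (typically a paragraph) and recreates in 'target'
    // every hyperlink relationship that the clone refers to.
    public static OpenXmlElement CloneWithHyperlinks(
        OpenXmlElement element, MainDocumentPart source, MainDocumentPart target)
    {
        OpenXmlElement clone = element.CloneNode(true);

        foreach (Hyperlink link in clone.Descendants<Hyperlink>())
        {
            string oldId = link.Id?.Value;
            if (oldId == null) continue; // bookmark link, no relationship involved

            // Look the ID up in the source document's relationships.
            HyperlinkRelationship original = source.HyperlinkRelationships
                .FirstOrDefault(r => r.Id == oldId);
            if (original == null) continue;

            // Create an equivalent relationship in the target and use its new ID.
            HyperlinkRelationship copy =
                target.AddHyperlinkRelationship(original.Uri, original.IsExternal);
            link.Id = copy.Id;
        }
        return clone;
    }
}
```

The returned clone can then be appended to the target body in place of the plain CloneNode(true) result.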

copy individual pages from word document to new document c#

I have a report that has hundreds of pages. I need to extract each individual page from this document into a new document. I have found that this is possible using INTEROP; however, I'm trying to avoid installing MS Office on the server. I've been using ASPOSE for most of the operations, but this functionality doesn't appear to be supported.
Is there a way to separate the pages of a document into individual files without having MS Office installed?
Aspose.Words does not have layout information like pages or line numbers. It maintains DOM. But we have written some utility classes to achieve such behavior. They split the word document into multiple sections, such that each page becomes one separate section. After that, it is easy to copy individual pages.
String sourceDoc = dataDir + "source.docx";
String destinationtDoc = dataDir + "destination.docx";
// Initialize the Document instance with source and destination documents
Document doc = new Document(sourceDoc);
Document dstDoc = new Document();
// Remove the blank default page from new document
dstDoc.RemoveAllChildren();
PageNumberFinder finder = new PageNumberFinder(doc);
// Split nodes across pages
finder.SplitNodesAcrossPages(true);
// Get separate page sections
ArrayList pageSections = finder.RetrieveAllNodesOnPages(1, 5, NodeType.Section);
foreach (Section section in pageSections)
{
    dstDoc.AppendChild(dstDoc.ImportNode(section, true));
}
dstDoc.LastSection.Body.LastParagraph.Remove();
dstDoc.Save(destinationtDoc);
The PageNumberFinder class can be downloaded from here.
PS. I am a Developer Evangelist at Aspose.

Using the OpenXml SDK 2.0 to insert tables in a word document

I am just starting out with the OpenXML SDK 2.0 in Visual Studio 2010 (C#). I have automated office programs before using COM automation, which was painful.
I have a template made by one of our graphic designers, which will provide the foundation for my reports. In order to automate the simple things (plaintext items) I have added content controls to the template and bound a custom XML part to the doc. The content controls are as follows:
DayCount
AlternateJobTitle
Date
SignatureName
After making a copy of the template, I then edit the content controls and save the file with the following code:
//stand up object that reads the Word doc package
using (WordprocessingDocument doc = WordprocessingDocument.Open(docOutputPath, true))
{
    //create XML string matching custom XML part
    string newXml = "<root>" +
        "<DayCount>42</DayCount>" +
        "<AlternateJobTitle>Supervisor</AlternateJobTitle>" +
        "<Date>9/24/2012</Date>" +
        "<SignatureName>John Doe</SignatureName>" +
        "</root>";
    MainDocumentPart main = doc.MainDocumentPart;
    main.DeleteParts<CustomXmlPart>(main.CustomXmlParts);
    //add and write new XML part
    CustomXmlPart customXml = main.AddCustomXmlPart(CustomXmlPartType.CustomXml);
    using (StreamWriter ts = new StreamWriter(customXml.GetStream()))
    {
        ts.Write(newXml);
    }
}
This all works well. However, my document is not made up solely of standard text and plaintext updates. The real meat of the report is in a number of tables that need to be added to each report as well. I have been searching like crazy for a good description on how this is done, but have really not found anything. Is there some way to delineate where to place a table using the same content control logic used for plaintext controls? Any code samples I have found of creating a table using OpenXML have just assumed that you want to append it to the end of the main document part. I would like to specify where the tables need to go in the template, generate the tables and place them in the specified regions of the template. Is this possible?
Any help is greatly appreciated.
There are a lot of questions about creating documents with OpenXml, and if you decide to take this path the general answer is the same: examine the Open XML SDK Productivity Tool. On my PC it is at "C:\Program Files (x86)\Open XML SDK\V2.0\tool\OpenXmlSdkTool.exe". Create the document you want in MS Word, open it in the tool, and use the tool to reflect the code that generates it. Good luck!
If you need to display tabled data, so far, the best thing I found is Word Document Generator at http://worddocgenerator.codeplex.com/.
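As for placing a table at a specific spot in the template, one possible approach is to put an empty block-level content control where the table should go and swap it for a generated table. The following is a sketch under that assumption, not a tested recipe for any particular template: TableInserter and ReplaceControlWithTable are hypothetical names, and the control is assumed to be identified by its Tag.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TableInserter
{
    // Replaces the first block-level content control tagged 'tag' with a
    // simple bordered table. Returns false if no such control exists.
    public static bool ReplaceControlWithTable(
        MainDocumentPart mainPart, string tag, IEnumerable<string[]> rows)
    {
        SdtBlock placeholder = mainPart.Document.Body.Descendants<SdtBlock>()
            .FirstOrDefault(s => s.SdtProperties?.GetFirstChild<Tag>()?.Val?.Value == tag);
        if (placeholder == null) return false;

        var table = new Table(
            new TableProperties(
                new TableBorders(
                    new TopBorder { Val = BorderValues.Single, Size = 4 },
                    new LeftBorder { Val = BorderValues.Single, Size = 4 },
                    new BottomBorder { Val = BorderValues.Single, Size = 4 },
                    new RightBorder { Val = BorderValues.Single, Size = 4 },
                    new InsideHorizontalBorder { Val = BorderValues.Single, Size = 4 },
                    new InsideVerticalBorder { Val = BorderValues.Single, Size = 4 })));

        foreach (string[] row in rows)
        {
            var tableRow = new TableRow();
            foreach (string cellText in row)
            {
                // Every cell must contain at least one paragraph.
                tableRow.Append(new TableCell(
                    new Paragraph(new Run(new Text(cellText)))));
            }
            table.Append(tableRow);
        }

        // Put the table where the control was, then drop the control.
        placeholder.InsertAfterSelf(table);
        placeholder.Remove();
        mainPart.Document.Save();
        return true;
    }
}
```

This keeps the designer in control of where tables appear, while the plain-text controls bound to the custom XML part continue to work as before.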

How can I add an external image to a word document using OpenXml?

I am trying to use C# and Open XML to insert an image from a url into a doc. The image may change so I don't want to download it, I want it to remain an external reference.
I've found several examples like this one that allow me to add a local image:
http://msdn.microsoft.com/en-us/library/bb497430.aspx
How can I adapt that to take a URI? Or is there another approach altogether?
You can add an external image to a Word document via a quick parts field.
For a description, please see the following answer on Super User.
To implement the described steps programmatically, you have to use an external relationship to include an image from a URL.
Here are the steps to accomplish this:
1. Create an instance of the Picture class.
2. Add a Shape to specify the style of the picture (width/height).
3. Use the ImageData class to specify the ID of the external relationship.
4. Add an external relationship to the main document part. Give the external relationship the same ID you specified in step 3.
The following code just implements the steps described above. The image is added to the
first paragraph in the word document.
using (WordprocessingDocument newDoc = WordprocessingDocument.Open(@"c:\temp\external_img.docx", true))
{
    var run = new Run();
    var picture = new Picture();
    // VML shape that sets the displayed size of the picture.
    var shape = new Shape() { Id = "_x0000_i1025", Style = "width:453.5pt;height:270.8pt" };
    // Refers to the external relationship added at the end.
    var imageData = new ImageData() { RelationshipId = "rId56" };
    shape.Append(imageData);
    picture.Append(shape);
    run.Append(picture);
    var paragraph = newDoc.MainDocumentPart.Document.Body.Elements<Paragraph>().FirstOrDefault();
    paragraph.Append(run);
    newDoc.MainDocumentPart.AddExternalRelationship(
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image",
        new System.Uri("<url to your picture>", System.UriKind.Absolute), "rId56");
}
In the code above I've omitted the code to define the shape type. I advise you to use a
tool like the OpenXML SDK Productivity Tool
to inspect a Word document that has an external relationship to an image.
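The code above hard-codes the relationship ID "rId56", which could collide with an ID that already exists in the document. As a hedged alternative, the two-argument overload of AddExternalRelationship lets the SDK generate the ID; ExternalImage and AddImageRelationship below are hypothetical helper names used only for this sketch.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;

public static class ExternalImage
{
    private const string ImageRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image";

    // Adds an external image relationship and returns the ID the SDK
    // generated for it, so the caller never has to invent one.
    public static string AddImageRelationship(MainDocumentPart mainPart, Uri imageUrl)
    {
        ExternalRelationship relationship =
            mainPart.AddExternalRelationship(ImageRelationshipType, imageUrl);
        return relationship.Id;
    }
}
```

The returned value would then be assigned to the ImageData element, for example new ImageData() { RelationshipId = relId }, which means the relationship has to be created before the ImageData element rather than after it.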
