Programatically filling content controls in Word document (OpenXML) in .NET

Programatically filling content controls in Word document (OpenXML) in .NET - c#

I have a really simple word document with Content Controls (all text).
I want to loop through the controls, filling them with values from a dictionary. Should be super simple, but something is wrong:
var myValues = new Dictionary<string, string>(); //And fill it
using (var wordDoc = WordprocessingDocument.Open(outputFile, true))
{
MainDocumentPart mainPart = wordDoc.MainDocumentPart;
foreach(SdtElement sdt in mainPart.Document.Descendants<SdtElement>())
{
SdtAlias alias = sdt.Descendants<SdtAlias>().FirstOrDefault();
if (alias != null)
{
string sdtTitle = alias.Val.Value;
sdt.??? = myValues[sdtTitle];
}
}
mainPart.Document.Save();
}
How do I write my value into the document?
Do I need a CustomXmlPart?

If you are going to do something like that, you'll need to write suitable content into the Sdt's SdtContent: a paragraph or a run or a tc etc depending on the sdt's parent element.
The alternative is to put the contents of your dictionary into a CustomXml part, and set up databindings on each content control which refer to the relevant dictionary element. Word will then resolve the bindings when the docx is first opened (which is not much good to you if you expect your users to open it with something else).

You can use this code.
foreach (SdtElement sdt in mainPart.Document.Descendants<SdtElement>())
{
SdtAlias alias = sdt.Descendants<SdtAlias>().FirstOrDefault();
if (alias != null)
{
string sdtTitle = alias.Val.Value;
Text t = sdt.Descendants<Text>().First();
t.Text = "test";
}
}

Related

C# word document

I want to read a word document without opening it. Then replace some text in it and then save it with another name in the same format using C#. But the document is not being saved with the changes but int the same way as the original document. Any help will be appreciated. Thanks in advance.
string path = Server.MapPath("~/CustomerDocument/SampleNDA5.docx");
WordprocessingDocument wordprocessingDocument = WordprocessingDocument.Open(path, false);
{
var body = wordprocessingDocument.MainDocumentPart.Document.Body;
foreach (var text in body.Descendants<Text>())
{
if (text.Text.Contains("<var_Date>"))
{
text.Text = text.Text.Replace("<var_Date>", DateTime.Now.ToString("MMM dd,yyyy"));
}
}
wordprocessingDocument.SaveAs(Server.MapPath("~/CustomerDocument/SampleNDA10.docx"));
wordprocessingDocument.Close();
}

I think this should help. It explains how to put edits back into a collection.
https://www.codeproject.com/Articles/87616/List-T-ForEach-or-Foreach-It-Doesn-t-Matter-Or-Doe

So if your word template is the same each time you essentially
Copy The Template
Work On The Template
Save In Desired Format
Delete Template Copy
Each of the sections that you are replacing within your word document you have to insert a bookmark for that location (easiest way to input text in an area).
I always create a function to accomplish this, and I end up passing in the path - as well as all of the text to replace my in-document bookmarks. The function call can get long sometimes, but it works for me.
Application app = new Application();
Document doc = app.Documents.Open("sDocumentCopyPath.docx");
if (doc.Bookmarks.Exists("bookmark_1"))
{
object oBookMark = "bookmark_1";
doc.Bookmarks.get_Item(ref oBookMark).Range.Text =
"My Text To Replace bookmark_1";
}
if (doc.Bookmarks.Exists("bookmark_2"))
{
object oBookMark = "bookmark_2";
doc.Bookmarks.get_Item(ref oBookMark).Range.Text =
"My Text To Replace bookmark_2";
}
doc.ExportAsFixedFormat("myNewPdf.pdf", WdExportFormat.wdExportFormatPDF);
((_Document)doc).Close();
((_Application)app).Quit();
This code should get you up and running unless you want to pass in all the values into a function.
If you need some more explanation I can help as well :) my example saves it as a .pdf, but you can do any format you prefer.

Removing images in header with OpenXML SDK

I have worked a bit with OpenXML SDK, and made a POC of replacing images in a header in a word document. However, when I try to call DeletePart or DeleteParts with the images I want to remove, it doesn't go as expected.
When I open the word doc afterwards, where there before was an image, there now is a frame with the text "This image cannot currently be displayed" and a red cross.
From a bit of googling it appears as if the references have not been completely removed, but I can't find any help on how to do that..
Below is an example of how I delete images. I only add some of them to the list, because I need to remove all but the ones with a specific uri..
//...
foreach(HeaderPart headerPart in document.MainDocumentPart.HeaderParts) {
List<ImagePart> list = new List<ImagePart>();
List<ImagePart> imgParts = new List<ImagePart> (headerPart.ImageParts);
foreach(ImagePart headerImagePart in imgParts) {
string newUri = headerImagePart.Uri.ToString();
if(newUri != uri) {
list.Add(headerImagePart);
}
}
headerPart.DeleteParts(list);
}
//...

Images are made up of 2 parts in OpenXml; you have the actual image itself and you also have details of the Picture container that the image is displayed within in the document.
This makes sense if you think about an image being displayed more than once in the same document; details of the image can be stored once and the position(s) of the image can be stored as many times as needed.
The following code will find any Drawing objects that contain the ImagePart objects that you wish to delete. This is done by matching the Embed property of the Blip against the Id of the ImagePart.
using (WordprocessingDocument document = WordprocessingDocument.Open(filename, true))
{
foreach (HeaderPart headerPart in document.MainDocumentPart.HeaderParts)
{
List<ImagePart> list = new List<ImagePart>();
List<ImagePart> imgParts = new List<ImagePart>(headerPart.ImageParts);
List<Drawing> drwdDeleteParts = new List<Drawing>();
List<Drawing> drwParts = new List<Drawing>(headerPart.RootElement.Descendants<Drawing>());
foreach (ImagePart headerImagePart in imgParts)
{
string newUri = headerImagePart.Uri.ToString();
if (newUri != uri)
{
list.Add(headerImagePart);
//you also need to find the Drawings the image was related to
IEnumerable<Drawing> drawings = drwParts.Where(d => d.Descendants<Pic.Picture>().Any(p => p.BlipFill.Blip.Embed == headerPart.GetIdOfPart(headerImagePart)));
foreach (var drawing in drawings)
{
if (drawing != null && !drwdDeleteParts.Contains(drawing))
drwdDeleteParts.Add(drawing);
}
}
}
foreach (var d in drwdDeleteParts)
{
d.Remove();
}
headerPart.DeleteParts(list);
}
}
As you pointed out in the comments, you'll need to add a using statement:
Pic = DocumentFormat.OpenXml.Drawing.Pictures;

Read bookmarks in outlook MSG file with C#

My goal is to somehow be able to read bookmarks in an outlook .msg file, then replace them with a different text. I want to do this with C#.
I know how to access the body and change the text, but was wondering if there was a way to access directly the list of all the bookmarks and its location so that i can easily replace them, instead going through the whole body text, splitting it up, etc etc...
edit: this is how a bookmark window looks like from this window one can assign bookmarks, but it should be possible to obtain this list via c#.
Any relevant info is appreciated.
Thanks in advance.

Since Outlook most often uses Word as it's body editor - you need to add a project reference to Microsoft.Office.Interop.Word.dll and then access to the Outlook Inspector's WordEditor during the Inspector.Activate event. Once you have access to the Word.Document - it's trivial to load up the Bookmarks and access/modify their values.
Outlook.Inspector inspector = Globals.ThisAddIn.Application.ActiveInspector();
((Outlook.InspectorEvents_10_Event)inspector).Activate += () =>
{ // validation to ensure we are using Word Editor
if (inspector.EditorType == Outlook.OlEditorType.olEditorWord && inspector.IsWordMail())
{
Word.Document wordDoc = inspector.WordEditor as Word.Document;
if (wordDoc != null)
{
var bookmarks = wordDoc.Bookmarks;
foreach (Word.Bookmark item in bookmarks)
{
string name = item.Name; // bookmark name
Word.Range bookmarkRange = item.Range; // bookmark range
string bookmarkText = bookmarkRange.Text; // bookmark text
item.Select(); // triggers bookmark selection
}
}
}
};

how can I put a content in a mergefield in docx

I'm developing a web application with asp.net and I have a file called Template.docx that works like a template to generate other reports. Inside this Template.docx I have some MergeFields (Title, CustomerName, Content, Footer, etc) to replace for some dynamic content in C#.
I would like to know, how can I put a content in a mergefield in docx ?
I don't know if MergeFields is the right way to do this or if there is another way. If you can suggest me, I appreciate!
PS: I have openxml referenced in my web application.
Edits:
private MemoryStream LoadFileIntoStream(string fileName)
{
MemoryStream memoryStream = new MemoryStream();
using (FileStream fileStream = File.OpenRead(fileName))
{
memoryStream.SetLength(fileStream.Length);
fileStream.Read(memoryStream.GetBuffer(), 0, (int) fileStream.Length);
memoryStream.Flush();
fileStream.Close();
}
return memoryStream;
}
public MemoryStream GenerateWord()
{
string templateDoc = "C:\\temp\\template.docx";
string reportFileName = "C:\\temp\\result.docx";
var reportStream = LoadFileIntoStream(templateDoc);
// Copy a new file name from template file
//File.Copy(templateDoc, reportFileName, true);
// Open the new Package
Package pkg = Package.Open(reportStream, FileMode.Open, FileAccess.ReadWrite);
// Specify the URI of the part to be read
Uri uri = new Uri("/word/document.xml", UriKind.Relative);
PackagePart part = pkg.GetPart(uri);
XmlDocument xmlMainXMLDoc = new XmlDocument();
xmlMainXMLDoc.Load(part.GetStream(FileMode.Open, FileAccess.Read));
// replace some keys inside xml (it will come from database, it's just a test)
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_customer", "My Customer Name");
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_title", "Report of Documents");
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_content", "Content of Document");
// Open the stream to write document
StreamWriter partWrt = new StreamWriter(part.GetStream(FileMode.Open, FileAccess.Write));
//doc.Save(partWrt);
xmlMainXMLDoc.Save(partWrt);
partWrt.Flush();
partWrt.Close();
reportStream.Flush();
pkg.Close();
return reportStream;
}
PS: When I convert MemoryStream to a file, I got a corrupted file. Thanks!

I know this is an old post, but I could not get the accepted answer to work for me. The project linked would not even compile (which someone has already commented in that link). Also, it seems to use other Nuget packages like WPFToolkit.
So I'm adding my answer here in case someone finds it useful. This only uses the OpenXML SDK 2.5 and also the WindowsBase v4. This works on MS Word 2010 and later.
string sourceFile = #"C:\Template.docx";
string targetFile = #"C:\Result.docx";
File.Copy(sourceFile, targetFile, true);
using (WordprocessingDocument document = WordprocessingDocument.Open(targetFile, true))
{
// If your sourceFile is a different type (e.g., .DOTX), you will need to change the target type like so:
document.ChangeDocumentType(WordprocessingDocumentType.Document);
// Get the MainPart of the document
MainDocumentPart mainPart = document.MainDocumentPart;
var mergeFields = mainPart.RootElement.Descendants<FieldCode>();
var mergeFieldName = "SenderFullName";
var replacementText = "John Smith";
ReplaceMergeFieldWithText(mergeFields, mergeFieldName, replacementText);
// Save the document
mainPart.Document.Save();
}
private void ReplaceMergeFieldWithText(IEnumerable<FieldCode> fields, string mergeFieldName, string replacementText)
{
var field = fields
.Where(f => f.InnerText.Contains(mergeFieldName))
.FirstOrDefault();
if (field != null)
{
// Get the Run that contains our FieldCode
// Then get the parent container of this Run
Run rFldCode = (Run)field.Parent;
// Get the three (3) other Runs that make up our merge field
Run rBegin = rFldCode.PreviousSibling<Run>();
Run rSep = rFldCode.NextSibling<Run>();
Run rText = rSep.NextSibling<Run>();
Run rEnd = rText.NextSibling<Run>();
// Get the Run that holds the Text element for our merge field
// Get the Text element and replace the text content
Text t = rText.GetFirstChild<Text>();
t.Text = replacementText;
// Remove all the four (4) Runs for our merge field
rFldCode.Remove();
rBegin.Remove();
rSep.Remove();
rEnd.Remove();
}
}
What the code above does is basically this:
Identify the 4 Runs that make up the merge field named "SenderFullName".
Identify the Run that contains the Text element for our merge field.
Remove the 4 Runs.
Update the text property of the Text element for our merge field.
UPDATE
For anyone interested, here is a simple static class I used to help me with replacing merge fields.

Frank Fajardo's answer was 99% of the way there for me, but it is important to note that MERGEFIELDS can be SimpleFields or FieldCodes.
In the case of SimpleFields, the text runs displayed to the user in the document are children of the SimpleField.
In the case of FieldCodes, the text runs shown to the user are between the runs containing FieldChars with the Separate and the End FieldCharValues. Occasionally, several text containing runs exist between the Separate and End Elements.
The code below deals with these problems. Further details of how to get all the MERGEFIELDS from the document, including the header and footer is available in a GitHub repository at https://github.com/mcshaz/SimPlanner/blob/master/SP.DTOs/Utilities/OpenXmlExtensions.cs
private static Run CreateSimpleTextRun(string text)
{
Run returnVar = new Run();
RunProperties runProp = new RunProperties();
runProp.Append(new NoProof());
returnVar.Append(runProp);
returnVar.Append(new Text() { Text = text });
return returnVar;
}
private static void InsertMergeFieldText(OpenXmlElement field, string replacementText)
{
var sf = field as SimpleField;
if (sf != null)
{
var textChildren = sf.Descendants<Text>();
textChildren.First().Text = replacementText;
foreach (var others in textChildren.Skip(1))
{
others.Remove();
}
}
else
{
var runs = GetAssociatedRuns((FieldCode)field);
var rEnd = runs[runs.Count - 1];
foreach (var r in runs
.SkipWhile(r => !r.ContainsCharType(FieldCharValues.Separate))
.Skip(1)
.TakeWhile(r=>r!= rEnd))
{
r.Remove();
}
rEnd.InsertBeforeSelf(CreateSimpleTextRun(replacementText));
}
}
private static IList<Run> GetAssociatedRuns(FieldCode fieldCode)
{
Run rFieldCode = (Run)fieldCode.Parent;
Run rBegin = rFieldCode.PreviousSibling<Run>();
Run rCurrent = rFieldCode.NextSibling<Run>();
var runs = new List<Run>(new[] { rBegin, rCurrent });
while (!rCurrent.ContainsCharType(FieldCharValues.End))
{
rCurrent = rCurrent.NextSibling<Run>();
runs.Add(rCurrent);
};
return runs;
}
private static bool ContainsCharType(this Run run, FieldCharValues fieldCharType)
{
var fc = run.GetFirstChild<FieldChar>();
return fc == null
? false
: fc.FieldCharType.Value == fieldCharType;
}

You could try http://www.codeproject.com/KB/office/Fill_Mergefields.aspx which uses the Open XML SDK to do this.

OpenXml Sdk - Copy Sections of docx into another docx

I am trying the following code. It takes a fileName (docx file with many sections) and I try to iterate through each section getting the section name. The problem is that I end up with unreadable docx files. It does not error, but I think I am doing something wrong with getting the elements in the section.
public void Split(string fileName) {
using (WordprocessingDocument myDoc =
WordprocessingDocument.Open(fileName, true)) {
string curCliCode = "";
MainDocumentPart mdp = myDoc.MainDocumentPart;
foreach (var element in mdp.Document.Body.ChildElements) {
if (element.Descendants().OfType<SectionProperties>().Count() == 1) {
//get the name of the section from the footer
var footer = (FooterPart) mdp.GetPartById(
element.Descendants().OfType<SectionProperties>().First().OfType
<FooterReference>().First().
Id.Value);
foreach (Paragraph p in footer.Footer.ChildElements.OfType<Paragraph>()) {
if (p.InnerText != "") {
curCliCode = p.InnerText;
}
}
if (curCliCode != "") {
var forFile = new List<OpenXmlElement>();
var els = element.ElementsBefore();
if (els != null) {
foreach (var e in els) {
if (e != null) {
forFile.Add(e);
}
}
for (int i = 0; i < els.Count(); i++) {
els.ElementAt(i).Remove();
}
}
Create(curCliCode, forFile);
}
}
}
}
}
private void Create(string cliCode,IEnumerable<OpenXmlElement> docParts) {
var parts = from e in docParts select e.Clone();
const string template = #"\Test\toSplit\blank.docx";
string destination = string.Format(#"\Test\{0}.docx", cliCode);
File.Copy(template, destination,true);
/* Create the package and main document part */
using (WordprocessingDocument myDoc =
WordprocessingDocument.Open(destination, true)) {
MainDocumentPart mainPart = myDoc.MainDocumentPart;
/* Create the contents */
foreach(var part in parts) {
mainPart.Document.Body.Append((OpenXmlElement)part);
}
/* Save the results and close */
mainPart.Document.Save();
myDoc.Close();
}
}
Does anyone know what the problem could be (or how to properly copy a section from one document to another)?

I've done some work in this area, and what I have found invaluable is diffing a known good file with a prospective file; the error is usually fairly obvious.
What I would do is take a file that you know works, and copy all of the sections into the template. Theoretically, the two files should be identical. Run a diff on them the document.xml inside the docx file, and you'll see the difference.
BTW, I'm assuming that you know that the docx is actually a zip; change the extension to "zip", and you'll be able to get at the actual xml files which compose the format.
As far as diff tools, I use Beyond Compare from Scooter Software.

An approach along the lines of what you are doing will work only for simple documents (ie those not containing images, hyperlinks, comments etc). To handle these more complex documents, take a look at http://blogs.msdn.com/b/ericwhite/archive/2009/02/05/move-insert-delete-paragraphs-in-word-processing-documents-using-the-open-xml-sdk.aspx and the resulting DocumentBuilder API (part of the PowerTools for Open XML project on CodePlex).
In order to split a docx into sections using DocumentBuilder, you'll still need to first find the index of the paragraphs containing sectPr elements.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Programatically filling content controls in Word document (OpenXML) in .NET - c#

You can use this code. foreach (SdtElement sdt in mainPart.Document.Descendants<SdtElement>()) { SdtAlias alias = sdt.Descendants<SdtAlias>().FirstOrDefault(); if (alias != null) { string sdtTitle = alias.Val.Value; Text t = sdt.Descendants<Text>().First(); t.Text = "test"; } }

Related

C# word document

Removing images in header with OpenXML SDK

Read bookmarks in outlook MSG file with C#

how can I put a content in a mergefield in docx

OpenXml Sdk - Copy Sections of docx into another docx

Categories

Resources