Get particular page from Word document using Open XML SDK - c#

I want to convert each page of a document into a separate Word document, so I need to get every page of the document. I am not able to tell pages apart in the Open XML format.
Please point me in the right direction.
using (WordprocessingDocument document = WordprocessingDocument.Open("test.docx", true))
{
    MainDocumentPart mainPart = document.MainDocumentPart;
}

Based on the documentation here, the client application uses LastRenderedPageBreak to mark where each page ended when the document was last saved. The XML for it is:
<w:lastRenderedPageBreak/>
I think you can use this to detect and split pages, unless the document you are working with was generated automatically and does not contain any lastRenderedPageBreak elements.
Also, this approach only works reliably for documents with a single-column layout; documents with multi-column layouts appear to have issues.
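A minimal sketch of that idea, assuming you work on the raw document.xml markup with System.Xml.Linq rather than the SDK's typed classes. The names PageSplitter and GroupBlocksByPage are made up for illustration. A paragraph that straddles a page boundary is assigned entirely to the new page, so treat the result as an approximation of what Word last rendered.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PageSplitter
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Groups the top-level children of <w:body> into pages. A new page starts
    // at every block that contains a <w:lastRenderedPageBreak/> marker.
    public static List<List<XElement>> GroupBlocksByPage(XElement body)
    {
        var pages = new List<List<XElement>> { new List<XElement>() };
        foreach (XElement block in body.Elements())
        {
            // <w:sectPr> holds section settings, not page content.
            if (block.Name == W + "sectPr") continue;

            bool startsNewPage = block.Descendants(W + "lastRenderedPageBreak").Any();
            if (startsNewPage && pages[pages.Count - 1].Count > 0)
            {
                pages.Add(new List<XElement>());
            }
            pages[pages.Count - 1].Add(block);
        }
        return pages;
    }
}
```

The w:body element could be loaded with XDocument.Load(mainPart.GetStream()) and then read from the root element; each returned group would then be copied into its own new document. Documents without any lastRenderedPageBreak markers come back as a single page.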

Related

How to apply a multi-document template

The general process
Documents are uploaded to DocuSign creating a new Envelope.
Templates are applied to this Envelope.
Recipients are updated to make sure there is no mix-up with signers from the template.
The Envelope is sent to the signers.
Intended use of templates
The primary use of templates is to let users upload documents and take all other information, such as fields and other settings, from the template for the uploaded documents. Signers can also be set freely and overwrite the ones defined in the template.
For applying templates we use https://developers.docusign.com/docs/esign-rest-api/reference/envelopes/envelopetemplates/applytodocument/
The Template
The template consists of 2 documents. The first document has 3 pages and the second document has 2 pages.
There is one signer. A signature box is added to the last page of both documents.
The Problem
Test scenario 1: Use exactly the same documents as in the template. Result: the signature box on the second document is not set.
Test scenario 2: Use the 3-page file for both documents. Result: the signature box is placed on page 3 of both documents. So it looks like the assignment only works from document 1 to all the other documents.
Test scenario 3: Use different 3-page documents that match the template for both documents. The result is the same as in test scenario 2.
What I tried
The described scenarios are based on this (simplified) code for applying the templates:
// Variables envAPI (class EnvelopesApi), accountId, templateId and envelope (class Envelope) are already set
DocumentTemplateList templateList = new DocumentTemplateList();
templateList.DocumentTemplates = new List<DocumentTemplate>();
templateList.DocumentTemplates.Add(new DocumentTemplate()
{
    TemplateId = templateId,
    DocumentId = "1"
});
envAPI.ApplyTemplateToDocument(accountId, envelope.EnvelopeId, "1", templateList);
templateList = new DocumentTemplateList();
templateList.DocumentTemplates = new List<DocumentTemplate>();
templateList.DocumentTemplates.Add(new DocumentTemplate()
{
    TemplateId = templateId,
    DocumentId = "2"
});
envAPI.ApplyTemplateToDocument(accountId, envelope.EnvelopeId, "2", templateList);
// Some recipient checking is done here
envAPI.Update(accountId, envelope.EnvelopeId, envelope);
I also tried using more entries in templateList.DocumentTemplates, but that only caused INVALID_REQUEST_BODY errors.
I noticed that the DocumentID values of the documents in the template differ greatly after the first file. My test template has 1 for the first document, while the ID of the second document is a very large number. Using this large number also causes INVALID_REQUEST_BODY errors.
Is this approach actually correct, with the error located somewhere else? It looks like it should work this way.
The workflow you are attempting is a perfect place to use composite templates. The recommendation is to use one composite template per document. In a single POST envelopes call you can swap out the documents on the templates with documents supplied at runtime and apply server (saved) templates to the documents. You can add recipients, drop recipients by not including them, and add tabs or write values to tabs at runtime. See:
https://www.docusign.com/blog/dsdev-from-the-trenches-composite-templates
https://www.docusign.com/blog/dsdev-why-use-composite-templates
To answer my own question: This doesn't work. At least not this way.
Yes, composite templates are one way of dealing with this requirement but then again, why do multi-document templates even exist?
So the answer is quite simple: Do it the other way around.
To use a multi-document template properly, the template needs to be the starting point. Use the template and replace its files to create an envelope, instead of creating an envelope first and then applying templates to it.

Identifying table format in word document using C#

I am reading a Word document line by line in C# using Microsoft.Office.Interop.Word.
The document has both paragraphs and tables. I want to detect when a table occurs and get its entire contents; otherwise I carry on with the line-by-line processing using doc.Paragraphs().
Any help identifying the tables in a Word document is greatly appreciated.
This seems to be a duplicate question. Please take a look here: "How to read MS Word paragraph and table content line by line". If you are not tied to Microsoft.Office.Interop.Word, give DocX a try and see this question: "Novacode Determine If Word Style Is A Table".

Delete content inside Word Bookmark in c#

I'm working on an ASP.NET C# project that has to hide sections of a bookmarked Word document containing plenty of tables. Our first approach consisted of deleting the parent element of each tag. That works, but it requires tagging lines that have no data in them, and it leaves the previous space open in the final document (for example, if a page has two sections and you delete the upper one, you get half a page of whitespace followed by the lower section).
So what we're trying now is adding a bookmark for each document section and deleting the section altogether. So far, what we've tried in order to accomplish this (taken from "How to delete all contents inside a word document bookmark using C# and Open XML") is:
BookmarkStart myBookMarkStart = OpenWordDocument.MainDocumentPart.Document.Body.Descendants<BookmarkStart>().FirstOrDefault(x => x.Name == myBookMarkName);
OpenXmlElement sibling = myBookMarkStart.NextSibling();
while (!(sibling is BookmarkEnd))
{
    var temp = sibling;
    sibling = sibling.NextSibling();
    temp.Remove();
}
However, this doesn't remove anything between the bookmark's start and its end.

Inserting a Template into a Template - C# Open XML SDK 2.0/2.5

I'm working on some code to manipulate bookmarks in a preexisting .DOTX template file. For this issue, some of the bookmarks are intended to point to another .DOTX file and insert it into the current document.
I'm having trouble finding a way to do this without some heavy manipulation and digging through each element in the 2nd template and creating a similar element in the current document.
Anyone have any ideas of a way to do this easily?
Turned out to be easier than I thought.
foreach (BookmarkStart bookmark in mainDoc.RootElement.Descendants<BookmarkStart>().Where(b => String.Equals(b.Name, bookmarkName)))
{
    var parent = bookmark.Parent;
    using (WordprocessingDocument newTemplate = WordprocessingDocument.Open(template2, false))
    {
        var newTemplateBody = newTemplate.MainDocumentPart.Document.Body;
        foreach (var element in newTemplateBody.Elements().Reverse<OpenXmlElement>())
        {
            parent.InsertAfterSelf<OpenXmlElement>((OpenXmlElement)element.Clone());
        }
    }
}
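Apparently I had been doing everything right, except that I was inserting the template inside a paragraph. The template is a table, which cannot be nested in a paragraph, and that was what actually broke my document. Inserting the cloned elements after the bookmark's parent, as the code above does, avoids the problem.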
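The nesting rule can also be illustrated on the raw markup. The following is only a sketch, assuming System.Xml.Linq access to both documents' w:body elements; BlockInserter and InsertAfterBookmarkBlock are hypothetical names. It walks up from the bookmark to the direct child of the body, so the copied blocks always land at block level, and it skips w:sectPr because section settings are not content.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class BlockInserter
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Copies the block-level children of sourceBody into targetBody, right after
    // the top-level block (paragraph or table) that contains the named bookmark.
    // Returns false when the bookmark is not found.
    public static bool InsertAfterBookmarkBlock(
        XElement targetBody, string bookmarkName, XElement sourceBody)
    {
        XElement bookmark = targetBody
            .Descendants(W + "bookmarkStart")
            .FirstOrDefault(b => (string)b.Attribute(W + "name") == bookmarkName);
        if (bookmark == null) return false;

        // Walk up to the direct child of <w:body>, so a table is never
        // placed inside a paragraph.
        XElement anchor = bookmark.AncestorsAndSelf().First(e => e.Parent == targetBody);

        // Deep-copy every block except <w:sectPr>, keeping the source order.
        List<XElement> copies = sourceBody.Elements()
            .Where(e => e.Name != W + "sectPr")
            .Select(e => new XElement(e))
            .ToList();
        anchor.AddAfterSelf(copies);
        return true;
    }
}
```

Like the SDK version, this copies markup only; styles, numbering definitions, or images referenced by the copied blocks would still have to be brought over separately.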

Saving custom settings or attributes in a Word document

I've got an MS Word project where I'm building a number of panes in which users fill in some information that automatically populates text at bookmarks throughout the document. I'm trying to find the best way of saving these values so that I can retrieve them easily when the document is re-opened after users have typed in their values.
I could try to retrieve them from the bookmarks themselves, but in many cases the bookmarks contain text values, whereas I'd ideally store a primary key somewhere that is not visible to the user. Users might also change the text, which would make reverse engineering the values impossible.
I can't seem to find any information on saving custom attributes in a Word document, so would really appreciate some general guidance of how this might be achieved.
Thanks a lot!
I would suggest using custom document properties. There you can store strings in a key-value manner (at least if it is similar to Excel).
I found a thread which explains how to do it:
Set custom document properties with Word interop
After playing around with this a fair bit, this is my final code in case it helps someone else. I've found this format easier to understand and work with. It's all based on the article referenced by Christian:
using System.Linq;
using Office = Microsoft.Office.Core;
using Word = Microsoft.Office.Interop.Word;
Office.DocumentProperties properties = (Office.DocumentProperties)Globals.ThisDocument.CustomDocumentProperties;
// Check whether the property already exists
if (!properties.Cast<Office.DocumentProperty>().Any(c => c.Name == "nameofproperty"))
{
    // Add the property and its value
    properties.Add("nameofproperty", false, Office.MsoDocProperties.msoPropertyTypeString, "yourvalue");
}
else
{
    // Otherwise just update the value
    properties["nameofproperty"].Value = "yourvalue";
}
Retrieving the value is just as easy: use the same lines at the top to get the properties object, optionally reuse the check from the if statement to see whether the property exists, and then read it with properties["nameofproperty"].Value.
