I've created a large (~1000 pages) Word document using the openxml-sdk. When I open it for the first time in Word, the status bar shows "Word is renumbering the pages of test.docx" for about 15 seconds. I've made a German screenshot of this behaviour. After this step the document has changed and needs to be saved. The new version of the file is about twice the size of the original.
The document is saved the first time by simply calling
document.SaveAs("someFilePath");
What exactly is this behaviour? How can I renumber the pages programmatically during (or after) the creation of the document?
You can't renumber pages in a WordprocessingDocument, because those page numbers are not stored in the WordprocessingDocument but rather created while laying out, or rendering, the document.
A typical document defines page numbers as a complex field that you would find in the Open XML markup contained in the FooterPart (or possibly HeaderPart). Assuming the page number field is stored in the FooterPart, you might have a single FooterPart in your WordprocessingDocument in the simplest case (e.g., a document with a single section). Even if you have multiple FooterParts, e.g., because you have multiple sections or a different layout for the first, odd, and even pages, you have relatively few FooterParts in your document (at least compared to your 1,000 pages).
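To make the point about fields concrete, here is a minimal sketch in Python (standard library only) that checks whether a footer's XML contains a PAGE field. The function name `has_page_field` and the sample footer are made up for illustration, and the sketch assumes the field instruction sits in a single `w:instrText` element, which real documents do not guarantee.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def has_page_field(footer_xml):
    """Return True if the footer XML contains a PAGE field (hypothetical helper)."""
    root = ET.fromstring(footer_xml)
    # Complex field: the instruction is stored in w:instrText between fldChar markers.
    for instr in root.iter("{%s}instrText" % W):
        if (instr.text or "").split()[:1] == ["PAGE"]:
            return True
    # Simple field: the instruction is stored in the w:instr attribute of w:fldSimple.
    for simple in root.iter("{%s}fldSimple" % W):
        if (simple.get("{%s}instr" % W) or "").split()[:1] == ["PAGE"]:
            return True
    return False


# Illustrative footer: one paragraph with a complex PAGE field.
# The "1" in w:t is only a cached result; the real number is computed at layout time.
SAMPLE_FOOTER = (
    '<w:ftr xmlns:w="%s"><w:p>'
    '<w:r><w:fldChar w:fldCharType="begin"/></w:r>'
    '<w:r><w:instrText xml:space="preserve"> PAGE </w:instrText></w:r>'
    '<w:r><w:fldChar w:fldCharType="separate"/></w:r>'
    '<w:r><w:t>1</w:t></w:r>'
    '<w:r><w:fldChar w:fldCharType="end"/></w:r>'
    '</w:p></w:ftr>' % W
)

print(has_page_field(SAMPLE_FOOTER))  # True
```

Note that the footer holds one field definition, not one number per page, which is why there is nothing in the file to "renumber".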
When Word renders the document for printing or viewing, it also renders the page numbers based on your FooterParts (still using my example). For 1,000 pages, that takes time since Word is simply not built for documents that large.
Should you want to do Word's job and perform the layout yourself, you need to understand that building a layout engine is extremely complex and requires a lot of effort.
I am using Active Reports in C#, and one option we present the users with is to print to a tray where the paper is White/Pink alternating.
Is there a printing method by which I can programmatically cause each page to print twice, yet still collate correctly?
Edit:
My intended result is the following pattern:
Page one (white)
Page one (pink)
Page two (white)
Page two (pink)
Page one (white)
Page one (pink)
Page two (white)
Page two (pink)
Thus, each page is duplicated every time it prints.
Currently, I must disable collating and then print double the number of copies the user is asking for. However, the user must then manually assemble the documents.
Thanks for any help!
I see. So your report by definition has to be duplicated on white and pink paper, and this duplication needs to be repeated for the number of copies the user requests.
If you are using Page Reports, you can design two page templates, one for each "page color", and use a master page to share the design elements of the page. You can also make the pink pages print-only so that they are not visible in the viewer.
If you are using Section Reports, you would have to manage the duplication manually in your code. The Document class has a pages collection that you can manipulate, copying a page and inserting it at another location. Before printing, you would copy each page and insert the copy directly after its original, so a two-page report would now have four pages: p1W, p1P, p2W, p2P. If the user then prints multiple copies with collation on, everything should come out OK.
http://arhelp.grapecity.com/webhelp/AR10/index.html#GrapeCity.ActiveReports.Document.v10~GrapeCity.ActiveReports.Document.Section.PagesCollection~Add.html
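The ordering logic for the Section Reports case can be sketched like this. It is plain Python with a list standing in for the pages collection, so it shows only the page order, not the ActiveReports API; the function name `interleave_copies` is made up for illustration.

```python
def interleave_copies(pages, copies_per_page=2):
    """Return a new page list where each page is followed directly by its copies."""
    result = []
    for page in pages:
        # Original plus (copies_per_page - 1) duplicates, kept adjacent
        result.extend([page] * copies_per_page)
    return result


report = ["p1", "p2"]
print(interleave_copies(report))  # ['p1', 'p1', 'p2', 'p2']
```

With the tray alternating white and pink sheets, that order lands as p1 on white, p1 on pink, p2 on white, p2 on pink, and a collated multi-copy print simply repeats the whole sequence.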
hope this helps.
http://activereports.grapecity.com
Is there any way I can compare a Word document (.docx) with a document template (.dotx) created in Microsoft Word?
I want to do this comparison programmatically using C#.
I want to compare both documents word by word so that I can determine which template the document belongs to. I don't just want to compare their sizes; I want to compare their contents as well.
From this comparison I want to get the following results:
Which document template the document was generated from.
Where in the document template a particular piece of information is stored.
For example, if I want to find a person's contact information, I want to traverse the template and check where it has the area/section for the address (i.e. top-left corner, top center, in a paragraph, in the body, etc.).
In the same way I want to extract other information too, such as links to other documents.
After getting those positions I want to get that information from the .docx file.
Say I found that the address is in the top-left and there are five links referring to other documents in five different paragraphs. Then I want to get the address and save it to a variable. After that I want to replace the link placeholders with actual hyperlinks, i.e. if a link refers to Doc-A, then instead of showing just plain text I want to replace it with a hyperlink to Doc-A.
Any suggestions?
Thank You.
Your question is rather too vague and involved to give a really good answer, however...
To find out which template a document was generated from, the object model provides the property Document.AttachedTemplate, which will return the full file name. This is certainly better than comparing word by word (which is also very time-consuming).
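If the Word object model is not available, the same information can, as far as I know, be read straight from the .docx package: the attached template is recorded as a relationship of type `attachedTemplate` in the part `word/_rels/settings.xml.rels`. The following is a minimal sketch in Python (standard library only), not the object-model approach described above; the function name and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of OPC relationship parts
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type used for the attached template
ATTACHED_TEMPLATE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate"
)


def attached_template_target(rels_xml):
    """Return the attached template's target from a settings .rels part, or None."""
    root = ET.fromstring(rels_xml)
    for rel in root.iter("{%s}Relationship" % REL_NS):
        if rel.get("Type") == ATTACHED_TEMPLATE:
            return rel.get("Target")
    return None


# Illustrative content of word/_rels/settings.xml.rels (path is invented)
SAMPLE_RELS = (
    '<Relationships xmlns="' + REL_NS + '">'
    '<Relationship Id="rId1" Type="' + ATTACHED_TEMPLATE + '" '
    'Target="file:///C:/Templates/Invoice.dotx" TargetMode="External"/>'
    '</Relationships>'
)

print(attached_template_target(SAMPLE_RELS))  # file:///C:/Templates/Invoice.dotx
```

For a real file, the XML could be obtained with `zipfile.ZipFile(path).read("word/_rels/settings.xml.rels")`; that part is absent when the document has no attached template other than the default, so the read may raise `KeyError`.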
The Word object model also provides the method CompareDocuments (belongs to the Word.Application class). This will "highlight" differences in the text content of two documents.
Links can be found in the Document.Hyperlinks collection.
Getting the position of things is a bit chancy with Word and it depends on what you really mean by "top-left", etc. Better would be to construct the templates using content controls, form fields and/or bookmarks so that you can uniquely identify important sections. However, Word does provide the Range.get_Information method that can return relative and absolute positions on the page if that's what you really want.
Hello, I have the following report that is generated in CR.
I have the same file on a couple of sites, but each returns a different PDF despite being sent the same data:
Example 1:
Example 2 :
As you can see, in the second example, once one of the tickets finishes, another one starts right after it at the very bottom of the page, which leaves it incomplete.
In the first example, the next ticket is automatically sent to the next page once it is found that it won't fit on the current page.
I can't remember what I did to fix that, nor do I understand why I get different results on two different sites using the same rpt file with the same data.
Check the "New Page After" option in the Section Expert for the group so that each new group starts on a new page.
You say "I get different results in two different sites". Do you mean two different browsers?
First, the output of the same file should always come out the same. Second, set your page margins and header/footer heights so that they leave well more room than the ticket size; that way you always get the same result.
It may be a height issue at execution time that gives the different result. So, to be on the safe side, allow as much height as possible and keep the page margins low, via Report Options -> Page Size.
I'm generating an MS Word document from user data. The data is placed in a container which is serialized to XML, and the resulting XML is converted to OpenXML using XSLT. There are a few minor changes done programmatically in C# to generate the Word document, as they can't be done with XSLT.
There is a user requirement that an item be placed completely on one page without any associated data being split onto another page. Sometimes one item will fill up an entire page, and sometimes I can fit three or four items on one page (I need to insert a separator (horizontal rule) between items that fit on the same page.)
Is there a way to determine whether or not one item or OpenXML paragraph will fit entirely on the "current" page? This can be either via C# or XSLT, and I can work something out.
Unfortunately, the only way this can be done reliably is to actually render the output, including all of the font sizes, bolding, kerning and so on. This means you have to do the pagination in Word and then save the result back to OpenXML.
I have successfully generated a Word document using Open XML, but I get too many blank pages.
How can I remove them?
This depends on how those blank pages are represented in the Open XML; you may want to post a sample document to demonstrate exactly how your blank pages are represented.
But let's take the case of a Word document in which a user has inserted extra page breaks (by hitting Ctrl+Enter in Word), resulting in blank pages. These page breaks will be represented in the XML as:
<w:br w:type="page"/>
The page will still have plenty of tags in it for spacing, fonts, etc.; and the page may display header and footers, too. But let's define a blank page as one which has no new paragraph text. In Open XML, new text is displayed with a w:t tag.
So, in order to remove blank pages created by extra page breaks with no text in between, we can run the following regular expression on the XML document, replacing with blank (""):
<w:br w:type="page"/>((?!<w:t[ >]).)*(?=<w:br w:type="page"/>)
This regex will search for a series of two or more page breaks with no new text in between, removing all but the last one.
(Note that this won't take care of blank pages at the end of the document, which is a bit trickier. Additionally, if you'd like to account for pages with images, textboxes, etc., the regex will have to be expanded to include the relevant items).
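As a rough illustration, the replacement could be applied like this in Python (standard library only). The pattern in the sketch checks for a text tag before each character it consumes and also treats `<w:t xml:space="preserve">` as text; those details, the DOTALL flag, the helper names, and the sample XML are my assumptions. It is a sketch for simple documents, not a general-purpose cleaner, and it operates on the raw XML string of the main document part.

```python
import re

PAGE_BREAK = '<w:br w:type="page"/>'

# A page break, then any run of characters that never starts a <w:t> text tag,
# up to (but not including) the next page break. DOTALL lets "." cross newlines.
BLANK_PAGE = re.compile(
    re.escape(PAGE_BREAK) + r'(?:(?!<w:t[ >]).)*(?=' + re.escape(PAGE_BREAK) + r')',
    re.DOTALL,
)


def remove_blank_pages(xml):
    """Collapse consecutive page breaks with no text between them into one."""
    return BLANK_PAGE.sub("", xml)


def para(content):
    """Wrap content in a minimal paragraph/run (illustrative helper)."""
    return "<w:p><w:r>" + content + "</w:r></w:p>"


# Text A, three page breaks in a row (two blank pages), then text B
sample = (
    para("<w:t>A</w:t>")
    + para(PAGE_BREAK) + para(PAGE_BREAK) + para(PAGE_BREAK)
    + para("<w:t>B</w:t>")
)

cleaned = remove_blank_pages(sample)
print(cleaned.count(PAGE_BREAK))  # 1
```

In this sample each removed span starts and ends at matching paragraph boundaries, so the remaining XML stays well-formed; that is worth checking on real documents before saving the result back.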