I'm generating an MS Word document from user data. The data is placed in a container which is serialized to XML, and the resulting XML is converted to OpenXML using XSLT. There are a few minor changes done programmatically in C# to generate the Word document, as they can't be done with XSLT.
There is a user requirement that an item be placed completely on one page, without any of its associated data being split onto another page. Sometimes one item will fill an entire page, and sometimes three or four items fit on one page. (I need to insert a separator, a horizontal rule, between items that fit on the same page.)
Is there a way to determine whether or not one item or OpenXML paragraph will fit entirely on the "current" page? This can be either via C# or XSLT, and I can work something out.
Unfortunately, the only reliable way to do this is to actually render the output, taking into account font sizes, bolding, kerning and so on. That means you have to do the pagination in Word and then save the result back to OpenXML.
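If the goal is only to keep each item together, one option that still leaves the pagination to Word is to emit the WordprocessingML paragraph properties `w:keepNext` (stay on the same page as the next paragraph) and `w:keepLines` (do not split this paragraph's lines across pages) from the XSLT. The following is a minimal sketch of that markup, written in Python with the standard library only so it can be run as-is; the helper name `keep_together` and the sample item are made up for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def keep_together(paragraph_texts):
    """Build the w:p elements of one item so Word tries to keep them on one page."""
    paragraphs = []
    last = len(paragraph_texts) - 1
    for index, text in enumerate(paragraph_texts):
        p = ET.Element(f"{{{W}}}p")
        ppr = ET.SubElement(p, f"{{{W}}}pPr")
        if index < last:
            # Keep this paragraph on the same page as the next one.
            ET.SubElement(ppr, f"{{{W}}}keepNext")
        # Do not split the lines of this paragraph across pages.
        ET.SubElement(ppr, f"{{{W}}}keepLines")
        run = ET.SubElement(p, f"{{{W}}}r")
        ET.SubElement(run, f"{{{W}}}t").text = text
        paragraphs.append(p)
    return paragraphs


# Hypothetical item made of a heading and one detail paragraph.
for p in keep_together(["Item 1", "Details for item 1"]):
    print(ET.tostring(p, encoding="unicode"))
```

Word can only honour these properties as far as an item actually fits on a page, and the generator still does not learn which items end up sharing a page, so this does not by itself decide where separators belong.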
I've created a large (~1000 pages) Word document using the openxml-sdk. When I open it for the first time in Word, the status bar shows "Word is renumbering the pages of test.docx" for about 15 seconds. I've made a German screenshot of this behaviour. After this step the document is changed and needs to be saved. The new version of the file is about twice the size of the original one.
The document is saved the first time by simply calling
document.SaveAs("someFilePath");
What exactly is this behaviour? How can I renumber the pages programmatically during (or after) the creation of the document?
You can't renumber pages in a WordprocessingDocument, because those page numbers are not stored in the WordprocessingDocument but rather created while laying out, or rendering, the document.
A typical document defines page numbers as a complex field that you would find in the Open XML markup contained in the FooterPart (or possibly HeaderPart). Assuming the page number field is stored in the FooterPart, in the simplest case (e.g., a document with a single section) you might have a single FooterPart in your WordprocessingDocument. Even if you have multiple FooterParts, e.g., because you have multiple sections or a different layout for the first, odd, and even pages, you have relatively few FooterParts in your document (at least compared to your 1,000 pages).
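To illustrate what such a complex field looks like, here is a minimal sketch in Python (standard library only) that lists the field instructions found in a footer part. The sample footer XML is hand-written for illustration, and the helper name `field_instructions` is made up; a real FooterPart will contain more markup.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written sample: a PAGE complex field with a cached result of "1".
SAMPLE_FOOTER = f"""<w:ftr xmlns:w="{W}">
  <w:p>
    <w:r><w:fldChar w:fldCharType="begin"/></w:r>
    <w:r><w:instrText xml:space="preserve"> PAGE </w:instrText></w:r>
    <w:r><w:fldChar w:fldCharType="separate"/></w:r>
    <w:r><w:t>1</w:t></w:r>
    <w:r><w:fldChar w:fldCharType="end"/></w:r>
  </w:p>
</w:ftr>"""


def field_instructions(part_xml):
    """Return the instruction text of every complex field in the part."""
    root = ET.fromstring(part_xml)
    return [(el.text or "").strip() for el in root.iter(f"{{{W}}}instrText")]


print(field_instructions(SAMPLE_FOOTER))  # ['PAGE']
```

The footer holds one field instruction plus a cached result, not a number per page, which is why the actual page numbers only exist once the document has been laid out.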
When Word renders the document for printing or viewing, it also renders the page numbers based on your FooterParts (still using my example). For 1,000 pages, that takes time since Word is simply not built for documents that large.
Should you want to do Word's job and perform the layout yourself, you need to understand that building a layout engine is extremely complex and requires a lot of effort.
Is there any way I can compare a Word document (.docx) with a document template (.dotx) generated in Microsoft Word?
I want to do this comparison programmatically using C#.
I want to compare both documents word by word so that I can determine which template the document belongs to. I don't just want to compare the size of both; I want to compare the contents as well.
From this comparison I want to get the following results:
From which document template the document is generated.
Where in the document template a particular piece of information is stored.
Say, for example, I want to search for the contact information of a person. Then I want to traverse the document and check at which position the template has the area/section for the address (i.e. top-left corner, top centre, in a paragraph, in the body, etc.).
In the same way I want to extract other information too, such as links to other documents.
After getting those positions I want to get that information from the .docx file.
Say I found that the address is in the top-left and there are five links referring to other documents in five different paragraphs. Then I want to get the address and save it to a variable. After that I want to replace those link placeholders with actual hyperlinks, i.e. if a link refers to Doc-A, then instead of just showing plain text I want to replace it with a hyperlink to Doc-A.
Any suggestions?
Thank You.
Your question is rather too vague and involved to give a really good answer, however...
To find out from which template a document was generated, the object model provides the property Document.AttachedTemplate, which identifies the attached template (its FullName gives the full file name). This is certainly better than comparing word by word (which is also very time-consuming).
The Word object model also provides the method CompareDocuments (belongs to the Word.Application class). This will "highlight" differences in the text content of two documents.
Links will be found in the Document.Hyperlinks collection
Getting the position of things is a bit chancy with Word and it depends on what you really mean by "top-left", etc. Better would be to construct the templates using content controls, form fields and/or bookmarks so that you can uniquely identify important sections. However, Word does provide the Range.get_Information method that can return relative and absolute positions on the page if that's what you really want.
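The attached template mentioned above is also recorded inside the .docx package itself: `word/settings.xml` carries a `w:attachedTemplate` element whose relationship ID points at the template path in `word/_rels/settings.xml.rels`. This is a different technique from the Word object model calls in this answer. As a hedged sketch, it can be read without automating Word like this (Python, standard library only; the function name `attached_template` is made up for illustration):

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG = "http://schemas.openxmlformats.org/package/2006/relationships"

SETTINGS = "word/settings.xml"
SETTINGS_RELS = "word/_rels/settings.xml.rels"


def attached_template(docx):
    """Return the attached template target stored in a .docx, or None."""
    with zipfile.ZipFile(docx) as package:
        names = package.namelist()
        if SETTINGS not in names or SETTINGS_RELS not in names:
            return None
        settings = ET.fromstring(package.read(SETTINGS))
        element = settings.find(f"{{{W}}}attachedTemplate")
        if element is None:
            return None
        # The element only holds a relationship ID; the path lives in the rels part.
        rel_id = element.get(f"{{{R}}}id")
        rels = ET.fromstring(package.read(SETTINGS_RELS))
        for rel in rels.iter(f"{{{PKG}}}Relationship"):
            if rel.get("Id") == rel_id:
                return rel.get("Target")
    return None
```

Calling `attached_template("letter.docx")` on a hypothetical file returns the stored target (typically a `file:///` path to the .dotx) or `None`. A document based on the default Normal template may carry no such element at all.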
I want to create a recipe application. I would like the input to look and feel like you are editing a document. If I were doing this in Word, I'd create a template form for the user to use for the input.
The form will look something like:
{Category} {Title}
{Image} {yield / nutrition info}
Ingredients
{bulleted list goes here}
Directions
{Numbered list goes here}
Notes / Comments
{Free form text goes here}
I tried doing this with a FlowDocument embedded in a RichTextBox, but could not figure it out. I can store the info and populate the FlowDocument parts easily enough, but I could not figure out how to control editing to force bullets or numbering at certain places, keep the user from changing the format, etc.
Can this be done in a FlowDocument? If not, how can I create the bulleted / numbered list areas?
Flow Documents are editable as long as you use RichTextBox as opposed to Page.
You might want to take a look at this or this or even this.
I ended up creating custom controls for the lists (a custom grid supporting bulleted or numbered editable lists) and using other controls for the various document parts to give me the control I want. I use an XML file for storing the pieces of the document and how to generate the FlowDocument (I hope this makes it easy to update the templates when I am asked to add something new). I only generate a FlowDocument for printing purposes.
I did not get all the functionality I wanted, but I made it work. Now for my next project...
It is possible to export Microsoft Visio drawings as a Website containing Silverlight content. This is described on this blog-post.
The output of such an export is the following:
xaml_1.xaml (contains the structure of the control)
data.xml (contains all text content such as labels, etc)
several JavaScript files
*.htm pages with a Silverlight container
other files such as *.css and images
I would like to integrate the exported XAML code into another existing Silverlight application. I found this blog-post telling me how to load XAML code dynamically during runtime.
What I would like to know is how to "merge" the XAML-file and the data.xml and how I can get a reference to the items of the XAML code, in order to change certain texts...
In the associated xaml js file (eg xaml_1.js) there's a handleMouseUp function that reads the shape ID from the (XAML) 'name' string and then calls OnShapeClick in frameset.js. This method, which is common to all of the js-based Save as web output types, then calls other methods to populate the details table or retrieve hyperlinks found in data.xml. If you have a look at the FindShapeXML function in frameset.js you'll see that it gets the appropriate data based on the page and shape IDs (note that shape IDs are unique to a page as per Visio itself).
In terms of creating data-bound or dynamic shape text, one workaround for the glyphs issue that @slfan highlights is to prevent the text from being output. For example, prior to running Save As Web in Visio, you could loop through all of the shapes and set their HideText ShapeSheet cell to true. This will prevent all of the glyphs XAML from being generated, and you'll still have access to the text string in data.xml. I guess you wouldn't then benefit from the correct font scaling, but it depends on your scenario. If it was really important to get the scale right then you could parse the RenderTransform attribute (which is described in attribute syntax rather than property element syntax) of the glyph elements.
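As a rough sketch of that last step: in XAML attribute syntax a transform is written as six matrix values, `M11,M12,M21,M22,OffsetX,OffsetY`. The code below (Python, standard library only) parses such a string and derives the scale factors. It assumes the exported glyph elements use this six-number form, which should be checked against the actual xaml_1.xaml; the function names and the sample value are made up for illustration.

```python
import math
import re


def parse_matrix(value):
    """Parse 'M11,M12,M21,M22,OffsetX,OffsetY' (comma or space separated)."""
    numbers = [float(n) for n in re.split(r"[,\s]+", value.strip())]
    if len(numbers) != 6:
        raise ValueError(f"expected 6 matrix values, got {len(numbers)}")
    return tuple(numbers)


def scale_factors(matrix):
    """Return (scale_x, scale_y) as the lengths of the transformed axes."""
    m11, m12, m21, m22, _offset_x, _offset_y = matrix
    return math.hypot(m11, m12), math.hypot(m21, m22)


# Hypothetical attribute value: uniform scale of 2, moved by (10, 20).
print(scale_factors(parse_matrix("2,0,0,2,10,20")))  # (2.0, 2.0)
```

Using the axis lengths rather than reading M11 and M22 directly keeps the result meaningful if the transform also contains a rotation.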
Glyphs are there (I'm guessing) because they mirror how Visio works in the application, i.e. in Visio you can select individual characters within a shape's text and apply different fonts and formatting. If you don't need that, I'd be tempted to ditch the glyphs collection and just use a TextBlock as @slfan suggests.
I think you have to tweak the generated XAML a little bit. Unfortunately Visio generates glyphs for every single character. If you want to change the text at runtime, you will have to remove these glyphs and add the required controls (e.g. TextBlock) yourself.
You can load the XAML into Silverlight with XamlReader.Load. You can find a good description here: http://blogs.silverlight.net/blogs/msnow/archive/2008/10/09/silverlight-tip-of-the-day-60-how-to-load-a-control-straight-from-xaml.aspx.
You can ignore all the JavaScript and HTML files; you need the XML file to identify your controls. The IDs in the XML refer to the corresponding elements in the XAML file.
We are developing a C# .NET 4.0 Windows Forms application that uses the Open XML SDK 2.0 to manipulate MS Word files. I want to get all the paragraphs on a particular page: the user is prompted for a page number of the Word file, and I need all the paragraphs on that page. How do I do it?
Taking a quick look at the underlying XML, it doesn't look like there is an attribute on the paragraph element that will tell you which page it will appear on. The best suggestion I can give you is to have some placeholder text at the top and bottom of each page. Then search for a certain instance of the placeholder text based on which page the user specifies. Once you have a starting point you could retrieve all paragraphs between the two placeholder paragraph elements.
For example, if a user enters page two, you would search for the third instance of a paragraph that contains this placeholder text (the first two belong to page one) and then retrieve all paragraphs until you reach the next instance of the placeholder text. I know this isn't ideal, but it's one workaround I could think of that might be feasible.
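The counting in this workaround can be sketched as follows, in Python for brevity. It assumes the paragraph texts have already been read from the document in order (for example with the Open XML SDK); the marker string `[[PAGE]]` and the function name `paragraphs_on_page` are made up for illustration.

```python
def paragraphs_on_page(paragraphs, page, marker):
    """Return the paragraphs between the top and bottom markers of a 1-based page."""
    marker_positions = [i for i, text in enumerate(paragraphs) if marker in text]
    # Each page has two markers, so page n starts at marker 2n-1 and ends at 2n
    # (page 2 starts at the third marker).
    top_instance = 2 * page - 1
    bottom_instance = 2 * page
    if page < 1 or bottom_instance > len(marker_positions):
        raise ValueError(f"page {page} not found")
    start = marker_positions[top_instance - 1]
    end = marker_positions[bottom_instance - 1]
    return paragraphs[start + 1:end]


# Hypothetical two-page document, already flattened to paragraph texts.
doc = ["[[PAGE]]", "Intro", "[[PAGE]]", "[[PAGE]]", "Body A", "Body B", "[[PAGE]]"]
print(paragraphs_on_page(doc, 2, "[[PAGE]]"))  # ['Body A', 'Body B']
```

This only works if the markers really do sit at the page boundaries, which the generator of the document has to guarantee.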