We're using aspose.word to create a report from a word template direct to PDF format. Now we're stuck at handling the TOC for the document. We need the TOC to be dynamic and make correct changes according to where the section is on the document.
Would aspose handle putting the correct page number to the correct category? And can anyone give a short example? The example on the aspose site wasn't helpful in this case.
We're using visual studio to make this site and it says on the aspose documentation that the setting for a TOC is set first on a regular word document then copy the field codes. So I'm approaching it this way, setting the TOC on the template itself then copying the field code to the C# code. Or am I over thinking it and all I need to do is make sure that the TOC is correct on the template itself.
I worked with Aspose for a bit, and as I recall, they don't really do any repagination, which is what would be necessary to properly regenerate a TOC with proper page numbers.
You may need to put an call into them. As I recall, their support was pretty easy to get in touch with. I didn't end up using their product because of some of the limitations, but for general use, it seemed pretty decent.
Related
Am evaluating the Aspose.words for one of my client, almost all of the feature i have migrated from MS Word library to Aspose.word library. Just one more to go, but am struggling to find the solution for the below:
We have Template document which is in .docx format. Template has a Two column page layout. at run time system would copy paste the content from other document to this Template document. still this steps works fine.
When i open the template page it looks good with 2 column layout.
But we have some logic that should read the last line of First Column & checks whether the text is in specific format, if it is then moves one line down which would automaticaly moves to the next column.
This logic is easily acheivable in Word but i couldn't find any refference in Aspose.words to implement this.
Also i tried to find different option by convering the document to Xml. & found that there is one node called . but this node is visble only when i save the document as xml From Microsoft word. Not occurs if i save the document as xml from Aspose.words.
Please advice me to solve this issue.
Thanks in advance
Gunasekara S
we just have finished integrating a feature into Aspose.Words to open up access to the rendering engine so that each element of the rendered document can be read as it appears as pages, columns, lines, spans etc. This functionality is exactly what you need and will be available in the next version of Aspose.Words which is expected to release in about a week's time. Soon I will be sharing the code snippet to accomplish your requirement.
My name is Nayyer and I am developer evangelist at Aspose.
I've been searching on ways to create letters using a C# Windows Form application in Visual Studio based on information from a local SQL server. I've seen some other topics but each answer seems to be really different.
My knowledge is pretty basic about this and wouldn't mind a step in the right direction. Is it actually possible? is there a better solution rather than creating word documents?
I only really need to be able to create a word document which has some text and and tables is this possible?
If you are creating a document of same kind with different data then use a Word template (.dotx) and use content controls or bookmarks in the document.
Advantage: Saves your time in manipulating with formatting and alignment in the document
Then use Open XML to just replace your content controls or bookmarks with the values.
Advantage: You dont need Word Interop assemblies to be deployed. Faster and recommended by Microsoft.
Depending on your needs, you could either simply copy a pre-existing file, or if you need to modify the document from code, you can use Microsoft Office Automation. (see C# office 2010 automation)
I'm starting to wonder if this is even possible. I've searched for solutions on Google and come up with nothing that works exactly how I'd like it to.
I think it'd benefit to explain what that entails. I work for database group at my university's IT department. My main job is to take specs of a report in a docx file, copy that over to dreamweaver, fix some formatting, and put it onto their website. My issue is that it's ridiculously tedious to do this over and over. I figured, hey, I haven't written anything in C# for some time now, perhaps I could write an application to grab a docx file, convert it to HTML, fix the CSS, stick the header, and footer from the webpage on there, and save the result. I originally planned to have it do one by one, but it probably wouldn't be difficult to have it input a list of files and batch convert.
I've found these relevant topics on how to accomplish this, but they don't fit my needs well enough.
http://www.techrepublic.com/blog/howdoi/how-do-i-modify-word-documents-using-c/190
This is probably fine for a few documents, but since it's just automating an instance of Word, I feel like it'd be slow and memory intensive. I'd prefer to avoid opening and closing an instance of Word 50+ times.
http://openxmldeveloper.org/articles/333.aspx
This is what I started using. XSLT had the benefit of not needing word to be installed nor ran for each file. After some searching I got a proof of concept working. It takes in a docx file, decompresses it, grabs the document.xml from that, and uses the DocX2Html.xsl file I scavenged from OpenXML viewer. I believe that was originally provided by MS for sharepoint servers to provide the ability to render word documents in a browser. Or something along those lines.
After adjusting that code to fit my needs, and having issues with the objXSLT.Load () method, I ended up using IlMerge to make the XSL into a DLL. No idea why I kept getting a compile error when using the plain old XSL file, but the DLL worked fine, so I was satisfied. Here (http://pastebin.com/a5HBAakJ) is my current code. It does the job of converting docx to HTML just fine (other than random spaces between some words), but the result file has ridiculously ugly HTML syntax. An example of this monstrosity can be found here (http://pastebin.com/b8sPGmFE).
Does anyone know how I could remedy this? I'm thinking perhaps I need to make a new XSL file, as the one MS provided is what's responsible for sticking all those tags and extra code in there. My issue with that is that I don't know anything about how to do that. Perhaps there's an alternative version already out there. All I'd need is one that will preserve tables and text formatting. Images aren't needed.
This looks like just what you need: http://msdn.microsoft.com/en-us/library/ff628051(v=office.14).aspx
The author Eric White blogged about his experiences developing that tool. You can see that list of posts on his blog here: http://blogs.msdn.com/b/ericwhite/archive/2008/10/20/eric-white-s-blog-s-table-of-contents.aspx#Open_XML_to_XHtml
Since I'm a big fan of Aspose.Words, a commercial library to create/process Word documents, I would do something like:
Open the Word document with Aspose.Words.
Save the Word document as HTML.
Use something like SgmlReader or HTML Agility Pack (or even Regular Expressions if it is suitable) to remove unwanted HTML tags/attributes.
Since you wrote you work at an university, I'm not sure whether commercial packages are an option, though.
Hi not sure what the rules are on promoting your own solutions, so do let me know if I am out of line.
I am a web developer who had the same issues, so I created my own tool:
http://www.convertwordtohtml.com
We are also working on a new version that will have even better conversion quality and one click conversion eg you can right click on a word file and it will be directly converted to html and the code placed into the clipboard. The current version also supports command line access and the new version will have a server version to.
There is a free trial version downloadable from the site , and if you have any questions do contact me any time.
I have an existing XPS file that I would like to use as a template and possibly bind data to it. I have tried several methods, but cannot seem to get it to work.
Does anyone have any experience altering an existing XPS file to add data at runtime and then print or save?
Any help is appreciated.
XPS documents conform to the Open XML standard. There is an SDK for working with these docs. Here is a How-to article by Beth Massi: "Accessing Open XML Document Parts with the Open XML SDK".
Since you are working with the internal doc structure you might also check out 'Open XML Package Editor" which lets you explore the doc with Visual Studio. Here is another How-to by Beth Massi: "Handy Visual Studio Add-In to View Office 2007 Files".
+tom
it's a bit of a challenge to do this with XPS, but it is possible.
You can do this with our NiXPS SDK.
I've posted an example on my blog a while ago:
XPS variable data example
Regards,
Nick
Bindings are evaluated during the process of writing to an XPS document. So you can't set up a {Binding} in a FixedDocument, Write that FD to an XpsDocument, and expect to get that original FD back again when you next open that saved doc.
Also, the standard XpsWriter does convert everything into Glyphs on canvases, so you can't, say, a textbox in the original and expect to be able to find it after its been saved to a document.
I've never used the NiXPS libraries, so if Nick says it can be done you might want to check it out.
One last possibility--You can create placeholders in a form that you will be able to find later. They'd have to be text (something like [[{{FORMFIELDHERELOL}}]]) with some kind of delimiter scheme to differentiate the text from everything else. You could then go spelunking in the XML looking for text that fits the delimeter pattern and switch out those glyphs for your binding text. Of course, the issue with THAT is that if you aren't putting X chars in place of X chars you might find you have to do some repositioning. As its all glyphs on canvas this might be slightly harder than, say, threading a needle with a shoelace.
I have been searching for several hours but i couldn't find anything about this... Basically I would like to create a template or plug-in for word 2007 that would allow someone to create new pages for a CMS. What I have in mind is something similar to blog post template. I know how to create a basic template but I can't find a way to publish the created document using a publish button inside the Word.
thnx in advance
I understand what you are trying to achieve, but Word is the wrong starting point. I would start with a much more basic text editor.
Word is horrible, horrible, horrible. Your site will define clear styles, yet Word will output nasty HTML that won't match your website's CSS definitions.
Your best bet therefore is to have a means to drop the Word file into the site, and have code programmatically analyse it and transform it into site-valid HTML. In Java you could use Apache POI, but that's very raw still. Might be a lot easier in a Microsoft centric world.
Far better, in my opinion, is to force people to learn Markdown, or BBCode, or HTML, or to use a Styled HTML Editor in your CMS - cut and paste plain text in, then style with the CMS defined styles.
As you are using Word 2007 you can export the document as XML and then use XSLT to generate the HTML.
If your CMS has an API or import facility you could convert the output from Word to suit that interface.
You can write a Word macro to add a Publish button/menu option to Word that will generate the correct output.
It's not a bad idea since it's all about the end user. If Word produces bad HTML, you should just make it semantic correct before posting it to the CMS.
I've never done this but I'm sure that it's possible to with .NET via the "Word 2007 Addin"-template (assuming Office 2007).
Good luck!
You can do what you want if you use SharePoint 2007 as your CMS. You can set up a blog on SharePoint 2007 and post to the blog from Word. If you use Office 2007 on the client end then you will get some nice buttons like "post to my blog" etc.
If you can't use SharePoint or are talking about an existing CMS, you have a lot of hurdles to jump through. This is a major undertaking and not something you can get a simple answer out of Stack Overflow.
Have you considered using one of the freely available Javascript WYSIWYG Editors such as TinyMCE http://tinymce.moxiecode.com/? When configured with all the options, it has an impressive amount of functionality and the interface is very similar to Word. I realize this doesn't directly answer your question, but as others have pointed out starting from Word is going to be difficult.
I've been on a team that wrote a Word addin for a custom CMS system. It was written in VB6 and was able to take a Word document and turn basic formatting information - lists, bold, italic and even tables into HTML, which was uploaded to the server. It didn't create new pages or manage the site in the addin though.
I would definitely avoid choosing Word as the editor for your CMS from my experience. The biggest issue is each time you want to update the addin you have to redistribute it to the company or companies using it. You can do this is as an IE active-x control but it's far easier just to handicap the user to a limited set of styling options via a Javascript editor.
Word does have a powerful API for manipulating your content with, however we needed to disable so many options in Word to avoid unwanted fonts and so on, it resembled Wordpad more than Word in the end.
If it's a greenfield project and you have the time, I would infact recommend using Silverlight 4.0 over a Javascript editor. Version 4.0 has a richtextbox control built in, plus there is also the excellent Vectorlight one.
May be it helps you, umbraco CMS allow editing with Microsoft Word.
For some reason this is a feature that Excel enjoys but not Word.
Excel can can automatically publish an HTML file version of your document when you save it.
Unfortunately Word seems to only be able to achieve this functionality when using Sharepoint, which is a shame because it can be quite useful.
What you can do, short of creating your own add-in is to add a bit of code to your template to create a HTML copy of your document whenever the user saves it.
First, make sure your template is macro-enabled (saved as .dotm file).
Second, while editing the template in Word, open the VBA code editor (ALT-F11)
In the project list double-click on your document to open its code-behind file.
Add the following bit of code to it, modifying the ActiveDocument.SaveAs path to something more appropriate to you, like a shared network folder where your CMS exposed by your CMS.
Sub FileSave()
' First Save the main document
ActiveDocument.Save
' Now we create a new document based on the current one
Selection.WholeStory
Selection.Copy
Documents.Add
Selection.PasteAndFormat wdPasteDefault
' Save it as HTML and close it
ActiveDocument.SaveAs "c:\temp\mydoc.html", fileformat:=wdFormatHTML
ActiveDocument.Close
End Sub
This will copy the original file into a blank new one that will be saved to HTML and closed before returning to the original file.
You can check some of the options to the Documents.Add if you want to use a different template than the normal one.
Security
because this template contains macros, you will have to install it with the other templates where Word expect them.
If you don't, then you'll get a security warning.
To avoid getting it, you can add the path where your templates are located to the list of Trusted Locations under Word's Options > Trust Center > Trust Center Settings > Trusted Locations.