I'm looking for a C# syntax highlighter that will take my C# code and turn it into standalone HTML that is neatly tagged. I have found some websites that offer this but only output HTML that is coupled with a CSS stylesheet. If anyone knows if what I'm describing exists please gimme a link!
I'm using Copy Code in HTML format with Visual Studio 2010 as part of Visual Studio 2010 Productivity Power Tools
The Python tool Pygments looks like it can do what you want and much more besides. I believe it supports many languages, including C#.
a wide range of common languages and markup formats is supported
special attention is paid to details that increase highlighting quality
support for new languages and formats are added easily; most languages use a simple regex-based lexing mechanism
a number of output formats is available, among them HTML, RTF, LaTeX and ANSI sequences
it is usable as a command-line tool and as a library
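To illustrate the "regex-based lexing" and "standalone HTML" ideas, here is a minimal sketch in plain Python. It is not Pygments and not how Pygments is implemented; the token rules, colours and the function name highlight_csharp are assumptions made up for this example. Pygments itself can produce inline styles through its HtmlFormatter with noclasses=True, which is probably what you want in practice.

```python
import html
import re

# Toy token rules for a small C# subset: (token name, regex, inline CSS).
# The rule set and colours are illustrative assumptions, not Pygments' own.
TOKEN_RULES = [
    ("comment", r"//[^\n]*", "color: #008000"),
    ("string", r'"(?:\\.|[^"\\\n])*"', "color: #a31515"),
    ("keyword",
     r"\b(?:class|using|namespace|public|private|static|void|int|string|return|new)\b",
     "color: #0000ff"),
]

# One combined pattern; each rule becomes a named group so we can tell which matched.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern, _ in TOKEN_RULES))
STYLES = {name: css for name, _, css in TOKEN_RULES}


def highlight_csharp(source):
    """Return an HTML fragment whose colours are carried by inline style attributes."""
    parts = []
    position = 0
    for match in MASTER.finditer(source):
        # Text between tokens is escaped and emitted unstyled.
        parts.append(html.escape(source[position:match.start()]))
        css = STYLES[match.lastgroup]
        parts.append(f'<span style="{css}">{html.escape(match.group())}</span>')
        position = match.end()
    parts.append(html.escape(source[position:]))
    return '<pre style="font-family: Consolas, monospace">' + "".join(parts) + "</pre>"


if __name__ == "__main__":
    print(highlight_csharp('public static void Main() { return; } // entry point'))
```

Because every span carries its own style attribute, the fragment can be pasted into a page or an email without any accompanying stylesheet.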
When I last used it, I got standalone HTML from the CopySourceAsHtml plugin to Visual Studio. (I stopped using it only because I prefer the linked CSS approach.)
http://copysourceashtml.codeplex.com/
The appeal of this plugin is that it will match the styling of your Visual Studio theme, whatever that may be.
Related
Is there a way to compile a LaTeX document with C#?
I want to write a standalone Windows application that produces a PDF file without requiring any other programs, such as MiKTeX, to be installed.
Thanks in advance.
Using LaTeX is similar to using a programming language or a markup language like HTML. You would have to reimplement everything from LaTeX that you want to support in your C# program. This might be possible for a very small subset of LaTeX, but you would be reinventing the wheel. LaTeX is the most extensive markup language around, with an enormous feature set and decades of development behind it, so it is not feasible to write a C# converter for it from scratch.
To put it bluntly, it would probably be easier for you to create your own C# compiler than to come up with a feature-complete LaTeX compiler.
If you can't change your input data, i.e. it has to be LaTeX source, then you should use one of the existing LaTeX converters. If you're looking for a way to convert text source files with some markup to PDF with C#, you are better off looking into a lightweight markup language like Markdown.
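Calling an existing converter usually comes down to starting an external process. A rough sketch in Python, assuming a TeX distribution that provides pdflatex is installed (or bundled) and on the PATH; the helper names build_pdflatex_command and compile_latex are made up for this example. The same idea carries over to C# with System.Diagnostics.Process.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdflatex_command(tex_file, output_dir):
    """Build the argument list for a non-interactive pdflatex run."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",          # do not stop and prompt on errors
        f"-output-directory={output_dir}",   # write the PDF and aux files here
        tex_file,
    ]


def compile_latex(tex_file, output_dir="."):
    """Run pdflatex and return the path where the PDF is expected."""
    if shutil.which("pdflatex") is None:
        raise FileNotFoundError("pdflatex was not found on PATH")
    # check=True raises CalledProcessError if pdflatex exits with an error.
    subprocess.run(build_pdflatex_command(tex_file, output_dir), check=True)
    return Path(output_dir) / (Path(tex_file).stem + ".pdf")
```

Note that this does not remove the dependency on a TeX installation; it only hides it behind your application.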
Is there a markup language that can be used in conjunction with a well-supported open-source .NET project to generate PDF or HTML documents, with very fine control over the output in terms of style and anchoring for both?
Documents will be partly static and partly auto-generated from the XML comments of some class libraries.
To clarify the question: I know HTML is a markup language. The reason I don't want to use it to store the content directly is that all of the HTML-to-PDF tools and libraries I have looked at have patchy support for creating tables of contents and indexes, and for turning hyperlinks into PDF document anchors.
I would opt for HTML documents. Markdown comes to mind. But as far as 'very fine' control goes, you can always just use HTML. It is THE HyperText Markup Language, after all.
There have been many questions like this on Stack Overflow before. I think the consensus is that you should have one markup language, rather than two.
HTML is - by definition (hypertext MARKUP LANGUAGE) - the markup language of choice and all you need to do is convert that to PDF. The other way around, from PDF to HTML is quite a bit tougher.
In order to convert HTML to PDF there's a truckload of tools, depending on what exact needs you have for the resulting PDF and what kind of CSS you need to support.
I'd always go for a rendering engine that's used in browsers (instead of something like iText or Prince), because you want to make sure your docs look like they do in a browser. You'd end up with Winnovative or something based on WebKit like the API by htm2pdf.
XSL-FO is the recommended solution. It provides a great level of control over the document layout and there are several tools for XSL-FO to PDF conversion.
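For a feel of what XSL-FO looks like, here is a small sketch that builds an FO document with a linked table of contents, using only Python's standard library. The function name build_fo_document, the page geometry and the section data are assumptions for illustration. The resulting file would then be handed to an XSL-FO processor (Apache FOP is one example) to produce the PDF.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Qualify a tag name with the XSL-FO namespace."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(sections):
    """sections is a list of (anchor_id, title, body_text) tuples."""
    root = ET.Element(fo("root"))

    # Page layout: a single A4 page master with a body region.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4", "page-height": "29.7cm",
        "page-width": "21cm", "margin": "2cm"})
    ET.SubElement(page, fo("region-body"))

    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})

    # Table of contents: each entry links to the id of a heading block.
    for anchor_id, title, _ in sections:
        entry = ET.SubElement(flow, fo("block"))
        link = ET.SubElement(entry, fo("basic-link"), {"internal-destination": anchor_id})
        link.text = title

    # Content: the id attribute on the heading is the anchor the links point to.
    for anchor_id, title, body in sections:
        heading = ET.SubElement(flow, fo("block"), {
            "id": anchor_id, "font-size": "16pt",
            "font-weight": "bold", "space-before": "12pt"})
        heading.text = title
        paragraph = ET.SubElement(flow, fo("block"))
        paragraph.text = body

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_fo_document([("intro", "Introduction", "Static text."),
                             ("api", "API reference", "Generated from XML comments.")]))
```

The internal-destination / id pair is what becomes a clickable anchor inside the PDF, which is the part the HTML-to-PDF tools mentioned in the question tend to handle poorly.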
I searched the web and found a few results, but none of them seems to fit the task. I was looking for possibilities for .NET, but would also like to know how Java/PHP/etc. developers handle tasks like this.
As far as I found out, I have the option to:
1.) Use MigraDoc/PDFSharp and go the "code" way, without any visual designer
2.) Use HTML and convert it to a PDF (which is the best approach in theory, but in practice it's awful to get good-looking HTML 1:1 into a PDF file)
3.) Use some weird MS Word templating/batch stuff
4.) LaTeX?
What are your solutions?
We use SoftArtisans OfficeWriter
A solution that we settled on in a previous project was XSL-FO. Although it did not have a visual designer, we found it to be very developer-friendly and more suitable for running in a server-type environment. It also deals with document "flow" a lot better than most of the reporting software that offers a designer. I do know that we had a lot of trouble with Crystal Reports around deployment, COM exceptions being thrown, and limitations on how many reports can be generated concurrently. One downside to using XSL-FO is all the syntactic sugar that comes with XML.
This question lists a few XSL-FO engines.
Regarding your "3.) weird MS Word templating/batch stuff":
I love to use Aspose.Words, a commercial package to create/edit/export Microsoft Office Word documents, without any Office components being installed.
Aspose.Words is capable of doing Mail Merge stuff and write PDF files, so I often start on my desktop computer with a DOC that I edit in Word and use this with Aspose.Words on my server to produce PDFs.
One method I've used before for Windows desktop applications is to use XAML/WPF. The nice thing about this solution is that there are a lot of good tools and documentation around building layouts with XAML. Then you just pass the canvas to a PrintDialog and you're done. If you've been doing a lot with WPF/XAML already this is a very easy solution and I've had a lot of success with it. I learned most of what I needed to get started here: http://www.switchonthecode.com/tutorials/printing-in-wpf
The downside, of course, is your dependency then on .NET and WPF.
Similar to Matt F's solution of using Crystal Reports, I use SQL Server Reporting Services. You can add an .rdlc file to your solution and use the WYSIWYG editor to design your report. Then, in your code, all you have to do is assign your data source to your report and it should work. This even supports exporting to PDF.
It seems no one has mentioned LaTeX-based solutions. There was a Stack Overflow TeX question answered by jason. Short version: it uses MiKTeX and produces beautiful documents, but it is a big hassle to use, build and maintain.
Thanks for all your answers...
I finally decided to implement my own solution using Visual Studio 2010 and the Office tools... This is not the "perfect" solution, but it was easy and fast to implement, while I still have the flexibility to change the documents with Excel or Word...
Downside of course: You need Office installed.
It depends on how you get your template documents. For example, if you have others in your organization responsible for generating the "standard" invoice document, you'll probably have a solution that involves mail merges in the Microsoft Word API, because you need to work with Word-formatted input files. Alternatively, if you are merely given the specs for the appearance of the document ("Logo in the top-right, 5 inches down, then a horizontal line two inches below that, then... etc."), you could do it entirely in code. Even if you're designing a solution from scratch, take into account who your document suppliers WILL be, and plan accordingly. Finally, if this is from scratch for a small set of documents that won't change much (e.g., you're starting your own software company and want to send invoices), don't do it. Just buy something off the shelf or use Word :)
We use XAML FixedPage, which can be edited in a designer like Kaxaml. It has a lot of layout flexibility, and data binding works great with dynamic objects like Expando. In code we bind a DataContext and then render that to XPS. Since we need the final output to be PDF, we use GhostXPS, which is free but has to be executed in a separate process; there are third-party, fully managed XPS-to-PDF converters, though.
We use Crystal Reports which comes free with Visual Studio. You can easily create a report/document that is bound to a database or unbound.
For example you could suppress the header and footer, expand the details section to be approx. A4 size, then add either bound fields or use parameters for unbound content. Then at runtime for bound documents set the selection formula to only pull in data for one transaction or for unbound documents just pass in the parameters.
A nice feature of Crystal Reports is there are export features, so export to PDF, Word, etc. Also it's easy to auto print to a specified printer.
Crystal Reports can be a pain! The outsourced developers of our in-house software use DevExpress for works orders, invoices, etc., although I think it can be pricey.
For reports generated by the software, I ended up choosing to export to raw CSV, which of course can be opened by any spreadsheet software.
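A raw CSV export needs very little code. A minimal sketch in Python, with made-up field names and a hypothetical helper export_rows_to_csv; using a CSV library rather than joining strings by hand matters because values containing commas or quotes have to be quoted correctly.

```python
import csv
import io


def export_rows_to_csv(rows, fieldnames):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)   # values containing commas or quotes are quoted for us
    return buffer.getvalue()


if __name__ == "__main__":
    report = [{"invoice": 1001, "customer": "Smith, J.", "total": 99.5}]
    print(export_rows_to_csv(report, ["invoice", "customer", "total"]))
```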
I need to make an upload tool in which a Word document is converted to HTML for saving to a database. Any ideas?
I've written one (see the Doc to HTML Converter).
To implement it, I downloaded the PIAs for Word, which let me open a document using Word, and control the format in which Word then re-saves the document.
Alternatively (instead of doing it yourself) there are tools like mine (and others, more famous) which you can use (some of which don't even use Word).
I know this is an old post, but I just wrote an app that converts a Word doc to a usable web page. The app meets some of the requirements in the OP.
The app is WordWebNav (WWN). It's free and open-source.
WWN provides a Word VBA program that converts Word-docs to Word-HTML.
WWN also provides a Python program that converts the Word-HTML to a usable web-page:
It adds missing features to the Word-HTML, e.g., a navigation pane.
And, WWN fixes some common bugs in Word's HTML, e.g., mis-formatted lists, and overly-wide paragraphs.
The Python program uses a CLI, and it can be called externally.
If this is a client application and you have access to Word, why not automate Word? Word can save in HTML (although you will probably have to clean the HTML up a bit). However, I will warn you that this is not very portable; whoever is going to use the application will need to have the same version of Word you developed it with.
I have been searching for several hours but I couldn't find anything about this... Basically, I would like to create a template or plug-in for Word 2007 that would allow someone to create new pages for a CMS. What I have in mind is something similar to a blog post template. I know how to create a basic template, but I can't find a way to publish the created document using a publish button inside Word.
Thanks in advance.
I understand what you are trying to achieve, but Word is the wrong starting point. I would start with a much more basic text editor.
Word is horrible, horrible, horrible. Your site will define clear styles, yet Word will output nasty HTML that won't match your website's CSS definitions.
Your best bet therefore is to have a means to drop the Word file into the site, and have code programmatically analyse it and transform it into site-valid HTML. In Java you could use Apache POI, but that's very raw still. Might be a lot easier in a Microsoft centric world.
Far better, in my opinion, is to force people to learn Markdown, or BBCode, or HTML, or to use a Styled HTML Editor in your CMS - cut and paste plain text in, then style with the CMS defined styles.
As you are using Word 2007 you can export the document as XML and then use XSLT to generate the HTML.
If your CMS has an API or import facility you could convert the output from Word to suit that interface.
You can write a Word macro to add a Publish button/menu option to Word that will generate the correct output.
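As a rough sketch of the "export as XML, then transform to HTML" idea: the example below swaps XSLT for plain Python, because Python's standard library has no XSLT processor. It assumes the input uses the WordprocessingML namespace found in .docx files (word/document.xml), handles only paragraphs and bold runs, and the function name wordml_to_html is made up for this example.

```python
import html
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Word 2007 .docx documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def wordml_to_html(document_xml):
    """Convert paragraphs and bold runs to <p> and <strong>; ignore everything else."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for paragraph in root.iter(W + "p"):
        runs = []
        for run in paragraph.iter(W + "r"):
            text = "".join(node.text or "" for node in run.iter(W + "t"))
            if not text:
                continue
            escaped = html.escape(text)
            properties = run.find(W + "rPr")
            # Simplification: any <w:b> element counts as bold, even w:val="false".
            is_bold = properties is not None and properties.find(W + "b") is not None
            runs.append(f"<strong>{escaped}</strong>" if is_bold else escaped)
        if runs:  # skip empty paragraphs
            paragraphs.append("<p>" + "".join(runs) + "</p>")
    return "\n".join(paragraphs)
```

Emitting only a small whitelist of tags like this is also one way to end up with HTML that matches the CMS's own styles instead of Word's.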
It's not a bad idea since it's all about the end user. If Word produces bad HTML, you should just make it semantically correct before posting it to the CMS.
I've never done this, but I'm sure that it's possible with .NET via the "Word 2007 Addin" template (assuming Office 2007).
Good luck!
You can do what you want if you use SharePoint 2007 as your CMS. You can set up a blog on SharePoint 2007 and post to the blog from Word. If you use Office 2007 on the client end then you will get some nice buttons like "post to my blog" etc.
If you can't use SharePoint or are talking about an existing CMS, you have a lot of hurdles to jump through. This is a major undertaking and not something you can get a simple answer out of Stack Overflow.
Have you considered using one of the freely available JavaScript WYSIWYG editors, such as TinyMCE (http://tinymce.moxiecode.com/)? When configured with all the options, it has an impressive amount of functionality and the interface is very similar to Word. I realize this doesn't directly answer your question, but as others have pointed out, starting from Word is going to be difficult.
I've been on a team that wrote a Word addin for a custom CMS system. It was written in VB6 and was able to take a Word document and turn basic formatting information - lists, bold, italic and even tables into HTML, which was uploaded to the server. It didn't create new pages or manage the site in the addin though.
From my experience, I would definitely avoid choosing Word as the editor for your CMS. The biggest issue is that each time you want to update the add-in, you have to redistribute it to the company or companies using it. You can do this as an IE ActiveX control, but it's far easier just to restrict the user to a limited set of styling options via a JavaScript editor.
Word does have a powerful API for manipulating your content with, however we needed to disable so many options in Word to avoid unwanted fonts and so on, it resembled Wordpad more than Word in the end.
If it's a greenfield project and you have the time, I would in fact recommend using Silverlight 4.0 over a JavaScript editor. Version 4.0 has a RichTextBox control built in, plus there is also the excellent Vectorlight one.
Maybe this helps: Umbraco CMS allows editing with Microsoft Word.
For some reason this is a feature that Excel enjoys but not Word.
Excel can automatically publish an HTML version of your document when you save it.
Unfortunately, Word seems to be able to achieve this only when using SharePoint, which is a shame because it can be quite useful.
What you can do, short of creating your own add-in, is add a bit of code to your template that creates an HTML copy of your document whenever the user saves it.
First, make sure your template is macro-enabled (saved as .dotm file).
Second, while editing the template in Word, open the VBA code editor (ALT-F11).
In the project list double-click on your document to open its code-behind file.
Add the following bit of code to it, changing the ActiveDocument.SaveAs path to something more appropriate for you, such as a shared network folder exposed by your CMS.
Sub FileSave()
    ' Naming the macro FileSave makes Word run it in place of the built-in Save command.
    ' First save the main document
    ActiveDocument.Save
    ' Now create a new document based on the current one (this goes through the clipboard)
    Selection.WholeStory
    Selection.Copy
    Documents.Add
    Selection.PasteAndFormat wdPasteDefault
    ' The new document is now the active one: save it as HTML and close it
    ActiveDocument.SaveAs "c:\temp\mydoc.html", FileFormat:=wdFormatHTML
    ActiveDocument.Close
End Sub
This will copy the original file into a blank new one that will be saved to HTML and closed before returning to the original file.
You can check some of the options to the Documents.Add if you want to use a different template than the normal one.
Security
Because this template contains macros, you will have to install it with the other templates, where Word expects them.
If you don't, then you'll get a security warning.
To avoid getting it, you can add the path where your templates are located to the list of Trusted Locations under Word's Options > Trust Center > Trust Center Settings > Trusted Locations.