Convert HTML to PS with Ghostscript C#


I have a problem when I print an HTML file. I have tried .doc, .xls, and .txt files and they work perfectly, but when I give it an HTML file it shows me the print dialog and I have to select the Ghostscript printer for it to work.
My code is:
[DllImport("Winspool.drv")]
private static extern bool SetDefaultPrinter(string printerName);
[ValidateInput(false)]
public ActionResult CreatePdf(string file , string html)
{
SetDefaultPrinter("Ghostscript");
Process process1 = new Process();
if (html != null && html != "")
{ process1.StartInfo.FileName = "example.html"; }
else
{ process1.StartInfo.FileName = file; }
process1.EnableRaisingEvents = true;
process1.StartInfo.Verb = "print";
process1.StartInfo.Arguments = "\"Ghostscript PDF\"";
process1.StartInfo.WorkingDirectory = Server.MapPath("~" + "/Export");
process1.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
process1.StartInfo.CreateNoWindow = true;
process1.Start();
try
{
process1.WaitForExit();
}
catch (InvalidOperationException) { }
process1.Dispose();
}
This should update my output.ps file, which I then use to make the PDF file. That part works perfectly; I just need to make this work for HTML files.
I followed these two examples:
Example 1
Example 2
Edit:
I needed this conversion in order to get a PDF file from the HTML, and found that wkhtmltopdf suits me best.

Ghostscript does not convert (lay out and render) HTML documents to PDF or PostScript. It is just a library for working with PostScript and PDF files, such as creating them from scratch and converting PostScript files to a raster format.
If you want to convert HTML to PDF your best bet is to use a commercial library like PrinceXML, or host WebKit.
When your code works, it works by getting Internet Explorer (or whatever your shell-default web-browser is) to do the rendering and printing itself. This technique will not reliably work in a server-side environment.
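Since the question's edit settles on wkhtmltopdf, here is a minimal sketch of calling its command-line tool from C# instead of relying on the shell "print" verb. It assumes wkhtmltopdf is installed and that the caller supplies its path; the class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;

static class HtmlToPdfSketch
{
    // Assumption: wkhtmltopdfPath points at an installed wkhtmltopdf executable.
    public static void Convert(string wkhtmltopdfPath, string htmlPath, string pdfPath)
    {
        var psi = new ProcessStartInfo
        {
            FileName = wkhtmltopdfPath,
            // Basic usage is: wkhtmltopdf <input> <output>
            Arguments = "\"" + htmlPath + "\" \"" + pdfPath + "\"",
            UseShellExecute = false,   // run the executable directly, no shell, no dialog
            CreateNoWindow = true,
            RedirectStandardError = true
        };

        using (var process = Process.Start(psi))
        {
            // Read stderr before waiting so a full pipe cannot block the child process.
            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException("wkhtmltopdf failed: " + errors);
        }
    }
}
```

Because no browser or print dialog is involved, this approach is generally a better fit for server-side code than the "print" verb.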

Related

How to speed up libreoffice conversion of docx to pdf with headless portable version

I currently have the portable version of LibreOffice sitting in a folder and start it in headless mode to convert some docx files to PDF one by one. It takes about 20 seconds per PDF even though the documents are only about 9 pages long. I am not sure whether my code has a flaw or whether there is some way to optimize this, because waiting 20 to 30 seconds for each PDF to become available is a terrible user experience. I am only doing this conversion to display the docx in a PDF viewer (pdf.js), as there are no free solutions for rendering docx files in the browser unless they are uploaded to third-party servers (e.g. an iframe with Google's viewer). My code is as follows:
public static void ConvertToPDF(string PathToItemToConvert, string PathToLibrePortable)
{
bool converted = false;
try
{
string fileName = Path.GetFileName(PathToItemToConvert);
string fileDir = Path.GetDirectoryName(PathToItemToConvert);
var pdfProcess = new Process();
pdfProcess.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
pdfProcess.StartInfo.FileName = PathToLibrePortable;
pdfProcess.StartInfo.Arguments =
String.Format("--norestore --nofirststartwizard --headless --convert-to pdf \"{0}\""
, fileName);
pdfProcess.StartInfo.WorkingDirectory = fileDir;
pdfProcess.StartInfo.RedirectStandardOutput = true;
pdfProcess.StartInfo.RedirectStandardError = true;
pdfProcess.StartInfo.UseShellExecute = false;
pdfProcess.Start();
string output = pdfProcess.StandardOutput.ReadToEnd();
converted = true;
}
catch (Exception ex)
{
converted = false;
System.Diagnostics.Debug.WriteLine(ex.ToString());
}
}

How to print PDF documents or other printable files in UWP? [duplicate]

Here's the basic premise:
My user clicks some gizmos and a PDF file is spit out to his desktop. Is there some way for me to send this file to the printer queue and have it print to the locally connected printer?
string filePath = "filepathisalreadysethere";
SendToPrinter(filePath); //Something like this?
He will do this process many times. For each student in a classroom he has to print a small report card. So I generate a PDF for each student, and I'd like to automate the printing process instead of having the user generate a PDF, print, generate a PDF, print, and so on.
Any suggestions on how to approach this? I'm running on Windows XP with Windows Forms .NET 4.
I've found this StackOverflow question where the accepted answer suggests:
Once you have created your files, you can print them via a command line (you can using the Command class found in the System.Diagnostics namespace for that)
How would I accomplish this?

Adding a new answer to this, as the question of printing PDFs in .NET has been around for a long time and most of the answers pre-date the Google Pdfium library, which now has a .NET wrapper. I was researching this problem myself and kept coming up blank, trying hacky solutions like spawning Acrobat or other PDF readers, or running into commercial libraries that are expensive and have restrictive licensing terms. The Google Pdfium library and the PdfiumViewer .NET wrapper are open source, so they are a great solution for a lot of developers, myself included. PdfiumViewer is licensed under the Apache 2.0 license.
You can get the NuGet package here:
https://www.nuget.org/packages/PdfiumViewer/
and you can find the source code here:
https://github.com/pvginkel/PdfiumViewer
Here is some simple code that will silently print any number of copies of a PDF file from its filename. You can also load PDFs from a stream (which is how we normally do it), and you can easily figure that out by looking at the code or examples. There is also a WinForms PDF viewer control, so you can render PDF files into a view or do print preview on them. We simply needed a way to silently print the PDF file to a specific printer on demand.
public bool PrintPDF(
string printer,
string paperName,
string filename,
int copies)
{
try {
// Create the printer settings for our printer
var printerSettings = new PrinterSettings {
PrinterName = printer,
Copies = (short)copies,
};
// Create our page settings for the paper size selected
var pageSettings = new PageSettings(printerSettings) {
Margins = new Margins(0, 0, 0, 0),
};
foreach (PaperSize paperSize in printerSettings.PaperSizes) {
if (paperSize.PaperName == paperName) {
pageSettings.PaperSize = paperSize;
break;
}
}
// Now print the PDF document
using (var document = PdfDocument.Load(filename)) {
using (var printDocument = document.CreatePrintDocument()) {
printDocument.PrinterSettings = printerSettings;
printDocument.DefaultPageSettings = pageSettings;
printDocument.PrintController = new StandardPrintController();
printDocument.Print();
}
}
return true;
} catch {
return false;
}
}

You can tell Acrobat Reader to print the file using (as someone's already mentioned here) the 'print' verb. You will need to close Acrobat Reader programmatically after that, too:
private void SendToPrinter()
{
ProcessStartInfo info = new ProcessStartInfo();
info.Verb = "print";
info.FileName = @"c:\output.pdf";
info.CreateNoWindow = true;
info.WindowStyle = ProcessWindowStyle.Hidden;
Process p = new Process();
p.StartInfo = info;
p.Start();
p.WaitForInputIdle();
System.Threading.Thread.Sleep(3000);
if (false == p.CloseMainWindow())
p.Kill();
}
This opens Acrobat Reader and tells it to send the PDF to the default printer, and then shuts down Acrobat after three seconds.
If you are willing to ship other products with your application then you could use GhostScript (free), or a command-line PDF printer such as http://www.commandlinepdf.com/ (commercial).
Note: the sample code opens the PDF in the application currently registered to print PDFs, which is Adobe Acrobat Reader on most people's machines. However, it is possible that they use a different PDF viewer such as Foxit (http://www.foxitsoftware.com/pdf/reader/). The sample code should still work, though.

I know the tag says Windows Forms... but, if anyone is interested in a WPF application method, System.Printing works like a charm.
var file = File.ReadAllBytes(pdfFilePath);
var printQueue = LocalPrintServer.GetDefaultPrintQueue();
using (var job = printQueue.AddJob())
using (var stream = job.JobStream)
{
stream.Write(file, 0, file.Length);
}
Just remember to include System.Printing reference, if it's not already included.
Note that this method does not play well with ASP.NET or Windows services. It is also not intended for Windows Forms, which has its own printing API in System.Drawing.Printing. I haven't had a single issue with PDF printing using the above code.
I should mention, however, that if your printer does not support Direct Print for PDF file format, you're out of luck with this method.

The following code snippet is an adaptation of Kendall Bennett's code for printing pdf files using the PdfiumViewer library. The main difference is that a Stream is used rather than a file.
public bool PrintPDF(
string printer,
string paperName,
int copies, Stream stream)
{
try
{
// Create the printer settings for our printer
var printerSettings = new PrinterSettings
{
PrinterName = printer,
Copies = (short)copies,
};
// Create our page settings for the paper size selected
var pageSettings = new PageSettings(printerSettings)
{
Margins = new Margins(0, 0, 0, 0),
};
foreach (PaperSize paperSize in printerSettings.PaperSizes)
{
if (paperSize.PaperName == paperName)
{
pageSettings.PaperSize = paperSize;
break;
}
}
// Now print the PDF document
using (var document = PdfiumViewer.PdfDocument.Load(stream))
{
using (var printDocument = document.CreatePrintDocument())
{
printDocument.PrinterSettings = printerSettings;
printDocument.DefaultPageSettings = pageSettings;
printDocument.PrintController = new StandardPrintController();
printDocument.Print();
}
}
return true;
}
catch (System.Exception e)
{
return false;
}
}
In my case I am generating the PDF file using a library called PdfSharp and then saving the document to a Stream like so:
PdfDocument pdf = PdfGenerator.GeneratePdf(printRequest.html, PageSize.A4);
pdf.AddPage();
MemoryStream stream = new MemoryStream();
pdf.Save(stream);
MemoryStream stream2 = new MemoryStream(stream.ToArray());
One thing that I want to point out that might be helpful to other developers is that I had to install the 32 bit version of the Pdfium native DLL in order for the printing to work even though I am running Windows 10 64 bit. I installed the following two NuGet packages using the NuGet package manager in Visual Studio:
PdfiumViewer
PdfiumViewer.Native.x86.v8-xfa

The easy way:
var pi = new ProcessStartInfo(@"C:\file.docx"); // verbatim string, otherwise \f is read as a form-feed escape
pi.UseShellExecute = true;
pi.Verb = "print";
var process = System.Diagnostics.Process.Start(pi);

This is a slightly modified solution. The process is killed once it has been idle for at least 1 second. You may want to add a timeout of X seconds and call the function from a separate thread.
private void SendToPrinter()
{
ProcessStartInfo info = new ProcessStartInfo();
info.Verb = "print";
info.FileName = @"c:\output.pdf";
info.CreateNoWindow = true;
info.WindowStyle = ProcessWindowStyle.Hidden;
Process p = new Process();
p.StartInfo = info;
p.Start();
long ticks = -1;
while (ticks != p.TotalProcessorTime.Ticks)
{
ticks = p.TotalProcessorTime.Ticks;
Thread.Sleep(1000);
}
if (false == p.CloseMainWindow())
p.Kill();
}
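The timeout suggested above could look roughly like the following sketch. It uses Process.WaitForExit(int) with a caller-chosen limit rather than polling CPU time; the class name, method name, and parameters are hypothetical.

```csharp
using System.Diagnostics;

static class PrintWithTimeout
{
    // Returns true if the printing application exited on its own within timeoutMs.
    public static bool Print(string path, int timeoutMs)
    {
        var info = new ProcessStartInfo
        {
            FileName = path,
            Verb = "print",
            UseShellExecute = true,   // verbs are resolved by the shell
            CreateNoWindow = true,
            WindowStyle = ProcessWindowStyle.Hidden
        };

        using (var p = Process.Start(info))
        {
            // Process.Start can return null when the shell hands the file
            // to an application that is already running.
            if (p == null) return false;

            if (p.WaitForExit(timeoutMs)) return true;

            // Still running after the timeout: ask politely first, then force it.
            if (!p.CloseMainWindow()) p.Kill();
            return false;
        }
    }
}
```

Calling this from a worker thread or a Task keeps the UI responsive while the wait is in progress.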

System.Diagnostics.Process.Start can be used to print a document. Set UseShellExecute to True and set the Verb to "print".
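A small sketch of that idea, with one extra guard: ProcessStartInfo.Verbs lists the shell verbs registered for the file's extension, so it can be checked before asking for "print". The class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;

static class PrintVerbCheck
{
    // Returns false when no "print" verb is registered for this file type.
    public static bool TryPrint(string path)
    {
        var info = new ProcessStartInfo(path) { UseShellExecute = true };

        // Verbs is derived from the extension of FileName.
        bool canPrint = Array.Exists(info.Verbs,
            v => string.Equals(v, "print", StringComparison.OrdinalIgnoreCase));
        if (!canPrint) return false;

        info.Verb = "print";
        using (Process.Start(info)) { }   // dispose the handle, the job keeps running
        return true;
    }
}
```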

You can try with GhostScript like in this post:
How to print PDF on default network printer using GhostScript (gswin32c.exe) shell command
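As a rough illustration of the Ghostscript route, the sketch below shells out to gswin32c.exe with the mswinpr2 device and a %printer% output target. The executable path and printer name are assumptions supplied by the caller, and the class and method names are hypothetical; check the switches against your Ghostscript version.

```csharp
using System;
using System.Diagnostics;

static class GhostscriptPrintSketch
{
    // Assumption: gsPath points at gswin32c.exe (or gswin64c.exe) on this machine.
    public static void Print(string gsPath, string printerName, string pdfPath)
    {
        string args =
            "-dBATCH -dNOPAUSE -dPrinted " +                   // run unattended
            "-sDEVICE=mswinpr2 " +                             // Windows printer device
            "-sOutputFile=\"%printer%" + printerName + "\" " + // target print queue
            "\"" + pdfPath + "\"";

        var psi = new ProcessStartInfo(gsPath, args)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(psi))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "Ghostscript exited with code " + process.ExitCode);
        }
    }
}
```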

I know Edwin answered it above but his only prints one document. I use this code to print all files from a given directory.
public void PrintAllFiles()
{
System.Diagnostics.ProcessStartInfo info = new System.Diagnostics.ProcessStartInfo();
info.Verb = "print";
System.Diagnostics.Process p = new System.Diagnostics.Process();
//Load Files in Selected Folder
string[] allFiles = System.IO.Directory.GetFiles(Directory);
foreach (string file in allFiles)
{
info.FileName = file;
info.CreateNoWindow = true;
info.WindowStyle = System.Diagnostics.ProcessWindowStyle.Hidden;
p.StartInfo = info;
p.Start();
}
//p.Kill(); Can Create A Kill Statement Here... but I found I don't need one
MessageBox.Show("Print Complete");
}
It essentially cycles through each file in the directory given by the Directory variable (for me it was @"C:\Users\Owner\Documents\SalesVaultTesting\") and prints those files to your default printer.

This is a late answer, but you could also use the File.Copy method of the System.IO namespace to send a file to the printer:
System.IO.File.Copy(filename, printerName);
This works fine
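For context, a sketch of what that call might look like in practice. It assumes printerName is the UNC path of a shared printer and that the file is already in a format the printer understands natively, since the bytes are sent as-is. The share name below is hypothetical.

```csharp
using System.IO;

static class RawCopyPrint
{
    // Hypothetical share name; replace it with a printer share that exists on your network.
    const string PrinterShare = @"\\PrintServer\SharedPrinter";

    public static void Send(string filename)
    {
        // The file is copied byte for byte to the print queue, with no rendering step.
        File.Copy(filename, PrinterShare);
    }
}
```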

You can use the DevExpress PdfDocumentProcessor.Print(PdfPrinterSettings) Method.
public void Print(string pdfFilePath)
{
if (!File.Exists(pdfFilePath))
throw new FileNotFoundException("No such file exists!", pdfFilePath);
// Create a Pdf Document Processor instance and load a PDF into it.
PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor();
documentProcessor.LoadDocument(pdfFilePath);
if (documentProcessor != null)
{
PrinterSettings settings = new PrinterSettings();
//var paperSizes = settings.PaperSizes.Cast<PaperSize>().ToList();
//PaperSize sizeCustom = paperSizes.FirstOrDefault<PaperSize>(size => size.Kind == PaperKind.Custom); // finding paper size
settings.DefaultPageSettings.PaperSize = new PaperSize("Label", 400, 600);
// Print pdf
documentProcessor.Print(settings);
}
}

public static void PrintFileToDefaultPrinter(string FilePath)
{
try
{
var file = File.ReadAllBytes(FilePath);
var printQueue = LocalPrintServer.GetDefaultPrintQueue();
using (var job = printQueue.AddJob())
using (var stream = job.JobStream)
{
stream.Write(file, 0, file.Length);
}
}
catch (Exception)
{
throw;
}
}

Is there any free library to convert doc to pdf without using Microsoft.Office.Interop.Word in a C# environment

I have a problem: how can I convert doc to pdf in C# without using Microsoft.Office.Interop.Word? I have tried some third-party solutions, like Spire.Doc, but they are not free. I also found DocX_Doc on NuGet, but there seems to be no tutorial for it. Does anyone know a free solution for this problem, or have any instructions for DocX_Doc? Thanks a lot.

You can use LibreOffice, which is free and open source (licensed under the Mozilla Public License 2.0): https://www.libreoffice.org/
I have already tested it and it works perfectly. You just need the soffice.exe binary to convert to PDF; it can also convert docx to images and other formats.
Here is the example code that I tested:
static string getLibreOfficePath()
{
switch (Environment.OSVersion.Platform)
{
case PlatformID.Unix:
return "/usr/bin/soffice";
case PlatformID.Win32NT:
string binaryDirectory =
System.IO.Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
return @"C:\Program Files\LibreOffice\program\soffice.exe";
default:
throw new PlatformNotSupportedException("Your OS is not supported");
}
}
static void Main(string[] args)
{
string libreOfficePath = getLibreOfficePath();
ProcessStartInfo procStartInfo = new ProcessStartInfo(libreOfficePath,
string.Format("--convert-to pdf C:\\test.docx")); //test.docx => input path
procStartInfo.RedirectStandardOutput = true;
procStartInfo.UseShellExecute = false;
procStartInfo.CreateNoWindow = true;
procStartInfo.WorkingDirectory = Environment.CurrentDirectory;
Process process = new Process() { StartInfo = procStartInfo, };
process.Start();
process.WaitForExit();
// Check for failed exit code.
if (process.ExitCode != 0)
{
throw new LibreOfficeFailedException(process.ExitCode);
}
}
I hope it's helpful for you.
Thanks.

Update:
As mentioned above by @saleem, you need to use https://www.libreoffice.org/
The following solution is obsolete as those libraries are not free anymore:
For DocX library DocX and here is a sample on how to convert from word to PDF Converting .docx into (.doc, .pdf, .html) "Free"
You can check this DLL as well.
I use SautinSoft.UseOffice to do this, it's simple and easy to use but costs about 350$. Here is the link to a full tutorial:
Convert DOC (DOCX) file to PDF file in C# - Step by Step "Not Free"

Converting an HTML page to PDF with a third-party tool fails with "Conversion error: Could not open url"

When converting HTML code to PDF with SelectPdf, I get the error "Conversion error: Could not open url". What base URL do I have to give?
using SelectPdf;
public partial class HtmlcodePrint : System.Web.UI.Page
{
string TxtHtmlCode;
protected void Page_Load(object sender, EventArgs e)
{
if (!IsPostBack)
{
TxtHtmlCode = @"<html>
<body>
Hello World from selectpdf.com.
</body>
</html>
";
}
}
protected void Btndownloadpdf_Click(object sender, EventArgs e)
{
// read parameters from the webpage
string htmlString = TxtHtmlCode;
string baseUrl = "http://localhost:51868/HtmlcodePrint.aspx";
string pdf_page_size ="A4";
PdfPageSize pageSize = (PdfPageSize)Enum.Parse(typeof(PdfPageSize),
pdf_page_size, true);
string pdf_orientation = "Portrait";
PdfPageOrientation pdfOrientation =
(PdfPageOrientation)Enum.Parse(typeof(PdfPageOrientation),
pdf_orientation, true);
int webPageWidth = 1024;
try
{
webPageWidth = Convert.ToInt32("1024");
}
catch { }
int webPageHeight = 0;
try
{
webPageHeight = Convert.ToInt32("777");
}
catch { }
// instantiate a html to pdf converter object
HtmlToPdf converter = new HtmlToPdf();
// set converter options
converter.Options.PdfPageSize = pageSize;
converter.Options.PdfPageOrientation = pdfOrientation;
converter.Options.WebPageWidth = webPageWidth;
converter.Options.WebPageHeight = webPageHeight;
// create a new pdf document converting an url
PdfDocument doc = converter.ConvertHtmlString(htmlString, baseUrl);
// save pdf document
doc.Save(Response, false, "Sample.pdf");
// close pdf document
doc.Close();
}
}

I know this is old, but I've been working with SelectPdf for a couple of days, so I'll throw in my 2 cents.
You probably don't need a baseUrl...
You don't have to give any baseUrl at all to the ConvertHtmlString function. You can just pass it the html string you want to convert and that's it.
Unless...
You only need to pass it a baseUrl if the html you're converting has relative paths in the external references (example: if you were referencing a stylesheet and wanted to use a relative path, you could provide the baseUrl to show where you wanted the stylesheet to be relative to). It's just so the converter can create the full absolute paths from the relative paths.
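To make the relative-path point concrete, here is a small sketch using System.Uri to show how a relative reference combines with a base URL. This illustrates general URL resolution, not SelectPdf's internals; the stylesheet path "css/site.css" is a hypothetical example.

```csharp
using System;

class BaseUrlDemo
{
    static void Main()
    {
        // The base URL from the question.
        var baseUri = new Uri("http://localhost:51868/HtmlcodePrint.aspx");

        // A relative reference such as <link href="css/site.css"> resolves against it.
        var resolved = new Uri(baseUri, "css/site.css");

        Console.WriteLine(resolved); // prints http://localhost:51868/css/site.css
    }
}
```

If the HTML contains only absolute URLs, or no external references at all, there is nothing to resolve and the base URL can be left out.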
So...
If you don't need that functionality or just don't have external references in your html, then you can just use
converter.ConvertHtmlString(htmlString);
Also...
doc.Save(Response, false, "Sample.pdf");
may not be what you're looking for either. I only say this because the comments look like the same ones in the examples on the SelectPdf site, so I'm assuming you copied the code from there (which is what I originally did too). If so, you don't have to save your PDF doc with that particular version of Save. It has 4 overloads that let you save your doc as:
a byte array (default)
a stream
a file
an HTTP response (the one you're using now, as shown in the examples from the site)
So, like I pointed out, you're using the one that saves the PDF as a HTTP response, so if you're wanting to save it as an actual PDF file directly, you'll need to change it to
doc.Save(fileName)
with the fileName variable as the absolute or relative path or file name you want to save the PDF to.
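A minimal sketch of the file and byte-array variants described above, assuming the SelectPdf package is referenced and using only the calls already mentioned in this thread. The class and method names are hypothetical.

```csharp
using SelectPdf;

static class SavePdfSketch
{
    // Saves the converted document straight to disk.
    public static void SaveToFile(string html, string fileName)
    {
        HtmlToPdf converter = new HtmlToPdf();
        PdfDocument doc = converter.ConvertHtmlString(html); // no baseUrl needed here
        try
        {
            doc.Save(fileName);
        }
        finally
        {
            doc.Close(); // release the document even if saving fails
        }
    }

    // Returns the PDF as a byte array, e.g. to store it or send it elsewhere.
    public static byte[] SaveToBytes(string html)
    {
        HtmlToPdf converter = new HtmlToPdf();
        PdfDocument doc = converter.ConvertHtmlString(html);
        try
        {
            return doc.Save();
        }
        finally
        {
            doc.Close();
        }
    }
}
```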
Hope this helps

Invoking OLE in C# through OleCreateFromFile does not work for pdf files

I am trying to embed PDF files into an Open XML document. This requires creating *.bin files. I don't want to use automation.
The approach I've taken from this question works for all file types I've tested except *.pdf.
For some reason, for PDF files OleCreateFromFile(..) always returns 0x80004005 and pOle is NULL.
I am new to the field of P/Invoke and OLE. What could be the reason for this approach not working for PDF?
(I have the newest Adobe Reader and Win8, I am invoking into Ole32.dll, the project's build target is x86, and I've tested calling CoUninitialize() and CoInitializeEx((System.IntPtr)null, OLE32.CoInit.ApartmentThreaded). I am able to embed PDF files in the MS Word application.)
Here is a function that I use for it:
public static string ExportOleFile(string _inputFileName, string oleOutputFileName, string emfOutputFileName)
{
StringBuilder resultString = new StringBuilder();
string newInput = MultibyteToUnicodeNETOnly(_inputFileName, 1252);
Microsoft.VisualStudio.OLE.Interop.IStorage storage;
var result = OLE32.StgCreateStorageEx(oleOutputFileName,
Convert.ToInt32(OLE32.STGM.STGM_READWRITE | OLE32.STGM.STGM_SHARE_EXCLUSIVE | OLE32.STGM.STGM_CREATE | OLE32.STGM.STGM_TRANSACTED),
Convert.ToInt32(OLE32.STGFMT.STGFMT_DOCFILE),
0,
IntPtr.Zero,
IntPtr.Zero,
ref OLE32.IID_IStorage,
out storage
); // creates the .bin file
resultString.AppendLine("CreateStorageEx Result: " + result.ToString());
var CLSID_NULL = Guid.Empty;
Microsoft.VisualStudio.OLE.Interop.FORMATETC f = new FORMATETC();
Microsoft.VisualStudio.OLE.Interop.IOleObject pOle;
result = OLE32.OleCreateFromFile(
ref CLSID_NULL,
newInput,
ref OLE32.IID_IOleObject,
(uint)Microsoft.VisualStudio.OLE.Interop.OLERENDER.OLERENDER_NONE,
ref f,
null,
storage,
out pOle
);
resultString.AppendLine("OleCreateFromFile Result: " + result.ToString());
try
{
result = OLE32.OleRun(pOle);
}
catch (Exception ex)
{
resultString.AppendLine(ex.ToString());
return resultString.ToString();
}
resultString.AppendLine("OleRun Result: " + result.ToString());
try
{
IntPtr unknownFromOle = Marshal.GetIUnknownForObject(pOle);
IntPtr unknownForDataObj;
Marshal.QueryInterface(unknownFromOle, ref OLE32.IID_IDataObject, out unknownForDataObj);
var pdo = Marshal.GetObjectForIUnknown(unknownForDataObj) as System.Runtime.InteropServices.ComTypes.IDataObject;
var fetc = new System.Runtime.InteropServices.ComTypes.FORMATETC();
fetc.cfFormat = (short)OLE32.CLIPFORMAT.CF_ENHMETAFILE;
fetc.dwAspect = System.Runtime.InteropServices.ComTypes.DVASPECT.DVASPECT_CONTENT;
fetc.lindex = -1;
fetc.ptd = IntPtr.Zero;
fetc.tymed = System.Runtime.InteropServices.ComTypes.TYMED.TYMED_ENHMF;
var stgm = new System.Runtime.InteropServices.ComTypes.STGMEDIUM();
stgm.unionmember = IntPtr.Zero;
stgm.tymed = System.Runtime.InteropServices.ComTypes.TYMED.TYMED_ENHMF;
pdo.GetData(ref fetc, out stgm);
var hemf = GDI32.CopyEnhMetaFile(stgm.unionmember, emfOutputFileName);
storage.Commit((int)OLE32.STGC.DEFAULT);
pOle.Close(0);
GDI32.DeleteEnhMetaFile(stgm.unionmember);
GDI32.DeleteEnhMetaFile(hemf);
}
catch (Exception ex)
{
resultString.AppendLine(ex.ToString());
return resultString.ToString();
}
return resultString.ToString();
}

Actually for embedding files in OpenXML, it is necessary to work with the good old OLE functions. There is no other way around as you need to get two pieces:
a file that is going to be embedded
a picture that shows the content of the file, usually a screenshot of the first page
I wrote a blog entry about that: Embed pdf into powerpoint by usage of openxml. This is not exactly your requirement, but it works identically.
There are two issues with pdfs when it comes to embedding:
Embedded pdf documents have a different content than the original pdf file. For all other OLE formats I know (excel, word, powerpoint, ...) this is not the case. You can just use the file on the hard disk, for pdf you cannot.
You need to take a picture of the first page. You could use pdfium or the like - there are quite some tools out there for rendering pdf, but adobe reader is free, and does the job 100%.
