Windows 8.1 Pro, Visual Studio 2015 Update 3, C#, .NET Framework 4.5. Ghostscript.NET (latest), GhostScript 9.20.
I'm converting a PDF to a PDF. Hah. Well, I'm turning an "editable" PDF into a "hard" PDF that can't be edited and is of lower quality. The process: I take the editable PDF, save each of its pages as a PNG file, convert those PNG files to a multipage TIFF, and then convert the multipage TIFF to the PDF I need.
This worked just fine with Visual Studio 2012, the previous version of Ghostscript.NET, and GS 9.10.
public static Tuple<string, List<string>> CreatePNGFromPDF(string inputFile, string outputfile)
{
Tuple<string, List<string>> t = null;
List<string> fileList = new List<string>();
string message = "Success";
string outputFileName = string.Empty;
int desired_x_dpi = 96;
int desired_y_dpi = 96;
try
{
using (GhostscriptViewer gsViewer = new GhostscriptViewer())
{
gsViewer.Open(inputFile);
using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer(gsViewer))
{
for (int pageNumber = 1; pageNumber <= rasterizer.PageCount; pageNumber++)
{
using (System.Drawing.Image img = rasterizer.GetPage(desired_x_dpi, desired_y_dpi, pageNumber))
{
outputFileName = outputfile.Replace(".png", string.Empty) + "_page_" + pageNumber.ToString() + ".png";
img.Save(outputFileName, ImageFormat.Png);
if (!fileList.Contains(outputFileName))
{
fileList.Add(outputFileName);
}
}
}
}
}
}
catch (Exception ex)
{
message = ex.Message;
}
t = new Tuple<string, List<string>>(message, fileList);
return t;
}
This now fails on this line:
using (System.Drawing.Image img = rasterizer.GetPage(desired_x_dpi, desired_y_dpi, pageNumber))
when processing the second page. The first page works okay.
I downloaded the source for GhostScript.NET, added it to my solution, debugged, etc., and spent a good long while trying to figure this out.
I then decided to separate out the functionality and make the bare minimum available for me to examine further in a simple Console application:
static void Main(string[] args)
{
int xDpi = 96;
int yDpi = 96;
string pdfFile = @"Inputfilenamehere.pdf";
GhostscriptVersionInfo gsVersionInfo = GhostscriptVersionInfo.GetLastInstalledVersion(GhostscriptLicense.GPL | GhostscriptLicense.AFPL, GhostscriptLicense.GPL);
List<GhostscriptVersionInfo> gsVersionInfoList = GhostscriptVersionInfo.GetInstalledVersions(GhostscriptLicense.GPL | GhostscriptLicense.AFPL);
try
{
using (GhostscriptViewer gsViewer = new GhostscriptViewer())
{
gsViewer.Open(pdfFile);
using (GhostscriptRasterizer gsRasterizer = new GhostscriptRasterizer(gsViewer))
{
int pageCount = gsRasterizer.PageCount;
for (int i = 0; i < pageCount; i++)
{
Image img = gsRasterizer.GetPage(xDpi, yDpi, i + 1);
}
}
}
}
catch(Exception ex)
{
Console.WriteLine(ex.Message);
}
}
Lo and behold, no problems. The difference is that I'm not putting the declaration of my Image in a using statement.
I always try to be a good boy developer and use a using statement whenever the class implements IDisposable.
So, I removed the using statement and I get the lower-quality PDFs that I've always desired. My life is good now.
using (GhostscriptViewer gsViewer = new GhostscriptViewer())
{
gsViewer.Open(inputFile);
using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer(gsViewer))
{
for (int pageNumber = 1; pageNumber <= rasterizer.PageCount; pageNumber++)
{
System.Drawing.Image img = rasterizer.GetPage(desired_x_dpi, desired_y_dpi, pageNumber);
outputFileName = outputfile.Replace(".png", string.Empty) + "_page_" + pageNumber.ToString() + ".png";
img.Save(outputFileName, ImageFormat.Png);
if (!fileList.Contains(outputFileName))
{
fileList.Add(outputFileName);
}
}
}
}
Note that if I call img.Dispose() at the end of the for loop, I get the same error again!
My best guess is that my issue is not a GhostScript or GhostScript.NET issue. Am I being a bonehead for insisting on blindly using "using" statements if the class implements IDisposable? I've always understood that it's best practice to wrap anything that implements IDisposable with a using statement to forgo leaks, etc.
Hence, my question: Any ideas why I get the "Parameter is invalid" exception when I initialize the System.Drawing.Image class within the using statement but not when I don't? I'd love to understand this more.
Better yet, if anyone knows how I can get this functionality and also ensure I'm properly disposing my object, that would be the best.
I didn't find much about this particular topic when I searched for information. I did find one other StackOverflow post about someone using a graphic object in a using statement with the same error. I wonder if there is a relationship. I also note that I should be using Dispose(), but that appears to be causing the problem, and I need this to work.
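For reference, a using statement is shorthand for a try/finally that calls Dispose() when the block exits, which is consistent with the observation above that calling img.Dispose() by hand at the end of the loop reproduces the same error. A minimal sketch of that equivalence, using a hypothetical TrackedResource class (a stand-in, not a Ghostscript.NET or System.Drawing type):

```csharp
using System;

// Hypothetical stand-in for any IDisposable resource.
public sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        Disposed = true;
    }
}

public static class UsingDemo
{
    // Disposes the resource via a using statement.
    public static bool DisposeWithUsing()
    {
        var resource = new TrackedResource();
        using (resource)
        {
            // Work with the resource here.
        }
        return resource.Disposed;
    }

    // Hand-written equivalent: Dispose() runs in a finally block.
    public static bool DisposeWithTryFinally()
    {
        var resource = new TrackedResource();
        try
        {
            // Work with the resource here.
        }
        finally
        {
            resource.Dispose();
        }
        return resource.Disposed;
    }
}
```

Both methods return true, so switching between the two forms should not by itself change when the image is released.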
FYI, for anyone interested, the actual error occurs here in GhostscriptInterpreter.cs in the Ghostscript.NET code:
Method: public void Run(string str)
str is "Page pdfshowpage_init pdfshowpage_finish"
// GSAPI: run the string
int rc_run = _gs.gsapi_run_string(_gs_instance, str, 0, out exit_code);
I found the root cause of my failure at least. My GhostscriptRasterizer object had a value of '0' set for the height points and width points.
var rasterizer = new GhostscriptRasterizer();
rasterizer.CustomSwitches.Add("-dDEVICEWIDTHPOINTS=" + widthPoints);
rasterizer.CustomSwitches.Add("-dDEVICEHEIGHTPOINTS=" + heightPoints);
Once I set both height and width to a valid non-zero value, the issue got fixed.
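One way to catch the zero-size case early is to validate the values before they are turned into switches. This is a minimal sketch, assuming the sizes are whole numbers of points; the PageSizeSwitches helper is hypothetical and only builds the two strings shown above:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class PageSizeSwitches
{
    // Builds the two device-size switches, rejecting zero or negative values up front.
    public static List<string> Build(int widthPoints, int heightPoints)
    {
        if (widthPoints <= 0)
            throw new ArgumentOutOfRangeException(nameof(widthPoints), "Width must be greater than zero.");
        if (heightPoints <= 0)
            throw new ArgumentOutOfRangeException(nameof(heightPoints), "Height must be greater than zero.");

        return new List<string>
        {
            "-dDEVICEWIDTHPOINTS=" + widthPoints.ToString(CultureInfo.InvariantCulture),
            "-dDEVICEHEIGHTPOINTS=" + heightPoints.ToString(CultureInfo.InvariantCulture)
        };
    }
}
```

Each returned string can then be passed to rasterizer.CustomSwitches.Add(...). For example, Build(612, 792) corresponds to a US Letter page (8.5 x 11 inches at 72 points per inch).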
I have an existing program that processes a .pdf file and splits it into multiple .pdf files based on barcodes it finds on the pages.
The program uses ImageMagick and C#.
I want to change it from outputting pdfs to outputting tifs. Look for the comment in the code below for where I would guess the change would be made.
I included the ImageMagick tag because someone might offer a commandline option that someone else can help me convert to C#.
private void BurstPdf(string bigPdfName, string targetfolder)
{
bool outputPdf = true; // change to false to output tif.
string outputExtension = "";
var settings = new MagickReadSettings { Density = new Density(200) };
string barcodePng = Path.Combine(@"C:\TEMP", "tmp.png");
using (MagickImageCollection pdfPageCollection = new MagickImageCollection())
{
pdfPageCollection.Read(bigPdfName, settings);
int inputPageCount = 0;
int outputPageCount = 0;
int outputFileCount = 0;
MagickImageCollection resultCollection = new MagickImageCollection();
string barcode = "";
string resultName = "";
IBarcodeReader reader = new BarcodeReader();
reader.Options.PossibleFormats = new List<BarcodeFormat>();
reader.Options.PossibleFormats.Add(BarcodeFormat.CODE_39);
reader.Options.TryHarder = false;
foreach (MagickImage pdfPage in pdfPageCollection)
{
MagickGeometry barcodeArea = getBarCodeArea(pdfPage);
IMagickImage barcodeImg = pdfPage.Clone();
barcodeImg.ColorType = ColorType.Bilevel;
barcodeImg.Depth = 1;
barcodeImg.Alpha(AlphaOption.Off);
barcodeImg.Crop(barcodeArea);
barcodeImg.Write(barcodePng);
inputPageCount++;
using (var barcodeBitmap = new Bitmap(barcodePng))
{
var result = reader.Decode(barcodeBitmap);
if (result != null)
{
// found a first page because it has bar code.
if (result.BarcodeFormat.ToString() == "CODE_39")
{
if (outputFileCount != 0)
{
// write out previous pages.
if (outputPdf) {
outputExtension = ".pdf";
} else {
// What do I put here to output a g4 compressed tif?
outputExtension = ".tif";
}
resultName = string.Format("{0:D4}", outputFileCount) + "-" + outputPageCount.ToString() + "-" + barcode + outputExtension;
resultCollection.Write(Path.Combine(targetfolder, resultName));
resultCollection = new MagickImageCollection();
}
barcode = standardizePhysicalBarCode(result.Text);
outputFileCount++;
resultCollection.Add(pdfPage);
outputPageCount = 1;
}
else
{
Console.WriteLine("WARNING barcode is not of type CODE_39 so something is wrong. check page " + inputPageCount + " of " + bigPdfName);
if (inputPageCount == 1)
{
throw new Exception("barcode not found on page 1. see " + barcodePng);
}
resultCollection.Add(pdfPage);
outputPageCount++;
}
}
else
{
if (inputPageCount == 1)
{
throw new Exception("barcode not found on page 1. see " + barcodePng);
}
resultCollection.Add(pdfPage);
outputPageCount++;
}
}
if (File.Exists(barcodePng))
{
File.Delete(barcodePng);
}
}
if (resultCollection.Count > 0)
{
if (outputPdf) {
outputExtension = ".pdf";
} else {
// What do I put here to output a g4 compressed tif?
outputExtension = ".tif";
}
resultName = string.Format("{0:D4}", outputFileCount) + "-" + outputPageCount.ToString() + "-" + barcode + outputExtension;
resultCollection.Write(Path.Combine(targetfolder, resultName));
outputFileCount++;
}
}
}
[EDIT] The above code is what I am using (with some untested modifications) to split a .pdf into other .pdfs. I want to know how to modify this code to output tiffs. I put a comment in the code where I think the change would go.
[EDIT] So, encouraged by @fmw42, I just ran the code with the .tif extension enabled. It looks like it did convert to a .tif, but the tif is not compressed. I am surprised that IM just configures the output based on the extension of the file name. Handy I guess, but it seems a little loose.
[EDIT] I figured it out. Although it is counter-intuitive, one sets the compression on the read of the file. I am reading a .pdf, but I set the compression to Group4 like this:
var settings = new MagickReadSettings { Density = new Density(200), Compression = CompressionMethod.Group4 };
The thing I learned was that simply naming the output file .tif tells IM to output a tif. That is a handy way to do it, but it just seems sloppy.
I'm working on an app that uses Bing's API to search and download images.
Bing's API provides a set of image links and I iterate over them and download each one.
The problem that I'm having is that sometimes the downloaded file size is 0Kb.
I assume that happens because WebClient first creates the filename and then tries to write to it. So when it can't write to it for some reason this happens. The problem is that it happens without throwing an exception so my 'Catch' statement can't catch this and delete the file.
public void imageFetcher(string performerName, int maxNumberOfImages, RichTextBox richTextBox)
{
string performersDirPath = Environment.CurrentDirectory + @"\Performers\";
string performerPath = performersDirPath + performerName + @"\";
if (!Directory.Exists(performersDirPath))
{
Directory.CreateDirectory(performersDirPath);
}
if (!Directory.Exists(performerPath))
{
Directory.CreateDirectory(performerPath);
}
// Searching for Images using bing api
IEnumerable<Bing.ImageResult> bingSearch = bingImageSearch(performerName);
int i = 0;
foreach (var result in bingSearch)
{
downloadImage(result.MediaUrl, performerPath + performerName + i + ".jpg",richTextBox);
i++;
if (i == maxNumberOfImages)
{
break;
}
}
}
The download method:
public void downloadImage(string imgUrl, string saveDestination, RichTextBox richTextBox)
{
if (File.Exists(saveDestination))
{
richTextBox.ForeColor = System.Drawing.Color.Red;
richTextBox.AppendText("The File: " + saveDestination + "Already exists");
}
else
{
try
{
using (WebClient client = new WebClient())
{
client.DownloadFileCompleted += new AsyncCompletedEventHandler(((sender, e) => downloadFinished(sender, e, saveDestination , richTextBox)));
Uri imgURI = new Uri(imgUrl, UriKind.Absolute);
client.DownloadFileAsync(imgURI, saveDestination);
}
}
catch (Exception e)
{
richTextBox.AppendText("There was an exception downloading the file" + imgUrl);
richTextBox.AppendText("Deleteing" + saveDestination);
File.Delete(saveDestination);
richTextBox.AppendText("File deleted!");
}
}
}
This happens also when I try to wait for the client to finish using:
client.DownloadFileAsync(imgURI, saveDestination);
while (client.IsBusy)
{
}
Can anyone please tell me what I'm doing wrong?
In other similar questions the solution was to keep the WebClient instance open until the download is finished. I'm doing this with this loop:
while (client.IsBusy){}
Yet the results are the same.
Update:
I resorted to not using WebClient; instead I used this code:
try
{
byte[] lnBuffer;
byte[] lnFile;
using (BinaryReader lxBR = new BinaryReader(stream))
{
using (MemoryStream lxMS = new MemoryStream())
{
lnBuffer = lxBR.ReadBytes(1024);
while (lnBuffer.Length > 0)
{
lxMS.Write(lnBuffer, 0, lnBuffer.Length);
lnBuffer = lxBR.ReadBytes(1024);
}
lnFile = new byte[(int)lxMS.Length];
lxMS.Position = 0;
lxMS.Read(lnFile, 0, lnFile.Length);
}
using (System.IO.FileStream lxFS = new FileStream(saveDestination, FileMode.Create))
{
lxFS.Write(lnFile, 0, lnFile.Length);
}
This solves the problem almost completely. There are still one or two 0 KB files, but I assume that's because of network errors.
To see possible exceptions - try changing DownloadFileAsync to just DownloadFile - my problem was "Can not create SSL/TLS secure channel". Hope this will help someone.
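With DownloadFileAsync, a failed download is generally not thrown at the call site, so a try/catch around the call will not see it. The failure is reported through the Error and Cancelled properties of the AsyncCompletedEventArgs passed to DownloadFileCompleted. A minimal sketch of a completion check that removes empty or failed files follows; the DownloadCleanup helper and its name are assumptions for illustration:

```csharp
using System;
using System.ComponentModel;
using System.IO;

public static class DownloadCleanup
{
    // Returns true when the download succeeded and left a non-empty file.
    // Otherwise deletes whatever was written and returns false.
    public static bool HandleCompleted(AsyncCompletedEventArgs e, string saveDestination)
    {
        bool failed = e.Cancelled || e.Error != null;

        var info = new FileInfo(saveDestination);
        bool empty = !info.Exists || info.Length == 0;

        if (failed || empty)
        {
            if (info.Exists)
            {
                info.Delete();
            }
            return false;
        }
        return true;
    }
}
```

It could be wired up as client.DownloadFileCompleted += (s, e) => DownloadCleanup.HandleCompleted(e, saveDestination); and e.Error.Message can be logged inside the handler to see the underlying reason, such as the SSL/TLS message mentioned above.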
I have to merge a large (but undefined) number of PDFs into a single PDF. For this I'm using the PDFsharp code below.
// Get some file names
string[] files = filesToPrint.ToArray();
// Open the output document
PdfDocument outputDocument = new PdfDocument();
PdfPage newPage;
int nProcessedFile = 0;
int nMemoryFile = 5;
int nStepConverted = 0;
String sNameLastCombineFile = "";
// Iterate files
foreach (string file in files)
{
// Open the document to import pages from it.
PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);
// Iterate pages
int count = inputDocument.PageCount;
for (int idx = 0; idx < count; idx++)
{
// Get the page from the external document...
PdfPage page = inputDocument.Pages[idx];
// ...and add it to the output document.
outputDocument.AddPage(page);
}
nProcessedFile++;
if (nProcessedFile >= nMemoryFile)
{
//nProcessedFile = 0;
//nStepConverted++;
//sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";
//outputDocument.Save(sNameLastCombineFile);
//outputDocument.Close();
}
}
// Save the document...
const string filename = "ConcatenatedDocument1_tempfile.pdf";
outputDocument.Save(filename);
// ...and start a viewer.
Process.Start(filename);
The code works for small numbers of files, but at some point it throws an out-of-memory exception.
Is there a solution?
P.S. I was thinking of saving the file in steps and then adding the remaining files, so as to free memory, but I can't find a way to do it.
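The save-in-steps idea can be sketched independently of PDFsharp: split the file list into fixed-size batches, merge each batch into its own temporary document, then merge the temporary documents. The FileBatcher helper below is hypothetical and only does the partitioning. Whether batching actually lowers peak memory with PDFsharp is not established here.

```csharp
using System;
using System.Collections.Generic;

public static class FileBatcher
{
    // Splits the input into consecutive batches of at most batchSize items.
    public static List<List<string>> Split(IList<string> files, int batchSize)
    {
        if (files == null) throw new ArgumentNullException(nameof(files));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batches = new List<List<string>>();
        for (int start = 0; start < files.Count; start += batchSize)
        {
            int count = Math.Min(batchSize, files.Count - start);
            var batch = new List<string>(count);
            for (int i = 0; i < count; i++)
            {
                batch.Add(files[start + i]);
            }
            batches.Add(batch);
        }
        return batches;
    }
}
```

With a batch size of 5 (the value of nMemoryFile in the question), twelve input files would produce batches of 5, 5 and 2.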
UPDATE1:
if (nProcessedFile >= nMemoryFile)
{
nProcessedFile = 0;
//nStepConverted++;
sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";
outputDocument.Save(sNameLastCombineFile);
outputDocument.Close();
outputDocument = PdfReader.Open(sNameLastCombineFile,PdfDocumentOpenMode.Modify);
}
UPDATE 2 (version 1.32)
Complete example
Error on line:
PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);
Text error:
Cannot handle iref streams. The current implementation of PDFsharp cannot handle this PDF feature introduced with Acrobat 6.
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
List<String> filesToPrint = new List<string>();
filesToPrint = Directory.GetFiles(@"D:\Downloads\RACCOLTA\FILE PDF", "*.pdf").ToList();
// Get some file names
string[] files = filesToPrint.ToArray();
// Open the output document
PdfDocument outputDocument = new PdfDocument();
PdfPage newPage;
int nProcessedFile = 0;
int nMemoryFile = 5;
int nStepConverted = 0;
String sNameLastCombineFile = "";
try
{
// Iterate files
foreach (string file in files)
{
// Open the document to import pages from it.
PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);
// Iterate pages
int count = inputDocument.PageCount;
for (int idx = 0; idx < count; idx++)
{
// Get the page from the external document...
PdfPage page = inputDocument.Pages[idx];
// ...and add it to the output document.
outputDocument.AddPage(page);
}
nProcessedFile++;
if (nProcessedFile >= nMemoryFile)
{
nProcessedFile = 0;
//nStepConverted++;
sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";
outputDocument.Save(sNameLastCombineFile);
outputDocument.Close();
inputDocument = PdfReader.Open(sNameLastCombineFile , PdfDocumentOpenMode.Modify);
}
}
// Save the document...
const string filename = "ConcatenatedDocument1_tempfile.pdf";
outputDocument.Save(filename);
// ...and start a viewer.
Process.Start(filename);
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
Console.ReadKey();
}
}
}
}
UPDATE3
Code that generates the out-of-memory exception:
int count = inputDocument.PageCount;
for (int idx = 0; idx < count; idx++)
{
// Get the page from the external document...
newPage = inputDocument.Pages[idx];
// ...and add it to the output document.
outputDocument.AddPage(newPage);
newPage.Close();
}
I can't tell exactly which line throws the exception.
I had a similar issue; saving, closing and reopening the PdfDocument did not really help.
I am adding a lot (100+) of large (up to 5 MB) images (tiff, jpg, etc.) to a PDF document where every image has its own page. It crashed around image #50. After the save-close-reopen it did finish the whole document but was still getting close to maximum memory, around 3 GB. A few more images and it would still crash.
After more refining, I implemented a using for the XGraphics object, it was a little better again but not much.
The big step forward was disposing of the XImage within the loop! After that the application never used more than 100-200Kb, I removed the save-close-reopen for the PdfDocument and it was no problem.
After saving and closing outputDocument (the code is commented out in your snippet), you have to open outputDocument again, using PdfDocumentOpenMode.Modify.
It could help to add using(...) for the inputDocument.
If your code is running as a 32-bit process, then switching to 64 bit will allow your process to use more than 2 GB of RAM (assuming your computer has more than 2 GB RAM).
Update: The message "Cannot handle iref streams" means you have to use PDFsharp 1.50 Prerelease, available on NuGet.
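To check the 32-bit versus 64-bit point above, the process can report its own bitness at startup. A minimal sketch using Environment.Is64BitProcess (available since .NET Framework 4.0); the BitnessCheck class name is just for illustration:

```csharp
using System;

public static class BitnessCheck
{
    // Describes whether the current process runs as 32-bit or 64-bit.
    public static string Describe()
    {
        return Environment.Is64BitProcess ? "64-bit process" : "32-bit process";
    }
}
```

Calling Console.WriteLine(BitnessCheck.Describe()) at the top of Main shows which mode is in effect. If it reports 32-bit on a 64-bit machine, the project's platform target and its "Prefer 32-bit" build setting are the likely places to look.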
iTextSharp - How to convert PdfPTable to JPEG or other image format?
I believe iTextSharp does not currently support rendering PDF into image files. Ghostscript supports converting PDF files to images. There is a good tutorial here available to convert PDF files to images. Also you can use rendering object like this one.
iTextSharp is only for creating PDF Documents.
There are many other DLLs that can be used to convert PDF to JPG. The most widely preferred is Ghostscript (GS). You can use the following C# code with the GS DLL:
public static void PdfToJpg(string input, string output)
{
PdfToImage.PDFConvert pp = new PDFConvert();
pp.OutputFormat = "jpeg"; //format
pp.JPEGQuality = 100; //100% quality
pp.ResolutionX = 300; //dpi
pp.ResolutionY = 300;
pp.FirstPageToConvert = 1; //pages you want
pp.LastPageToConvert = 1;
pp.Convert(input , output );
}
namespace PdfToJpeg
{
public partial class Form1 : Form
{
PDFConvert converter = new PDFConvert();
public Form1()
{
InitializeComponent();
// Run the conversion once the form has been initialised.
try
{
// Verbatim strings keep the backslashes in the paths literal.
PdfToJpg(@"c:\abc.pdf", @"c:\output.jpg");
MessageBox.Show("Files Converted");
}
catch (Exception ex)
{
MessageBox.Show("Exception Error Occurred... " + ex.Message);
}
}
}
}
I have developed an application that loads many images into a ListView using an ImageList in C# (.NET Framework 4). The images are also compressed. When a large number of images are loaded and compressed, it takes a long time, so I call the method in a BackgroundWorker. In the BackgroundWorker I have to add images to the ImageList and add the ImageList to the ListView, so I use a SafeInvoke() method: listView1.SafeInvoke(d=>d.Items.Add(item)).
Everything works fine. Images are displayed one by one in the listview.
But the release build of the application works properly on some PCs and not on others. By "doesn't work properly" I mean: if 100 images are selected through an OpenFileDialog, some of them are loaded and added to the ListView, and then loading stops on its own without all images having been added and without any exception being shown.
I have spent a lot of time trying to solve this but couldn't figure it out. Where is the problem? Can anybody help me?
private void bgwLoading_DoWork(object sender, DoWorkEventArgs e)
{
ArrayList a = (ArrayList)e.Argument;
string[] fileNames = (string[])a[0];
this.loadMultiImages(fileNames);
}
private void loadMultiImages(string[] fileNames)
{
int i = 1;
int totalFiles = fileNames.Count();
foreach (string flName in fileNames)
{
if (!flName.Contains("Thumbs.db"))
{
Bitmap newBtmap = (Bitmap)Image.FromFile(flName);
FileInfo fi = new FileInfo(flName);
long l = fi.Length;
if (l > compressSize)
{
newBtmap = resizeImage(newBtmap, 1024,768) ;
newBtmap = saveJpeg(IMAGE_PATH + (SCANNING_NUMBER + 1) + ".jpg", newBtmap, IMAGE_QUALITY);
}
else
{
File.Copy(flName, TEMP_IMAGE_PATH + (SCANNING_NUMBER + 1) + ".jpg");
}
if (!bgwLoading.CancellationPending)
{
CommonInformation.SCANNING_NUMBER++;
this.SafeInvoke(d => d.addItemToLvImageContainer(newBtmap));
bgwLoading.ReportProgress((int)Math.Round((double)i / (double)
(totalFiles) * 100));
i++;
}
}
}
}
}
public void addItemToLvImageContainer(Bitmap newBtmap)
{
imageList.Images.Add(newBtmap);
ListViewItem item;
item = new ListViewItem();
item.ImageIndex = SCANNING_NUMBER - 1;
item.Text = SCANNING_NUMBER.ToString();
lvImageContainer.Items.Add(item);
lvImageContainer.Items[item.ImageIndex].Focused = true;
}
To find out the error I have modified the code as follows:
I have commented the two lines
//newBtmap = resizeImage(newBtmap, 1024, 768);
// newBtmap = saveJpeg(IMAGE_PATH + scanning_number + ".jpg", newBtmap, Image_Quality );
and added try-catch as follows:
try
{
Bitmap newBtmap = (Bitmap)Image.FromFile(flName);
File.Copy(flName, CommonInformation.TEMP_IMAGE_PATH +
(CommonInformation.SCANNING_NUMBER + 1) + ".jpg");
if (!bgwLoading.CancellationPending)
{
this.SafeInvoke(d => d.imageList.Images.Add(newBtmap));
ListViewItem item;
item = new ListViewItem();
CommonInformation.SCANNING_NUMBER++;
item.ImageIndex = CommonInformation.SCANNING_NUMBER - 1;
item.Text = CommonInformation.SCANNING_NUMBER.ToString();
this.SafeInvoke(d => d.lvImageContainer.Items.Add(item));
bgwLoading.ReportProgress((int)Math.Round((double)i /
(double)(totalFiles) * 100));
this.safeInvoke(d=>d.addItemImageContainer(newBtmap));
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
It shows the error message after loading some images as "OutOfMemoryException"
Most probably the following line creates the exception:
Bitmap newBtmap = (Bitmap)Image.FromFile(flName);
But the image files are not corrupted and their file extension is .JPG.
How to get rid of this problem?
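Image.FromFile is documented to throw OutOfMemoryException when a file is not in a valid image format (or GDI+ does not support its pixel format), so the exception does not necessarily mean memory ran out. One cheap way to tell the two cases apart is to check whether each file really starts with the JPEG start-of-image marker (bytes FF D8) before loading it. A minimal sketch; the JpegSniffer helper is hypothetical:

```csharp
using System.IO;

public static class JpegSniffer
{
    // Returns true when the file starts with the JPEG start-of-image marker (FF D8).
    public static bool LooksLikeJpeg(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            int first = stream.ReadByte();
            int second = stream.ReadByte();
            return first == 0xFF && second == 0xD8;
        }
    }
}
```

If every file passes this check and the exception still appears only after many images, real memory pressure from Bitmap objects that are never disposed becomes the more likely explanation.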
I have no answer, but I have some suggestions:
Check the .NET Framework version on the computers that have the problem.
Check whether you have permissions for the files you are trying to read.
Use try-catch when you access files.
And some questions:
Was this project written in an older version of .NET and migrated/upgraded to .NET 4.0?
Are you using any non-built-in assemblies or external DLLs for image processing?