I have a PDF file with some images, which I want to replace with some other PDF. The code goes through the pdf and gets the image references:
PdfDocument pdf = new PdfDocument(new PdfReader(args[0]), new PdfWriter(args[1]));
for(int i=1; i<=pdf.GetNumberOfPages(); ++i)
{
PdfDictionary pageDict = pdf.GetPage(i).GetPdfObject();
PdfDictionary resources = pageDict.GetAsDictionary(PdfName.Resources);
PdfDictionary xObjects = resources.GetAsDictionary(PdfName.XObject);
foreach (PdfName imgRef in xObjects.KeySet())
{
// image reference
}
}
For each image I have a corresponding PDF that I would like to replace the image with. What I tried is to put the other PDF (which is always a single page) in place of the image XObject:
PdfDocument other = new PdfDocument(new PdfReader("replacement.pdf"));
xObjects.Put(imgRef, other.GetFirstPage().GetPdfObject().Clone());
But while closing the PdfDocument an exception is thrown:
iText.Kernel.PdfException: 'Pdf indirect object belongs to other PDF document. Copy object to current pdf document.'
How can I replace the image with (the content of) another PDF?
Update
I also tried a few other approaches, which may have improved the result. To get past the previous error message, I copy the page into the original PDF:
var page = other.GetFirstPage().CopyTo(pdf);
However, replacing the xObject doesn't work:
xObjects.Put(imgRef, page.GetPdfObject());
This results in a corrupted PDF.
To just copy the original page into another document to be used as an image replacement, you can use PdfPage#CopyAsFormXObject.
So let's assume we have this PDF as a template and we want to replace the image of a desert with the contents of another PDF:
Let's also assume the PDF that we want to use as a replacement looks as follows:
The issue is that if we blindly replace the original image with the contents of the PDF, chances are we will get something like this:
So we will get a feeling that everything worked well while we still have a bad visual result. The issue is that coordinates work a bit differently for plain raster images and vector XObjects (PDF replacements). So we also need to adjust the transformation matrix (/Matrix key) of our newly created XObject.
So the code could look like this:
PdfDocument pdf = new PdfDocument(new PdfReader(@"template.pdf"), new PdfWriter(@"out.pdf"));
for(int i=1; i<=pdf.GetNumberOfPages(); ++i) {
PdfDictionary pageDict = pdf.GetPage(i).GetPdfObject();
PdfDictionary resources = pageDict.GetAsDictionary(PdfName.Resources);
PdfDictionary xObjects = resources.GetAsDictionary(PdfName.XObject);
IDictionary<PdfName, PdfStream> toReplace = new Dictionary<PdfName, PdfStream>();
foreach (PdfName imgRef in xObjects.KeySet()) {
PdfStream previousXobject = xObjects.GetAsStream(imgRef);
PdfDocument imageReplacementDoc =
new PdfDocument(new PdfReader(@"insert.pdf"));
PdfXObject imageReplacement = imageReplacementDoc.GetPage(1).CopyAsFormXObject(pdf);
toReplace[imgRef] = imageReplacement.GetPdfObject();
adjustXObjectSize(imageReplacement);
imageReplacementDoc.Close();
}
foreach (var x in toReplace) {
xObjects.Put(x.Key, x.Value);
}
}
pdf.Close();
UPD: Implementation of adjustXObjectSize (thanks mkl):
private void adjustXObjectSize(PdfXObject pageXObject) {
float scaleXobject = 1 / Math.Max(pageXObject.GetWidth(), pageXObject.GetHeight());
AffineTransform transform = new AffineTransform();
transform.Scale(scaleXobject, scaleXobject);
float[] matrix = new float[6];
transform.GetMatrix(matrix);
pageXObject.GetPdfObject().Put(PdfName.Matrix, new PdfArray(matrix));
}
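The arithmetic in adjustXObjectSize can be checked without iText. An image XObject is painted into a 1 x 1 unit square that the page content then scales to the image's final size, whereas a form XObject is painted in its own page-sized coordinates, so it has to be shrunk to fit into that unit square. The following is a minimal stand-alone Java sketch of that calculation; the class and method names are hypothetical and not part of iText.

```java
public class XObjectScale {
    // Builds the six-number PDF matrix [a b c d e f] for a uniform scale
    // that maps the larger side of a width x height form XObject to 1 unit.
    // Hypothetical helper for illustration; it mirrors the s = 1 / max(w, h)
    // calculation used in adjustXObjectSize.
    public static float[] unitScaleMatrix(float width, float height) {
        float s = 1f / Math.max(width, height);
        return new float[] { s, 0f, 0f, s, 0f, 0f };
    }

    public static void main(String[] args) {
        // Example input: a US Letter replacement page, 612 x 792 points.
        float[] m = unitScaleMatrix(612f, 792f);
        System.out.println(java.util.Arrays.toString(m));
        // Scaled size of the replacement inside the unit square.
        System.out.println((612f * m[0]) + " x " + (792f * m[3]));
    }
}
```

Because the scale is uniform, a non-square replacement fills the unit square along its longer side only; its aspect ratio within that square is preserved.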
And the visual result after running the above code on the samples I described would look like this:
I convert a file to PDF using iTextSharp by creating a new Document and inserting an image into it. How can I carry the width and height of the original image over to the new PDF? (The image is compressed, so its original size is lost.)
In the new Document you can set the title, keywords, and so on, but I need somewhere to store the original width and height so that the original image size can be restored when the PDF is read back. Each page has its own data.
This solution helped. Info contains metadata.
using (var writer = PdfWriter.GetInstance(doc, fs))
{
PdfDictionary info = writer.Info;
PdfName newData = new PdfName($"NewData");
info.Put(newData, new PdfString("any string data"));
}
I'm trying to convert secured PDFs to XPS and back to PDF using FreeSpire and then combine them using iTextSharp. Below is my code snippet for converting various files.
char[] delimiter = { '\\' };
string WorkDir = @"C:\Users\*******\Desktop\PDF\Test";
Directory.SetCurrentDirectory(WorkDir);
string[] SubWorkDir = Directory.GetDirectories(WorkDir);
//convert items to PDF
foreach (string subdir in SubWorkDir)
{
string[] samplelist = Directory.GetFiles(subdir);
for (int f = 0; f < samplelist.Length - 1; f++)
{
if (samplelist[f].EndsWith(".doc") || samplelist[f].EndsWith(".DOC"))
{
Spire.Pdf.PdfDocument doc = new Spire.Pdf.PdfDocument();
doc.LoadFromFile(samplelist[f], FileFormat.DOC);
doc.SaveToFile((Path.ChangeExtension(samplelist[f],".pdf")), FileFormat.PDF);
doc.Close();
}
. //other extension cases
.
.
else if (samplelist[f].EndsWith(".pdf") || samplelist[f].EndsWith(".PDF"))
{
PdfReader reader = new PdfReader(samplelist[f]);
bool PDFCheck = reader.IsOpenedWithFullPermissions;
reader.Close();
if (PDFCheck)
{
Console.WriteLine("{0}\\Full Permissions", samplelist[f]);
reader.Close();
}
else
{
Console.WriteLine("{0}\\Secured", samplelist[f]);
Spire.Pdf.PdfDocument doc = new Spire.Pdf.PdfDocument();
string path = samplelist[f];
doc.LoadFromFile(samplelist[f]);
doc.SaveToFile((Path.ChangeExtension(samplelist[f], ".xps")), FileFormat.XPS);
doc.Close();
Spire.Pdf.PdfDocument doc2 = new Spire.Pdf.PdfDocument();
doc2.LoadFromFile((Path.ChangeExtension(samplelist[f], ".xps")), FileFormat.XPS);
doc2.SaveToFile(samplelist[f], FileFormat.PDF);
doc2.Close();
}
The issue is that I get a "Value cannot be null" error at doc.LoadFromFile(samplelist[f]);. I added string path = samplelist[f]; to check whether samplelist[f] was empty, but it was not. I also tried replacing the samplelist[f] argument with the path variable, but that does not go through either. I tested the PDF conversion on a smaller scale and it worked (see below):
string PDFDoc = @"C:\Users\****\Desktop\Test\Test\Test.PDF";
string XPSDoc = @"C:\Users\****\Desktop\Test\Test\Test.xps";
//Convert PDF file to XPS file
PdfDocument doc = new PdfDocument();
doc.LoadFromFile(PDFDoc);
doc.SaveToFile(XPSDoc, FileFormat.XPS);
doc.Close();
//Convert XPS file to PDF
PdfDocument doc2 = new PdfDocument();
doc2.LoadFromFile(XPSDoc, FileFormat.XPS);
doc2.SaveToFile(PDFDoc, FileFormat.PDF);
doc2.Close();
I would like to understand why I am getting this error and how to fix it.
There are two possible solutions to the problem you are facing.
Load the file into a Document object instead of a PdfDocument, and then call SaveToFile, something like this:
Document document = new Document();
//Load a Document in document Object
document.SaveToFile("Sample.pdf", FileFormat.PDF);
Alternatively, you can load the file from a Stream, something like this:
PdfDocument doc = new PdfDocument();
//Load PDF file from stream.
FileStream from_stream = File.OpenRead(samplelist[f]);
//Make sure samplelist[f] is the complete path of the file, including the extension.
doc.LoadFromStream(from_stream);
//Save the PDF document.
doc.SaveToFile(samplelist[f] + ".pdf", FileFormat.PDF);
The second approach is the easier one, but I would recommend the first, because a Document gives better conversion fidelity than a stream: the Document carries the sections, paragraphs, page setup, text, and fonts that are needed to reproduce the formatting accurately.
I'm at the last step in completing a PDF generator. I am using iTextSharp and I am able to stamp a base64 image with no problem, thanks to help from StackOverflow.
My question is: how would I iterate over the posted files and add a new page with the posted image files on it? Here is my current way of stamping an image; however, it's coming from base64. I need to add the uploaded images selected in my application to the PDF, preferably while the stamper is open. I just can't seem to make my code work.
I feel this should be easy to iterate through, but I can't get the logic right. Please help:
PdfContentByte pdfContentByte = stamper.GetOverContent(1);
PdfContentByte pdfContentByte2 = stamper.GetOverContent(4);
var image = iTextSharp.text.Image.GetInstance(
Convert.FromBase64String(match.Groups["data"].Value)
);
image.SetAbsolutePosition(270, 90);
image.ScaleToFit(250f, 100f);
pdfContentByte.AddImage(image);
//stamping base64 image works perfect - now i need to stamp the uploaded images onto a new page in the same document before stamper closes.
var imagepath = "//test//";
HttpFileCollection uploadFilCol = HttpContext.Current.Request.Files;
for (int i = 0; i < uploadFilCol.Count; i++)
{
HttpPostedFile file = uploadFilCol[i];
using (FileStream fs = new FileStream(imagepath + "Invoice-" +
HttpContext.Current.Request.Form.Get("genUUID") + file, FileMode.Open))
{
HttpPostedFile file = uploadFilCol[i];
pdfContentByte2.AddImage(file);
}
}
My posted files come from an input form on an HTML page:
<input type="file" id="file" name="files[]" runat="server" multiple />
The basic steps:
Iterate over the HttpFileCollection.
Read each HttpPostedFile into a byte array.
Create iText Image with byte array in previous step.
Set the image absolute position, and optionally scale as needed.
Add image at specified page number with GetOverContent()
A quick snippet to get you started. Not tested, and assumes you have PdfReader, Stream, and PdfStamper setup, along with a working file upload:
HttpFileCollection uploadFilCol = HttpContext.Current.Request.Files;
for (int i = 0; i < uploadFilCol.Count; i++)
{
HttpPostedFile postedFile = uploadFilCol[i];
using (var br = new BinaryReader(postedFile.InputStream))
{
var imageBytes = br.ReadBytes(postedFile.ContentLength);
var image = Image.GetInstance(imageBytes);
// still not sure if you want to add a new blank page, but
// here's how
//stamper.InsertPage(
// APPEND_NEW_PAGE_NUMBER, reader.GetPageSize(APPEND_NEW_PAGE_NUMBER - 1)
//);
// image absolute position
image.SetAbsolutePosition(absoluteX, absoluteY);
// scale image if needed
// image.ScaleAbsolute(...);
// PAGE_NUMBER => add image to specific page number
stamper.GetOverContent(PAGE_NUMBER).AddImage(image);
}
}
How can I remove page breaks from a pdf, so the output would be a single 'page' PDF? So if a normal page is 400x900 and I have 4 pages, a resulting file would be 1600x900. I previously did this for Tif files (Remove page breaks in multi-page tif to make one long page), but would like to do it with PDF. Could I possibly convert to ps, remove whatever code means 'page break', then convert back to pdf?
This can be done with the iTextSharp library by using a single-column PdfPTable and sizing the document dynamically according to the number of pages.
You'll of course need a few references to the iTextSharp DLL found here
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;
Here's a simple example:
public static void MergePages()
{
using (PdfReader reader = new PdfReader(@"C:\Users\cmilne\Desktop\AA0081913.pdf"))//Original PDF containing page breaks.
{
int pages = reader.NumberOfPages;
float postProcessPageHeight = 0;
float postProcessPageWidth = 0;
// Sum the page heights and track the widest page.
for (int p = 1; p <= pages; p++)
{
var size = reader.GetPageSize(p);
postProcessPageHeight += (size.Height);
if (size.Width > postProcessPageWidth)
postProcessPageWidth = (size.Width);
}
var rect = new Rectangle(postProcessPageWidth, postProcessPageHeight);
using (Document document = new Document(rect, 0, 0, 0, 0))
{
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(@"C:\Users\cmilne\Desktop\AA0081913_NEW.pdf", FileMode.Create)); //Declare location\name of new PDF not containing page breaks.
document.Open();
PdfImportedPage page;
PdfPTable table = new PdfPTable(1);
table.WidthPercentage = 100;
for (int i = 1; i <= pages; i++)
{
page = writer.GetImportedPage(reader, i);
table.AddCell(iTextSharp.text.Image.GetInstance(page));
}
document.Add(table);
document.Close();
}
}
}
The resulting page size must stay within 14400 by 14400 user units (this is all that iTextSharp allows). An 8 1/2 x 11 inch page at the default 72 units per inch is 792 units tall, which makes the maximum about 18 pages.
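The page-count estimate can be reproduced with a few lines of arithmetic. This is a minimal stand-alone Java sketch (no iText involved; the class and method names are hypothetical) that assumes pages are stacked vertically, as in the example above, and that the limit is the 14400 figure quoted there.

```java
public class MergedPageLimit {
    // Maximum page side in user units, as quoted in the text above.
    public static final float MAX_SIDE = 14400f;

    // How many pages of the same height can be stacked before the limit is hit.
    public static int maxStackedPages(float pageHeight) {
        return (int) Math.floor(MAX_SIDE / pageHeight);
    }

    // Checks whether a merged page (widest width, summed heights) stays within the limit.
    public static boolean fits(float[] widths, float[] heights) {
        float totalHeight = 0f;
        float maxWidth = 0f;
        for (int i = 0; i < heights.length; i++) {
            totalHeight += heights[i];
            maxWidth = Math.max(maxWidth, widths[i]);
        }
        return maxWidth <= MAX_SIDE && totalHeight <= MAX_SIDE;
    }

    public static void main(String[] args) {
        // US Letter: 8.5 x 11 inches at 72 units per inch = 612 x 792.
        System.out.println(maxStackedPages(792f));
    }
}
```

Running a check like fits before creating the Document lets the program fall back to several long pages instead of failing when the input is too large.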
Use the iTextSharp C# library. It gives you a lot of options to manipulate PDFs. I've used it before when I had to write an import application for a closed-source document repository. It worked like a charm. The only downside is that their documentation is kind of spotty because they want you to purchase their book. You can browse their Java API for free, though, since it's almost identical to the C# one, and just play around with it to find the C# version.
iText: http://itextpdf.com/
I'm using iText to generate a PDF document that consists of several copies of almost the same information.
E.g.: An invoice. One copy is given to the customer, another is filed and a third one is given to an accountant for book-keeping.
All the copies must be exactly the same except for a little piece of text that indicates who the copy is for (Customer, Accounting, File, ...).
There are two possible scenarios (I don't know if the solution is the same for both of them):
a) Each copy goes in a different page.
b) All the copies go on the same page (the paper will have perforations to separate the copies).
There will be a wrapper or helper class which uses iText to generate the PDF in order to be able to do something like var pdf = HelperClass.CreateDocument(DocumentInfo info);. The multiple-copies problem will be solved inside this wrapper/helper.
What does iText provide to accomplish this? Do I need to write each element in the document several times in different positions/pages? Or does iText provide some way to write one copy to the document and then copy it to another position/page?
Note: It's a .NET project, but I tagged the question with both java and c# because this question is about how to use iText properly, so the answer will help developers in both languages.
If each copy goes on a different page, you can create a new document and copy in the page multiple times. Using iText in Java you can do it like this:
// Create output PDF
Document document = new Document(PageSize.A4);
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
PdfContentByte cb = writer.getDirectContent();
// Load existing PDF
PdfReader reader = new PdfReader(templateInputStream);
PdfImportedPage page = writer.getImportedPage(reader, 1);
// Copy first page of existing PDF onto the first output page
document.newPage();
cb.addTemplate(page, 0, 0);
// Add your first piece of text here
document.add(new Paragraph("Customer"));
// Copy the same imported page onto the second output page
document.newPage();
cb.addTemplate(page, 0, 0);
// Add your second piece of text here
document.add(new Paragraph("Accounting"));
// etc...
document.close();
If you want to put all the copies on the same page, the code is similar but instead of using zeroes in addTemplate(page, 0, 0) you'll need to set values for the correct position; the numbers to use depend on the size and shape of your invoice.
See also iText - add content to existing PDF file — the above code is based on the code I wrote in that answer.
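For the same-page variant, the y values passed to addTemplate can be worked out up front. The following is a minimal stand-alone Java sketch, assuming each copy has a fixed height and the copies are stacked from the top of the page downward; the class and method names are hypothetical, and the sizes in main are only example values.

```java
public class CopyOffsets {
    // Returns the lower-left y offset of each copy. PDF coordinates grow
    // upward from the bottom of the page, so the first (topmost) copy
    // gets the largest offset.
    public static float[] stackedOffsets(float pageHeight, float copyHeight, int copies) {
        if (copies * copyHeight > pageHeight) {
            throw new IllegalArgumentException("copies do not fit on the page");
        }
        float[] ys = new float[copies];
        for (int i = 0; i < copies; i++) {
            ys[i] = pageHeight - (i + 1) * copyHeight;
        }
        return ys;
    }

    public static void main(String[] args) {
        // Example: three 280-point-high copies on an 842-point-high (A4) page.
        for (float y : stackedOffsets(842f, 280f, 3)) {
            System.out.println(y);
        }
    }
}
```

Each returned value would be used as the y argument, with x set to 0 or to whatever left margin the layout needs.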
Here's how I see this working.
PdfReader reader = new PdfReader( templatePDFPath );
Document doc = new Document();
PdfWriter writer = PdfWriter.getInstance( doc, new FileOutputStream( "blah.pdf" ) );
doc.open();
PdfImportedPage inputPage = writer.getImportedPage( reader, 1 );
PdfContentByte curPageContent = writer.getDirectContent();
String extraStuff[] = getExtraStuff();
for (String stuff : extraStuff) {
curPageContent.saveState();
curPageContent.addTemplate( inputPage, 0, 0 ); // use other offsets to reposition the copy
curPageContent.restoreState();
curPageContent.beginText();
curPageContent.setTextMatrix(x, y);
curPageContent.setFontAndSize( someFont, someSize );
// the actual work:
curPageContent.showText( stuff );
curPageContent.endText();
// save the contents of curPageContent out to the file and reset it for the next page.
doc.newPage();
}
doc.close();
That's the bare minimum of work on the computer's part. Quite Efficient, and it'll result in a smaller PDF. Rather than having N copies of that page, with tweaks, you have one copy of that page that's reused on N pages, with little tweaks on top.
You could do the same thing, and use the "x,y" parameters in addTemplate to draw them all on the same page. Up to you.
PS: you'll need to figure out the coordinates for setTextMatrix in advance.
You could also use PdfCopy or PdfSmartCopy to do this.
PdfReader reader = new PdfReader(@"Path\To\File");
Document doc = new Document();
// ms1 is assumed to be a MemoryStream created earlier
PdfCopy copier = new PdfCopy(doc, ms1);
//PdfSmartCopy copier = new PdfSmartCopy(doc, ms1);
copier.CloseStream = false;
doc.Open();
PdfImportedPage inputPage = copier.GetImportedPage(reader, 1);
for (int i = 0; i < count; i++)
{
copier.AddPage(inputPage);
}
doc.Close();
ms1.Flush();
ms1.Position = 0;
The difference between PdfCopy and PdfSmartCopy is that PdfCopy copies the entire PDF for each page, while PdfSmartCopy outputs a PDF that internally contains only one copy, which all pages reference. This results in a smaller file and less bandwidth on a network; however, PdfSmartCopy uses more memory on the server and takes longer to process.