ItextSharp Find text in pdf and highlight it

ItextSharp Find text in pdf and highlight it - c#

I am working on PDF functionality, I want to search text in PDF and highlight the found text in the PDF. For that I am using iTextsharp.
I did not get any solution yet, please provide me with a solution.
I have written following code;
public ActionResult Index1()
{
string outputFile = #"D:\Test.pdf";
using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
{
using (Document doc = new Document(PageSize.LETTER))
{
using (PdfWriter w = PdfWriter.GetInstance(doc, fs))
{
doc.Open();
doc.Add(new Paragraph("This is a test and sample pdf for test and wait for it"));
doc.Close();
}
}
}
List<int> pages = new List<int>();
PdfReader pdfReader = new PdfReader(outputFile);
for (int page = 1; page <= pdfReader.NumberOfPages; page++)
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentPageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
if (currentPageText.Contains("test"))
{
pages.Add(page);
}
}
pdfReader.Close();
//Create a new file from our test file with highlighting
string highLightFile = #"D:\Test1.pdf";
//Bind a reader and stamper to our test PDF
PdfReader reader = new PdfReader(outputFile);
using (FileStream fs = new FileStream(highLightFile, FileMode.Create, FileAccess.Write, FileShare.None))
{
using (PdfStamper stamper = new PdfStamper(reader, fs))
{
iTextSharp.text.Rectangle rect = new iTextSharp.text.Rectangle(60.6755f, 749.172f, 94.0195f, 735.3f);
float[] quad = { rect.Left, rect.Bottom, rect.Right, rect.Bottom, rect.Left, rect.Top, rect.Right, rect.Top };
PdfAnnotation highlight = PdfAnnotation.CreateMarkup(stamper.Writer, rect, null, PdfAnnotation.MARKUP_HIGHLIGHT, quad);
//Set the color
highlight.Color = BaseColor.YELLOW;
//Add the annotation
stamper.AddAnnotation(highlight, 1);
}
}
return View();
}
Above code creates one PDF (test1.pdf)
And in another PDF it highlights some text with hard-coded coordinates, I need to find the coordinates of some text in the PDF.
But I could not find the coordinates of the text I'm looking for.

In iText7, we have implemented this functionality.
It can be found in the class RegexBasedLocationExtractionStrategy.
I suggest you have a look to see how it was done, since this functionality was not backported to iText5.

Related

Converting multiple TIFF Images to PDF using iTextSharp

I’m using WebSite in ASP.NET and iTextSharp PDF library. I have a tiff document image that contains 3 pages, I would want to convert all those 3 tiff pages into 1 PDF file with 3 pages.
I try this but it doesn't work as well...
Please tell me what should I do?
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;
Document document = new Document();
using (var stream = new FileStream(#"C:\File\0.pdf", FileMode.Create, FileAccess.Write, FileShare.None))
{
PdfWriter.GetInstance(document, stream);
document.Open();
using (var imageStream = new FileStream(#"C:\File\0.tiff", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
var image = iTextSharp.text.Image.GetInstance(imageStream);
document.Add(image);
}
document.Close();
}

// creation of the document with a certain size and certain margins
iTextSharp.text.Document document = new iTextSharp.text.Document(iTextSharp.text.PageSize.A4, 0, 0, 0, 0);
// creation of the different writers
iTextSharp.text.pdf.PdfWriter writer = iTextSharp.text.pdf.PdfWriter.GetInstance(document, new System.IO.FileStream(Server.MapPath("~/App_Data/result.pdf"), System.IO.FileMode.Create));
// load the tiff image and count the total pages
System.Drawing.Bitmap bm = new System.Drawing.Bitmap(Server.MapPath("~/App_Data/source.tif"));
int total = bm.GetFrameCount(System.Drawing.Imaging.FrameDimension.Page);
document.Open();
iTextSharp.text.pdf.PdfContentByte cb = writer.DirectContent;
for (int k = 0; k < total; ++k)
{
bm.SelectActiveFrame(System.Drawing.Imaging.FrameDimension.Page, k);
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(bm, System.Drawing.Imaging.ImageFormat.Bmp);
// scale the image to fit in the page
img.ScalePercent(72f / img.DpiX * 100);
img.SetAbsolutePosition(0, 0);
cb.AddImage(img);
document.NewPage();
}
document.Close();

I just copied the code from this answer, and modified it to your example. So credits go to the person who answered the question in the link.
using (var stream = new FileStream(#"C:\File\0.pdf", FileMode.Create, FileAccess.Write, FileShare.None))
{
Document document = new Document(PageSize.A4, 0, 0, 0, 0);
var writer = PdfWriter.GetInstance(document, stream);
var bitmap = new System.Drawing.Bitmap(#"C:\File\0.tiff");
var pages = bitmap.GetFrameCount(System.Drawing.Imaging.FrameDimension.Page);
document.Open();
iTextSharp.text.pdf.PdfContentByte cb = writer.DirectContent;
for (int i = 0; i < pages; ++i)
{
bitmap.SelectActiveFrame(System.Drawing.Imaging.FrameDimension.Page, i);
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(bitmap, System.Drawing.Imaging.ImageFormat.Bmp);
// scale the image to fit in the page
//img.ScalePercent(72f / img.DpiX * 100);
//img.SetAbsolutePosition(0, 0);
cb.AddImage(img);
document.NewPage();
}
}
document.Close();
}

What I achieved after a long period and based on some searches.
I make a request and the response is PDF or TIFF. The first part is just what I get and how I call the private methods, and that is what you need.
var httpResponse = (HttpWebResponse)(await httpWebRequest.GetResponseAsync());
Stream stream = httpResponse.GetResponseStream();
string contentType = httpResponse.ContentType;
MemoryStream ms = new MemoryStream();
stream.CopyTo(ms);
FileStream file2;
try
{
switch (contentType)
{
case "application/pdf":
{
string outputfile2 = Path.Combine(zipDirectory, "ixosid_" + icxos + ".pdf");
file2 = new FileStream(outputfile2, FileMode.Create, FileAccess.Write);
ms.WriteTo(file2);
file2.Close();
break;
}
default:
{
string outputfile1 = Path.Combine(zipDirectory, "ixosid_" + icxos + ".tiff");
file2 = new FileStream(outputfile1, FileMode.Create, FileAccess.Write);
ms.WriteTo(file2);
file2.Close();
string[] outfilesTiffPages = ConvertTiffToJpeg(outputfile1);
File.Delete(outputfile1);
string outputfile2 = Path.Combine(zipDirectory, "ixosid_" + icxos + ".pdf");
iTextSharp.text.Document doc= AddPicturesToPDF(outfilesTiffPages, outputfile2);
break;
}
}
}
and I have two private methods
private string[] ConvertTiffToJpeg(string fileName)
{
using (System.Drawing.Image imageFile = System.Drawing.Image.FromFile(fileName))
{
FrameDimension frameDimensions = new FrameDimension(
imageFile.FrameDimensionsList[0]);
// Gets the number of pages from the tiff image (if multipage)
int frameNum = imageFile.GetFrameCount(frameDimensions);
string[] jpegPaths = new string[frameNum];
for (int frame = 0; frame < frameNum; frame++)
{
// Selects one frame at a time and save as jpeg.
imageFile.SelectActiveFrame(frameDimensions, frame);
using (Bitmap bmp = new Bitmap(imageFile))
{
jpegPaths[frame] = String.Format(#"{0}\{1}{2}.jpg",
Path.GetDirectoryName(fileName),
Path.GetFileNameWithoutExtension(fileName),
frame);
bmp.Save(jpegPaths[frame], ImageFormat.Jpeg);
}
}
return jpegPaths;
}
}
private iTextSharp.text.Document AddPicturesToPDF(string[] filesPaths, string outputPdf)
{
FileStream fs = new FileStream(outputPdf, FileMode.Create);
Document pdfdoc = new Document();
PdfWriter.GetInstance(pdfdoc, fs);
pdfdoc.Open();
int size = filesPaths.Length;
int count = 0;
foreach(string imagePath in filesPaths)
{
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(imagePath);
img.Alignment = Element.ALIGN_CENTER;
img.SetAbsolutePosition(0, 0);
img.ScaleToFit((PageSize.A4.Width - pdfdoc.RightMargin - pdfdoc.LeftMargin), (PageSize.A4.Height- pdfdoc.BottomMargin - pdfdoc.TopMargin));
pdfdoc.Add(img);
pdfdoc.NewPage();
}
pdfdoc.Close();
return pdfdoc;
}
Maybe I can work with MemoryStream but for now, is working and is what I need.
I searched a lot, and some links that deserve to be mentioned
https://coderedirect.com/questions/178295/convert-tiff-to-jpg-format

Replace text in pdf file using itextshape in c#

am using the below code for replace image and text on pdf, its is good working on the image replace part, but i dont know the how can write the text replace in same code.
private void AddAnImage()
{
string qrfile = qrcode();
using (var inputPdfStream = new FileStream(Server.MapPath("PDF/input.pdf"), FileMode.Open))
using (var inputImageStream = new FileStream(qrfile, FileMode.Open))
using (var outputPdfStream = new FileStream(Server.MapPath("PDF/output.pdf"), FileMode.Create))
{
PdfReader reader = new PdfReader(inputPdfStream);
PdfStamper stamper = new PdfStamper(reader, outputPdfStream);
PdfContentByte pdfContentByte = stamper.GetOverContent(1);
var image = iTextSharp.text.Image.GetInstance(inputImageStream);
image.SetAbsolutePosition(50, 75);
pdfContentByte.AddImage(image);
PdfContentByte canvas = stamper.GetOverContent(2);
ColumnText.ShowTextAligned(canvas, Element.ALIGN_LEFT, new Phrase(DateTime.Now.ToShortDateString()), 0, 0, 0);
stamper.Close();
}
}

Creating a PDF file?

I looked into iTextSharp and SharpPDF and Report.Net as well as PDFSharp.
None of these open source projects have good documentation OR do not work with VS 2012.
Does anyone have a recommended solution or can show me the documentation?
My employer blocks many sites and although Google is not blocked, some of the results are.
I plan on using C# with WinForms and obtaining my data from an Access DB

Hey #Cocoa Dev get this a complete example with diferent functions.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using iTextSharp.text.pdf;
using System.Data;
using System.Text;
using System.util.collections;
using iTextSharp.text;
using System.Net.Mail;
public partial class PDFScenarios : System.Web.UI.Page
{
public string P_InputStream = "~/MyPDFTemplates/ex1.pdf";
public string P_InputStream2 = "~/MyPDFTemplates/ContactInfo.pdf";
public string P_InputStream3 = "~/MyPDFTemplates/MulPages.pdf";
public string P_InputStream4 = "~/MyPDFTemplates/CompanyLetterHead.pdf";
public string P_OutputStream = "~/MyPDFOutputs/ex1_1.pdf";
//Read all 'Form values/keys' from an existing multi-page PDF document
public void ReadPDFformDataPageWise()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
AcroFields form = reader.AcroFields;
try
{
for (int page = 1; page <= reader.NumberOfPages; page++)
{
foreach (KeyValuePair<string, AcroFields.Item> kvp in form.Fields)
{
switch (form.GetFieldType(kvp.Key))
{
case AcroFields.FIELD_TYPE_CHECKBOX:
case AcroFields.FIELD_TYPE_COMBO:
case AcroFields.FIELD_TYPE_LIST:
case AcroFields.FIELD_TYPE_RADIOBUTTON:
case AcroFields.FIELD_TYPE_NONE:
case AcroFields.FIELD_TYPE_PUSHBUTTON:
case AcroFields.FIELD_TYPE_SIGNATURE:
case AcroFields.FIELD_TYPE_TEXT:
int fileType = form.GetFieldType(kvp.Key);
string fieldValue = form.GetField(kvp.Key);
string translatedFileName = form.GetTranslatedFieldName(kvp.Key);
break;
}
}
}
}
catch
{
}
finally
{
reader.Close();
}
}
//Read and alter form values for only second and
//third page of an existing multi page PDF doc.
//Save the changes in a brand new pdf file.
public void ReadAlterPDFformDataInSelectedPages()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
reader.SelectPages("1-2"); //Work with only page# 1 & 2
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(Server.MapPath(P_OutputStream), FileMode.Create)))
{
AcroFields form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
//Replace Address Form field with my custom data
if (fieldKey.Contains("Address"))
{
form.SetField(fieldKey, "MyCustomAddress");
}
}
//The below will make sure the fields are not editable in
//the output PDF.
stamper.FormFlattening = true;
}
}
//Extract text from an existing PDF's second page.
private string ExtractText()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
string txt = PdfTextExtractor.GetTextFromPage(reader, 2, new LocationTextExtractionStrategy());
return txt;
}
//Create a brand new PDF from scratch and without a template
private void CreatePDFNoTemplate()
{
Document pdfDoc = new Document();
PdfWriter writer = PdfWriter.GetInstance(pdfDoc, new FileStream(Server.MapPath(P_OutputStream), FileMode.OpenOrCreate));
pdfDoc.Open();
pdfDoc.Add(new Paragraph("Some data"));
PdfContentByte cb = writer.DirectContent;
cb.MoveTo(pdfDoc.PageSize.Width / 2, pdfDoc.PageSize.Height / 2);
cb.LineTo(pdfDoc.PageSize.Width / 2, pdfDoc.PageSize.Height);
cb.Stroke();
pdfDoc.Close();
}
private void fillPDFForm()
{
string formFile = Server.MapPath(P_InputStream);
string newFile = Server.MapPath(P_OutputStream);
PdfReader reader = new PdfReader(formFile);
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(newFile, FileMode.Create)))
{
AcroFields fields = stamper.AcroFields;
// set form fields
fields.SetField("name", "John Doe");
fields.SetField("address", "xxxxx, yyyy");
fields.SetField("postal_code", "12345");
fields.SetField("email", "johndoe#xxx.com");
// flatten form fields and close document
stamper.FormFlattening = true;
stamper.Close();
}
}
//Helper functions
private void SendEmail(MemoryStream ms)
{
MailAddress _From = new MailAddress("XXX#domain.com");
MailAddress _To = new MailAddress("YYY#a.com");
MailMessage email = new MailMessage(_From, _To);
Attachment attach = new Attachment(ms, new System.Net.Mime.ContentType("application/pdf"));
email.Attachments.Add(attach);
SmtpClient mailSender = new SmtpClient("Gmail-Server");
mailSender.Send(email);
}
private void DownloadAsPDF(MemoryStream ms)
{
Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.AppendHeader("Content-Disposition", "attachment;filename=abc.pdf");
Response.OutputStream.Write(ms.GetBuffer(), 0, ms.GetBuffer().Length);
Response.OutputStream.Flush();
Response.OutputStream.Close();
Response.End();
ms.Close();
}
//Working with Memory Stream and PDF
public void CreatePDFFromMemoryStream()
{
//(1)using PDFWriter
Document doc = new Document();
MemoryStream memoryStream = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(doc, memoryStream);
doc.Open();
doc.Add(new Paragraph("Some Text"));
writer.CloseStream = false;
doc.Close();
//Get the pointer to the beginning of the stream.
memoryStream.Position = 0;
//You may use this PDF in memorystream to send as an attachment in an email
//OR download as a PDF
SendEmail(memoryStream);
DownloadAsPDF(memoryStream);
//(2)Another way using PdfStamper
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream2));
using (MemoryStream ms = new MemoryStream())
{
PdfStamper stamper = new PdfStamper(reader, ms);
AcroFields fields = stamper.AcroFields;
fields.SetField("SomeField", "MyValueFromDB");
stamper.FormFlattening = true;
stamper.Close();
SendEmail(ms);
}
}
//Burst-- Make each page of an existing multi-page PDF document
//as another brand new PDF document
private void PDFBurst()
{
string pdfTemplatePath = Server.MapPath(P_InputStream3);
PdfReader reader = new PdfReader(pdfTemplatePath);
//PdfCopy copy;
PdfSmartCopy copy;
for (int i = 1; i < reader.NumberOfPages; i++)
{
Document d1 = new Document();
copy = new PdfSmartCopy(d1, new FileStream(Server.MapPath(P_OutputStream).Replace(".pdf", i.ToString() + ".pdf"), FileMode.Create));
d1.Open();
copy.AddPage(copy.GetImportedPage(reader, i));
d1.Close();
}
}
//Copy a set of form fields from an existing PDF template/doc
//and keep appending to a brand new PDF file.
//The copied set of fields will have different values.
private void AppendSetOfFormFields()
{
PdfCopyFields _copy = new PdfCopyFields(new FileStream(Server.MapPath(P_OutputStream), FileMode.Create));
_copy.AddDocument(new PdfReader(a1("1")));
_copy.AddDocument(new PdfReader(a1("2")));
_copy.AddDocument(new PdfReader(new FileStream(Server.MapPath("~/MyPDFTemplates/Myaspx.pdf"), FileMode.Open)));
_copy.Close();
}
//ConcatenateForms
private byte[] a1(string _ToAppend)
{
using (var existingFileStream = new FileStream(Server.MapPath(P_InputStream), FileMode.Open))
using (MemoryStream stream = new MemoryStream())
{
// Open existing PDF
var pdfReader = new PdfReader(existingFileStream);
var stamper = new PdfStamper(pdfReader, stream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
form.RenameField(fieldKey, fieldKey + _ToAppend);
}
// "Flatten" the form so it wont be editable/usable anymore
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
return stream.ToArray();
}
}
//Working with Image
private void AddAnImage()
{
using (var inputPdfStream = new FileStream(#"C:\MyInput.pdf", FileMode.Open))
using (var inputImageStream = new FileStream(#"C:\img1.jpg", FileMode.Open))
using (var outputPdfStream = new FileStream(#"C:\MyOutput.pdf", FileMode.Create))
{
PdfReader reader = new PdfReader(inputPdfStream);
PdfStamper stamper = new PdfStamper(reader, outputPdfStream);
PdfContentByte pdfContentByte = stamper.GetOverContent(1);
var image = iTextSharp.text.Image.GetInstance(inputImageStream);
image.SetAbsolutePosition(1, 1);
pdfContentByte.AddImage(image);
stamper.Close();
}
}
//Add Company Letter-Head/Stationary to an existing pdf
private void AddCompanyStationary()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream2));
PdfReader s_reader = new PdfReader(Server.MapPath(P_InputStream4));
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(Server.MapPath(P_OutputStream), FileMode.Create)))
{
PdfImportedPage page = stamper.GetImportedPage(s_reader, 1);
int n = reader.NumberOfPages;
PdfContentByte background;
for (int i = 1; i <= n; i++)
{
background = stamper.GetUnderContent(i);
background.AddTemplate(page, 0, 0);
}
stamper.Close();
}
}

Try this example:
using iTextSharp.text;
// Set up the fonts to be used on the pages
private Font _largeFont = new Font(Font.FontFamily.HELVETICA, 18, Font.BOLD, BaseColor.BLACK);
private Font _standardFont = new Font(Font.FontFamily.HELVETICA, 14, Font.NORMAL, BaseColor.BLACK);
private Font _smallFont = new Font(Font.FontFamily.HELVETICA, 10, Font.NORMAL, BaseColor.BLACK);
public void Build()
{
iTextSharp.text.Document doc = null;
try
{
// Initialize the PDF document
doc = new Document();
iTextSharp.text.pdf.PdfWriter writer = iTextSharp.text.pdf.PdfWriter.GetInstance(doc,
new System.IO.FileStream(System.IO.Directory.GetCurrentDirectory() + "\\ScienceReport.pdf",
System.IO.FileMode.Create));
// Set margins and page size for the document
doc.SetMargins(50, 50, 50, 50);
// There are a huge number of possible page sizes, including such sizes as
// EXECUTIVE, LEGAL, LETTER_LANDSCAPE, and NOTE
doc.SetPageSize(new iTextSharp.text.Rectangle(iTextSharp.text.PageSize.LETTER.Width,
iTextSharp.text.PageSize.LETTER.Height));
// Add metadata to the document. This information is visible when viewing the
// document properities within Adobe Reader.
doc.AddTitle("My Science Report");
doc.AddCreator("M. Lichtenberg");
doc.AddKeywords("paper airplanes");
// Add Xmp metadata to the document.
this.CreateXmpMetadata(writer);
// Open the document for writing content
doc.Open();
// Add pages to the document
this.AddPageWithBasicFormatting(doc);
this.AddPageWithInternalLinks(doc);
this.AddPageWithBulletList(doc);
this.AddPageWithExternalLinks(doc);
this.AddPageWithImage(doc, System.IO.Directory.GetCurrentDirectory() + "\\FinalGraph.jpg");
// Add page labels to the document
iTextSharp.text.pdf.PdfPageLabels pdfPageLabels = new iTextSharp.text.pdf.PdfPageLabels();
pdfPageLabels.AddPageLabel(1, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Basic Formatting");
pdfPageLabels.AddPageLabel(2, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Internal Links");
pdfPageLabels.AddPageLabel(3, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Bullet List");
pdfPageLabels.AddPageLabel(4, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "External Links");
pdfPageLabels.AddPageLabel(5, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Image");
writer.PageLabels = pdfPageLabels;
}
catch (iTextSharp.text.DocumentException dex)
{
// Handle iTextSharp errors
}
finally
{
// Clean up
doc.Close();
doc = null;
}
}

You can always just create an html page and then convert that to pdf using wkhtmltopdf. This has the benefit of you not having to construct the pdf with a library such as iText. You just make a text file (html) and then pass it to the wkhtmltopdf executable.
See Calling wkhtmltopdf to generate PDF from HTML for more info.

Pdf Merge/Overlap with iText

I have used iText for some various utility, such us merge and editing of pdf files with success. Now I need to overlap 2 pdf pages:
For Instance:
INPUT:
PDF#1 (1 Page)
PDF#2 (1 Page)
OUTPUT:
PDF#3 (1 Page: This is the result of the 2 Input Pages Overlapped)
I don't know if it's possible to do this with iText latest version. I am also considering to use one of the 2 input PDF Files as background for the PDF Output Files.
Thank you in advance.

It's actually pretty easy to do. The PdfWriter object has an instance method called GetImportedPage() which returns a PdfImportedPage object. This object can be passed to a PdfContentByte's AddTemplate() method.
GetImportedPage() takes a PdfReader object and the page number that you want to get. You can get a PdfContentByte from an instance of a PdfWriter's DirectContent property.
The code below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.2.0 that shows this all off. It first creates two files on the desktop, the first with just a solid red background color and the second with just a paragraph. It then combines those two files overlapping into a third document. See the code for additional comments.
using System;
using System.IO;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.pdf;
namespace WindowsFormsApplication1 {
public partial class Form1 : Form {
public Form1() {
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e) {
//Folder that we'll work from
string workingFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
string pdf1 = Path.Combine(workingFolder, "pdf1.pdf");//PDF with solid red background color
string pdf2 = Path.Combine(workingFolder, "pdf2.pdf");//PDF with text
string pdf3 = Path.Combine(workingFolder, "pdf3.pdf");//Merged PDF
//Create a basic PDF filled with red, nothing special
using (FileStream fs = new FileStream(pdf1, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.LETTER)) {
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
PdfContentByte cb = writer.DirectContent;
cb.SetColorFill(BaseColor.RED);
cb.Rectangle(0, 0, doc.PageSize.Width, doc.PageSize.Height);
cb.Fill();
doc.Close();
}
}
}
//Create a basic PDF with a single line of text, nothing special
using (FileStream fs = new FileStream(pdf2, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.LETTER)) {
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
doc.Add(new Paragraph("This is a test"));
doc.Close();
}
}
}
//Create a basic PDF
using (FileStream fs = new FileStream(pdf3, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.LETTER)) {
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//Get page 1 of the first file
PdfImportedPage imp1 = writer.GetImportedPage(new PdfReader(pdf1), 1);
//Get page 2 of the second file
PdfImportedPage imp2 = writer.GetImportedPage(new PdfReader(pdf2), 1);
//Add the first file to coordinates 0,0
writer.DirectContent.AddTemplate(imp1, 0, 0);
//Since we don't call NewPage the next call will operate on the same page
writer.DirectContent.AddTemplate(imp2, 0, 0);
doc.Close();
}
}
}
this.Close();
}
}
}

ITextSharp insert text to an existing pdf

The title sums it all.
I want to add a text to an existing PDF file using iTextSharp, however i can't find how to do it anywhere in the web...
PS: I cannot use PDF forms.

I found a way to do it (dont know if it is the best but it works)
string oldFile = "oldFile.pdf";
string newFile = "newFile.pdf";
// open the reader
PdfReader reader = new PdfReader(oldFile);
Rectangle size = reader.GetPageSizeWithRotation(1);
Document document = new Document(size);
// open the writer
FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write);
PdfWriter writer = PdfWriter.GetInstance(document, fs);
document.Open();
// the pdf content
PdfContentByte cb = writer.DirectContent;
// select the font properties
BaseFont bf = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252,BaseFont.NOT_EMBEDDED);
cb.SetColorFill(BaseColor.DARK_GRAY);
cb.SetFontAndSize(bf, 8);
// write the text in the pdf content
cb.BeginText();
string text = "Some random blablablabla...";
// put the alignment and coordinates here
cb.ShowTextAligned(1, text, 520, 640, 0);
cb.EndText();
cb.BeginText();
text = "Other random blabla...";
// put the alignment and coordinates here
cb.ShowTextAligned(2, text, 100, 200, 0);
cb.EndText();
// create the new page and add it to the pdf
PdfImportedPage page = writer.GetImportedPage(reader, 1);
cb.AddTemplate(page, 0, 0);
// close the streams and voilá the file should be changed :)
document.Close();
fs.Close();
writer.Close();
reader.Close();
I hope this can be usefull for someone =) (and post here any errors)

In addition to the excellent answers above, the following shows how to add text to each page of a multi-page document:
using (var reader = new PdfReader(#"C:\Input.pdf"))
{
using (var fileStream = new FileStream(#"C:\Output.pdf", FileMode.Create, FileAccess.Write))
{
var document = new Document(reader.GetPageSizeWithRotation(1));
var writer = PdfWriter.GetInstance(document, fileStream);
document.Open();
for (var i = 1; i <= reader.NumberOfPages; i++)
{
document.NewPage();
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
var importedPage = writer.GetImportedPage(reader, i);
var contentByte = writer.DirectContent;
contentByte.BeginText();
contentByte.SetFontAndSize(baseFont, 12);
var multiLineString = "Hello,\r\nWorld!".Split('\n');
foreach (var line in multiLineString)
{
contentByte.ShowTextAligned(PdfContentByte.ALIGN_LEFT, line, 200, 200, 0);
}
contentByte.EndText();
contentByte.AddTemplate(importedPage, 0, 0);
}
document.Close();
writer.Close();
}
}

This worked for me and includes using OutputStream:
PdfReader reader = new PdfReader(new RandomAccessFileOrArray(Request.MapPath("Template.pdf")), null);
Rectangle size = reader.GetPageSizeWithRotation(1);
using (Stream outStream = Response.OutputStream)
{
Document document = new Document(size);
PdfWriter writer = PdfWriter.GetInstance(document, outStream);
document.Open();
try
{
PdfContentByte cb = writer.DirectContent;
cb.BeginText();
try
{
cb.SetFontAndSize(BaseFont.CreateFont(), 12);
cb.SetTextMatrix(110, 110);
cb.ShowText("aaa");
}
finally
{
cb.EndText();
}
PdfImportedPage page = writer.GetImportedPage(reader, 1);
cb.AddTemplate(page, 0, 0);
}
finally
{
document.Close();
writer.Close();
reader.Close();
}
}

Here is a method that uses stamper and absolute coordinates showed in the different PDF clients (Adobe, FoxIt and etc. )
public static void AddTextToPdf(string inputPdfPath, string outputPdfPath, string textToAdd, System.Drawing.Point point)
{
//variables
string pathin = inputPdfPath;
string pathout = outputPdfPath;
//create PdfReader object to read from the existing document
using (PdfReader reader = new PdfReader(pathin))
//create PdfStamper object to write to get the pages from reader
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(pathout, FileMode.Create)))
{
//select two pages from the original document
reader.SelectPages("1-2");
//gettins the page size in order to substract from the iTextSharp coordinates
var pageSize = reader.GetPageSize(1);
// PdfContentByte from stamper to add content to the pages over the original content
PdfContentByte pbover = stamper.GetOverContent(1);
//add content to the page using ColumnText
Font font = new Font();
font.Size = 45;
//setting up the X and Y coordinates of the document
int x = point.X;
int y = point.Y;
y = (int) (pageSize.Height - y);
ColumnText.ShowTextAligned(pbover, Element.ALIGN_CENTER, new Phrase(textToAdd, font), x, y, 0);
}
}

Here is a method To print over images:
taken from here.
Use a different layer for your text you're putting over the images, and also make sure to use the GetOverContent() method.
string oldFile = "FileWithImages.pdf";
string watermarkedFile = "Layers.pdf";
// Creating watermark on a separate layer
// Creating iTextSharp.text.pdf.PdfReader object to read the Existing PDF Document
PdfReader reader1 = new PdfReader(oldFile);
using (FileStream fs = new FileStream(watermarkedFile, FileMode.Create, FileAccess.Write, FileShare.None))
// Creating iTextSharp.text.pdf.PdfStamper object to write Data from iTextSharp.text.pdf.PdfReader object to FileStream object
using (PdfStamper stamper = new PdfStamper(reader1, fs))
{
// Getting total number of pages of the Existing Document
int pageCount = reader1.NumberOfPages;
// Create New Layer for Watermark
PdfLayer layer = new PdfLayer("Layer", stamper.Writer);
// Loop through each Page
for (int i = 1; i <= pageCount; i++)
{
// Getting the Page Size
Rectangle rect = reader1.GetPageSize(i);
// Get the ContentByte object
PdfContentByte cb = stamper.GetOverContent(i);
// Tell the cb that the next commands should be "bound" to this new layer
cb.BeginLayer(layer);
BaseFont bf = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
cb.SetColorFill(BaseColor.RED);
cb.SetFontAndSize(bf, 100);
cb.BeginText();
cb.ShowTextAligned(PdfContentByte.ALIGN_CENTER, "Some random blablablabla...", rect.Width / 2, rect.Height / 2, - 90);
cb.EndText();
// Close the layer
cb.EndLayer();
}
}

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

ItextSharp Find text in pdf and highlight it - c#

In iText7, we have implemented this functionality. It can be found in the class RegexBasedLocationExtractionStrategy. I suggest you have a look to see how it was done, since this functionality was not backported to iText5.

Related

Converting multiple TIFF Images to PDF using iTextSharp

Replace text in pdf file using itextshape in c#

Creating a PDF file?

Pdf Merge/Overlap with iText

ITextSharp insert text to an existing pdf

Categories

Resources