Fill PDF Form with Itextsharp

Fill PDF Form with Itextsharp - c#

I am trying to fill up a form with ITextsharp, and trying out the following code to get all the fields in the pdf:
string pdfTemplate = #"c:\Temp\questionnaire.pdf";
PdfReader pdfReader = new PdfReader(pdfTemplate);
StringBuilder sb = new StringBuilder();
foreach (var de in pdfReader.AcroFields.Fields)
{
sb.Append(de.Key.ToString() + Environment.NewLine);
}
But the foreach loop is always null count. Do I need to do something to file itself as I have tried the example from here and it works fine... this is an example of pdf I am trying to fill
any ideas?
Edit ::

As it turned out, the PDF "form" to fill in actually wasn't a form (in PDF terms) at all. Thus, your have two choices:
You add the text to the page contents directly using hardcoded or configured "field" positions and dimensions as described by #tschmit007 in comments to his answer.
You add actual PDF form fields to your PDF to generate a true PDF form which you take as template to fill in later.
You can add actual form fields either using some graphical tool allowing that, e.g. Adobe Acrobat, or you can use iText(Sharp). Have a look at chapter 8 of iText in Action — 2nd Edition and the samples available here for Java and here for .Net.
Those samples mostly add form fields to newly generated PDF documents. You can virtually use the same code, though, for adding form fields to a PdfStamper which exposes its inner PdfWriter using stamper.getWriter() in Java and the stamper.Writer in C#. Instead of writer.addAnnotation(field) you have to use stamper.addAnnotation(field, page), though.

try:
using (FileStream outFile = new FileStream("result.pdf", FileMode.Create)) {
PdfReader pdfReader = new PdfReader("file.pdf");
PdfStamper pdfStamper = new PdfStamper(pdfReader, outFile);
AcroFields fields = pdfStamper.AcroFields;
//rest of the code here
//fields.SetField("n°1", "value");
//...
pdfStamper.Close();
pdfReader.Close();
}

Related

Unable to read text in a specific location in a pdf file using iTextSharp

I'm given to read a pdf texts and do some stuffs are extracting the texts. I 'm using iTextSharp to read the PDF. The problem here is that the PdfTextExtractor.GetTextFromPage doesnt give me all the contents of the page. For ex
In the above PDF I m unable to read texts that are highlighted in blue. Rest of the characters I m able t read. Below is the line that does the above
`string filePath = "myFile path";
PdfReader pdfReader = new PdfReader(filePath);
for (int page = 1; page<=1; page++)
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentPageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
}`
Any suggestions here?
I have went through lots of queries and solution in SO but not specific to this query.

The reason for text extraction not extracting those texts is pretty simple: Those texts are not part of the static page content but form fields! But "Text extraction" in iText (and other PDF libraries I know, too) is considered to mean "extraction of the text of the static page content". Thus, those texts you miss simply are not subject to text extraction.
If you want to make form field values subject to your text extraction code, too, you first have to flatten the form field visualizations. "Flattening" here means making them part of the static page content and dropping all their form field dynamics.
You can do that by adding after reading the PDF in this line
PdfReader pdfReader = new PdfReader(filePath);
code to flatten this PDF and loading the flattened PDF into the pdfReader, e.g. like this:
MemoryStream memoryStream = new MemoryStream();
PdfStamper pdfStamper = new PdfStamper(pdfReader, memoryStream);
pdfStamper.FormFlattening = true;
pdfStamper.Writer.CloseStream = false;
pdfStamper.Close();
memoryStream.Position = 0;
pdfReader = new PdfReader(memoryStream);
Extracting the text from this re-initialized pdfReader will give you the text from the form fields, too.
Unfortunately, the flattened form text is added at the end of the content stream. As your chosen text extraction strategy SimpleTextExtractionStrategy simply returns the text in the order it is drawn, the former form fields contents all are extracted at the end.
You can change this by using a different text extraction strategy, i.e. by replacing this line:
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
Using the LocationTextExtractionStrategy (which is part of the iText distribution) already returns a better result; unfortunately the form field values are not exactly on the same base line as the static contents we perceive to be on the same line, so there are some unexpected line breaks.
ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
Using the HorizontalTextExtractionStrategy (from this answer which contains both a Java and a C# version thereof) the result is even better. Beware, though, this strategy is not universally better, read the warnings in the answer text.
ITextExtractionStrategy strategy = new HorizontalTextExtractionStrategy();

After using PdfStamper my custom fonts are replaced to Arial

I have an app in C# with itextsharp that gets some PDF templates, fill some fields and export a flattened version.
Those PDF templates have some custom fonts, one font that I'm using it's DINPro-Regular.otf but after I replace the text and save it my font is being replaced to an Arial font.
I've read other threads but I couldn't find something that solve my case..
My code is:
// Open existing PDF
var pdfReader = new PdfReader(existingFileStream);
// PdfStamper, which will create
var stamper = new PdfStamper(pdfReader, newFileStream);
var form = stamper.AcroFields;
// Set some info
form.SetField("part_number", part.PartNumber.ToUpper());
...
// "Flatten" the form so it wont be editable/usable anymore
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
I've tried to add those lines as well but didn't make any difference:
form.GenerateAppearances = false;
stamper.FreeTextFlattening = true;
Original PDF
Generated PDF
So if anyone had a similar problem or have a clue how to solve it will be very appreciated.

Disable extended features with iTextSharp

I have a PDF template with a form with the Extended features enabled. After filling in the fields of this form using iTextSharp, a user with acrobat reader gets the error message:
This document enabled extended features in Adobe Reader. The document has
been changed since it was created and use of extended features is no longer
available. Please contact the author for the original version of this
document.
I googled a bit but all the posts talk about "enabling" extended features, however, I want the form fields to remain disabled and extended features turned off
Here a sample code which I am using:
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
// Open existing PDF
var pdfReader = new PdfReader(existingFileStream);
// PdfStamper, which will create
var stamper = new PdfStamper(pdfReader, newFileStream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
if (fieldKey.Equals("Retailer Name"))
form.SetField(fieldKey, retailerName);
}
// “Flatten” the form so it wont be editable/usable anymore
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
}

The links here are dead as the iTextPdf web site has been completely revamped. But the answer can be understood without those links, too.
The iText Keyword: Reader enabled PDFs points to the following information:
Submitted by Bruno Lowagie on Fri, 12/31/2010 - 16:37
After filling out my form, my PDF shows the following message: This document enabled extended features in Adobe Reader. The document has been changed since it was created and use of extended features is no longer available. Please contact the author for the original version of this document. How do I avoid this message?
The creator of the form made the document Reader enabled. Reader enabling can only be done using Adobe software. You can avoid this message in two ways:
Remove the usage rights. This will result in a form that is no longer Reader enabled. For instance: if the creator of the document allowed that the filled out form could be saved locally, this will no longer be possible after removing the usage rights.
Fill out the form in append mode. This will result in a bigger file size, but Reader enabling will be preserved.
It also points to the sample ReaderEnabledForm.java (the C#/iTextSharp equivalent of which is ReaderEnabledForm.cs) which shows how to do either.
In your case this amounts to calling
pdfReader.RemoveUsageRights();
right after creating the PdfReader and before creating the PdfStamper.
/**
* Removes any usage rights that this PDF may have. Only Adobe can grant usage rights
* and any PDF modification with iText will invalidate them. Invalidated usage rights may
* confuse Acrobat and it's advisabe to remove them altogether.
*/
public void RemoveUsageRights()

Fill out the form in append mode by using the PdfStamper constractor overload
// PdfStamper, which will create
var stamper = new PdfStamper(pdfReader, fileStream, '\0', true);

Appending PDFs to generated PDF using iTextSharp

I am generating a pdf using iTextSharp. If certain properties are true then I also want to insert an existing pdf with static content.
private byte[] GeneratePdf(DraftOrder draftOrder)
// create a pdf document
var document = new Document();
// set the page size, set the orientation
document.SetPageSize(PageSize.A4);
// create a writer instance
var pdfWriter = PdfWriter.GetInstance(document, new FileStream(file, FileMode.Create));
document.Open();
if(draftOrder.hasProperty){
//add these things to the pdf
var textToBeAdded = "<table><tr>....</table>";
}
FormatHtml(document, textToBeAdded , css);
if(someOtherProperty){
//add static pdf from file
document.NewPage();
var reader = new PdfReader("myPath/existing.pdf");
PdfImportedPage page;
for(var i = 0; i < reader.NumberOfPages; i++){
//It's this bit I don't really understand
//**how can I add the page read to the document being created?**
}
I can load the pdf from the source but when I iterate over the pages I can't seem to be able to add them to the document I am creating.
Cheers

Please read http://manning.com/lowagie2/samplechapter6.pdf
If you don't mind losing all interactivity, you can get the template from the writer object with the GetImportedPage() method and add it to the document with AddTemplate ().
This question has been answered many times on StackOverflow and you'll notice that I always warn about some dangers: you need to realize that the dimensions of the imported page can be different from the page size you initially defined. Because of this invisible parts of the imported page can become visible; visible parts can become invisible.
I'd prefer adding the extra page in a second ho using PdfCopy, but maybe that's just me.

Set different parts of a form field to have different fonts using iTextSharp

I'm not sure that this is possible but I figured it would be worth asking. I have figured out how to set the font of a formfield using the pdfstamper and acrofields methods but I would really like to be able to set the font of different parts of the text in the same field. Here's how I'm setting the font of the form fields currently:
// Use iTextSharp PDF Reader, to get the fields and send to the
//Stamper to set the fields in the document
PdfReader pdfReader = new PdfReader(fileName);
// Initialize Stamper (ms is a MemoryStream object)
PdfStamper pdfStamper = new PdfStamper(pdfReader, ms);
// Get Reference to PDF Document Fields
AcroFields pdfFormFields = pdfStamper.AcroFields;
//create a bold font
iTextSharp.text.Font bold = FontFactory.GetFont(FontFactory.COURIER, 8f, iTextSharp.text.Font.BOLD);
//set the field to bold
pdfFormFields.SetFieldProperty(nameOfField, "textfont", bold.BaseFont, null);
//set the text of the form field
pdfFormFields.SetField(nameOfField, "This: Will Be Displayed In The Field");
// Set the flattening flag to false, so the document can continue to be edited
pdfStamper.FormFlattening = true;
// close the pdf stamper
pdfStamper.Close();
What I'd like to be able to do where I set the text above is set the "This: " to bold and leave the "Will Be Displayed In The Field" non-bolded. I'm not sure this is actually possible but I figured it was worth asking because it would really be helpful in what I'm currently working on.
Thanks in advance!

Yes, kinda. PDF fields can have a rich text value (since acrobat 6/pdf1.5) along with a regular value.
The regular value uses the font defined in the default appearances... a single font.
The rich value format (which iText doesn't support directly, at least not yet), is described in chapter 12.7.3.4 of the PDF Reference. <b>, <i>, <p>, and quite a few css2 text attributes. It requires a with various attributes.
To enable rich values, you have to set bit 26 of the field flags (PdfName.FF) for a text field. PdfFormField doesn't have a "setRichValue", but they're dictionaries, so you can just:
myPdfFormField.put(PdfName.RV, new PdfString( richTextValue ) );
If you're trying to add rich text to an existing field that doesn't already support it:
AcroFields fields = stamper.getAcroFields();
AcroFields.Item fldItem = fields.getFieldItem(fldName);
PdfDictionary mergedDict = item.getMerged(0);
int flagVal = mergedDict.getAsNumber(PdfName.FF).intValue();
flagVal |= (1 << 26);
int writeFlags = AcroFields.Item.WRITE_MERGED | AcroFields.Item.WRITE_VALUE;
fldItem.writeToAll(PdfName.FF, new PdfNumber(flagVal), writeFlags);
fldItem.writeToAll(PdfName.RV, new PdfString(richTextValue), writeFlags);
I'm actually adding rich text support to iText (not sharp) as I type this message. Hurray for contributors on SO. Paulo's been good about keeping iTextSharp in synch lately, so that shouldn't be an issue. The next trunk release should have this feature... so you'd be able to write:
myPdfFormField.setFieldFlags( PdfFormField.FF_RICHTEXT );
myPdfFormField.setRichValue( richTextValue );
or
// note that this will fail unless the rich flag is set
acroFields.setFieldRichValue( richTextValue );
NOTE: iText's appearance generation hasn't been updated, just the value side of things. That would take considerably more work. So you'll want to acroFields.setGenerateAppearances(false) or have JS that resets the field value when the form its opened to force Acrobat/Reader to build the appearance[s] for you.

It took me some time to figure out after richtextfield did not work the way it was suppose too with
acrofields and most of the cases were the pdf was not editable with different fonts at runtime.
I worked out way of setting different fonts in acrofields and passing values and editing at runtime with itextsharp and I thought it will be useful for others.
Create pdf with a text field in PDF1.pdf (I hope you know how to create field in pdf)
e.g., txtComments
Go to the property section and Set the property to richtext,Mulitiline
Format the text content in word or pdf by adding the fonts and colors.
If done in word, copy and paste the content in pdf - txtcomments field.
Note:
If you want to add dynamic content. Set the parameter “{0}” to the txtComment field in the pdf.
using string format method you can set values to it.This is shown in the code below.
e.g., "Set different parts of a form field to have different fonts using {0}"
Add the following code in a button (this is not specific) event by reference in the itextsharp.dll 5.4.2
Response.ContentType = "application/pdf";
Response.AddHeader("Content-disposition","attachment; filename=your.pdf");
PdfReader reader = new PdfReader(#"C:\test\pdf1.pdf");
PdfStamper stamp = new PdfStamper(reader, Response.OutputStream);
AcroFields field = pdfStamp.AcroFields;
string comments = field .GetFieldRichValue("txtcomments");
string Name = "Test1";
string value = string.Format(comments,Name);
field.SetField("txtComment", value );
field.GenerateAppearances = false;//Pdf knows what to do;
stamp.FormFlattening = false;//available for edit at run time
stamp.FreeTextFlattening = true;
stamp.Close();
reader.Close()
Hope this helps.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Fill PDF Form with Itextsharp - c#

Related

Unable to read text in a specific location in a pdf file using iTextSharp

After using PdfStamper my custom fonts are replaced to Arial

Disable extended features with iTextSharp

Appending PDFs to generated PDF using iTextSharp

Set different parts of a form field to have different fonts using iTextSharp

Categories

Resources