Copy Form Fields From One PDF to Another

I have a situation where I need to copy all of the form fields from one PDF to another. The purpose is to automate the overlaying of the fields when small edits are made to the underlying Word pages.
I've been using the trial version of Aspose.Pdf.Kit, and I'm able to copy everything but radio buttons to a new form. However, Aspose doesn't support copying radio buttons, which completely nullifies its usefulness, not to mention that their customer support has been subpar.
In any event, I'm looking for some sort of library or plug-in that does support copying all types of form fields.
Does anyone have any ideas?
Thanks,
~DJ

Yes, it is possible. No, setField() won't do the trick... madisonw's code will copy the field values, but not the fields themselves.
OTOH, it really isn't that hard.
Something like:
PdfReader currentReader = new PdfReader( CURRENT_PDF_PATH ); // throws
PdfReader pdfFromWord = new PdfReader( TWEAKED_PDF_FROM_WORD_PATH ); // throws
PdfStamper stamper = new PdfStamper( currentReader , outputFile ); // throws; outputFile is an OutputStream
// Swap each page's content for the matching page from the re-exported Word PDF.
for( int i = 1; i <= currentReader.getNumberOfPages(); ++i) {
    stamper.replacePage( pdfFromWord, i, i );
}
stamper.close(); // throws
I'm ignoring a bunch of exceptions, and am writing in Java, but C# should look virtually identical.
Also, this code ignores the case where someone ADDS A PAGE... which would get quite thorny. Was it added before or after the pages with fields on them? Did those pages reflow at all, requiring you to move the fields? At that point you really need a manual process with Acrobat Pro.
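Since the question asks for C#, here is a minimal sketch of the same page-replacement idea, assuming iTextSharp 5.x. The class name, method name, and the page-count guard are illustrative additions, not part of the answer above.

```csharp
using System;
using System.IO;
using iTextSharp.text.pdf;

static class PageSwapper
{
    // Replaces every page of the PDF that carries the form fields with the
    // matching page from the PDF re-exported from Word, then writes the result.
    public static void ReplacePages(string currentPdfPath, string pdfFromWordPath, string outputPath)
    {
        PdfReader currentReader = new PdfReader(currentPdfPath);
        PdfReader pdfFromWord = new PdfReader(pdfFromWordPath);
        try
        {
            // Assumption: a page-count mismatch means pages were added or removed,
            // which this simple approach cannot handle.
            if (currentReader.NumberOfPages != pdfFromWord.NumberOfPages)
                throw new InvalidOperationException("Page counts differ; the fields need manual placement.");

            using (FileStream output = new FileStream(outputPath, FileMode.Create))
            {
                PdfStamper stamper = new PdfStamper(currentReader, output);
                for (int i = 1; i <= currentReader.NumberOfPages; i++)
                {
                    stamper.ReplacePage(pdfFromWord, i, i);
                }
                stamper.Close(); // writes the PDF and closes the output stream
            }
        }
        finally
        {
            currentReader.Close();
            pdfFromWord.Close();
        }
    }
}
```

Failing fast on a page-count mismatch is a design choice for this sketch; it makes the "someone added a page" case visible instead of silently producing misaligned fields.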

I agree with Oded, iTextSharp should be able to do the job. I've used code similar to the following snippet and never had problems with any field types. I'm sure there must have been a radio button in the mix.
private void CopyFields(PdfStamper targetFile, PdfReader sourceFile)
{
    // Copy each field's value from the source into the same-named field in the target.
    foreach (DictionaryEntry de in targetFile.AcroFields.Fields)
    {
        string fieldName = de.Key.ToString();
        targetFile.AcroFields.SetField(fieldName, sourceFile.AcroFields.GetField(fieldName));
    }
}

Related

How to fill forms like this using iText for .NET

I'm trying to fill in the name and address in each of the boxes using:
cb.SetTextMatrix(x, y);// x and y positions .
cb.ShowText("data");
But it fails to do so.

The code you are using isn't entirely incorrect, but it has several flaws. For starters: you don't know the value of the x and y parameters, and that's kind of crucial if you want the text to be in the correct position.
Also: you are writing PDF syntax directly into the content stream. In your snippet, you forgot to create the text object (with cb.BeginText() and cb.EndText()). If you are new at PDF, you shouldn't try writing PDF syntax directly into the content stream unless you have a solid understanding of ISO-32000-1. Have you ever read ISO-32000-1? If not, then why are you using low-level operations? That doesn't make much sense, does it? There are helper classes such as ColumnText to add content at absolute positions.
Looking at the screen shot you shared, I see that some fields require "Comb" functionality. This functionality makes sure that each small box contains exactly one glyph (if you don't know what a glyph is, think of it as the visual representation of a character).
If you want to make it easy on yourself, you should test if the form is interactive first. Answer this question:
Does the form contain AcroFields?
If the answer is "Yes", fill out the form using the AcroFields object. You can find out which field names to use by following the instructions in the answer to this question: How do I enumerate all the fields in a PDF file in ITextSharp ?
If the answer is "No", open the file in Adobe Acrobat, and manually add fields. Define the fields as Comb fields so that each box contains exactly one glyph. To get a nice-looking result, select a monospaced font such as Courier (using a proportional font will probably give you an uglier result). This operation adds AcroForm fields.
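The "Does the form contain AcroFields?" check can be sketched as follows, assuming iTextSharp 5.x. The class name and the console messages are illustrative; the PDF path is taken from the first command-line argument.

```csharp
using System;
using iTextSharp.text.pdf;

static class FieldLister
{
    public static void Main(string[] args)
    {
        // args[0] is the path of the PDF to inspect.
        PdfReader reader = new PdfReader(args[0]);
        try
        {
            AcroFields form = reader.AcroFields;
            if (form.Fields.Count == 0)
            {
                Console.WriteLine("No AcroForm fields found; add them in Acrobat first.");
                return;
            }

            // These are the names to pass to SetField when filling out the form.
            foreach (string name in form.Fields.Keys)
            {
                Console.WriteLine(name);
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```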
Once you have an interactive form with AcroFields (assuming you have defined them correctly), filling out the form is as easy as this in iText 5:
PdfReader reader = new PdfReader(template);
PdfStamper stamper = new PdfStamper(reader,
new FileStream(newFile, FileMode.Create));
AcroFields form = stamper.AcroFields;
form.SetField(key1, value1);
form.SetField(key2, value2);
form.SetField(key3, value3);
...
stamper.Close();
See How to create and fill out a PDF form
However, since you are new at all of this, I recommend that you use iText 7 as described in the jump-start tutorial:
PdfDocument pdf = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
PdfAcroForm form = PdfAcroForm.GetAcroForm(pdf, true);
IDictionary<String, PdfFormField> fields = form.GetFormFields();
PdfFormField toSet;
fields.TryGetValue(key1, out toSet);
toSet.SetValue(value1);
fields.TryGetValue(key2, out toSet);
toSet.SetValue(value2);
...
pdf.Close();
If you want to remove the interactivity after filling out the form, you need to flatten the fields. This is also documented on the official web site.
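As a rough sketch of flattening with iText 5 (iTextSharp 5.x assumed): setting FormFlattening on the stamper before closing it replaces the interactive fields with their appearance. The class, method, and parameter names here are illustrative.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

static class FormFiller
{
    // Fills the named fields and then flattens the form so the values
    // become regular page content that can no longer be edited.
    public static void FillAndFlatten(string template, string newFile, IDictionary<string, string> values)
    {
        PdfReader reader = new PdfReader(template);
        try
        {
            using (FileStream output = new FileStream(newFile, FileMode.Create))
            {
                PdfStamper stamper = new PdfStamper(reader, output);
                foreach (KeyValuePair<string, string> pair in values)
                {
                    stamper.AcroFields.SetField(pair.Key, pair.Value);
                }
                stamper.FormFlattening = true; // must be set before Close()
                stamper.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

In iText 7 the corresponding step would be a call to FlattenFields() on the PdfAcroForm before closing the PdfDocument.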

How do I use C# to fill out a Word document?

I have a Word document, letter.docx, that is a letter I intend to mail to hundreds of people for a party. The letter is already composed and has been formatted in its own special way with varying type sizes and fonts. It's set and ready to go, with placeholders where I have to fill out variables that change like Name, Address, phone number, etc.
Now, I would like to write a C# program where a user can type in variable things like Name, Address, etc., into a form, hit a button, and produce letter.docx with the right information filled in at the right places.
I understand Word has features that allow you do this, but I really want to do this in C#.

Of course you can do it. Add a reference to Microsoft.Office.Interop.Word in your project.
First, bookmark every field you want to update in the document, using the Insert tab (e.g. the name field is bookmarked as 'name_field'). Then add the following to your C# code:
Microsoft.Office.Interop.Word.Application wordApp = null;
wordApp = new Microsoft.Office.Interop.Word.Application();
wordApp.Visible = true;
Document wordDoc = wordApp.Documents.Open(@"C:\test.docx");
Bookmark bkm = wordDoc.Bookmarks["name_field"];
Microsoft.Office.Interop.Word.Range rng = bkm.Range;
rng.Text = "Adams Laura"; //Get value from any where
Remember to save and close the document properly.
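A minimal sketch of the save-and-close step, assuming the Word interop assembly is referenced with embedded interop types. The class name, method name, and parameters are illustrative; only the 'name_field' bookmark comes from the answer above.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class BookmarkFiller
{
    // Opens the template, writes a value into the bookmark, saves the result
    // under a new name, and always shuts Word down afterwards.
    public static void Fill(string templatePath, string outputPath, string name)
    {
        Word.Application wordApp = new Word.Application();
        Word.Document wordDoc = null;
        try
        {
            wordDoc = wordApp.Documents.Open(templatePath);
            wordDoc.Bookmarks["name_field"].Range.Text = name;
            wordDoc.SaveAs2(outputPath);
        }
        finally
        {
            // The copy is already saved, so discard changes to the open document.
            if (wordDoc != null)
            {
                wordDoc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
            }
            wordApp.Quit();
        }
    }
}
```

Putting Close and Quit in a finally block keeps a stray WINWORD.EXE process from lingering if the bookmark is missing or the save fails.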

I don't know of anything built into the language, but the example here seems to do exactly what you want.
If you can provide specific examples of what you want to do (are the placeholders Fields? specifically name bits of text?), I can probably give you a more refined answer that directly targets your problem.

Word provides COM objects that you can use from C#.
Add a reference to the Microsoft Office interop library under the COM tab in the Add Reference dialog.
Also, see this question:
Filling in FIelds in work using C#

I had a situation where I needed to fill out some MS Word forms, so I used something similar to the following code (make sure you reference Microsoft.Office.Interop.Word; I used version 14, but you should adjust it to your own scenario):
// FormData is a custom container type that holds data... you'll have your own.
public static void FillOutForm(FormData data)
{
    var app = new Microsoft.Office.Interop.Word.Application();
    Microsoft.Office.Interop.Word.Document doc = null;
    try
    {
        var filePath = "Your file path.";
        doc = app.Documents.Add(filePath);
        doc.Activate();
        // Loop over the form fields and fill them out.
        foreach (Microsoft.Office.Interop.Word.FormField field in doc.FormFields)
        {
            switch (field.Name)
            {
                // Text field case.
                case "textField1":
                    field.Range.Text = data.SomeText;
                    break;
                // Check box case.
                case "checkBox1":
                    field.CheckBox.Value = data.IsSomethingTrue;
                    break;
                default:
                    // Throw an error or do nothing.
                    break;
            }
        }
        // Save a copy.
        var newFilePath = "Your new file path.";
        doc.SaveAs2(newFilePath);
    }
    catch (Exception e)
    {
        // Perform your error logging and handling here.
    }
    finally
    {
        // Make sure you close things out.
        // I tend not to save over the original form, so I wouldn't save
        // changes to it -- hence the option I chose here.
        // doc is still null if Documents.Add threw, so guard the call.
        if (doc != null)
        {
            doc.Close(
                Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges);
        }
        app.Quit();
    }
}
As you can see, it's really not that hard at all. There are some other options on forms, so you'll have to research them, but the most general ones, the check box and the text box, are the ones I demonstrated here. If you didn't create a form, I suggest going through and making sure that you know all the fields, as that's what you'll need for this.

Prevent Word document's fields from updating when opened

I wrote a utility for another team that recursively goes through folders and converts the Word docs found to PDF by using Word Interop with C#.
The problem we're having is that the documents were created with date fields that update to today's date before they get saved out. I found a method to disable updating fields before printing, but I need to prevent the fields from updating on open.
Is that possible? I'd like to do the fix in C#, but if I have to do a Word macro, I can.

As described in Microsoft's endless maze of documentation you can lock the field code. For example in VBA if I have a single date field in the body in the form of
{DATE \# "M/d/yyyy h:mm:ss am/pm" \* MERGEFORMAT }
I can run
ActiveDocument.Fields(1).Locked = True
Then if I make a change to the document, save, then re-open, the field code will not update.
Example using C# Office Interop:
Word.Application wordApp = new Word.Application();
Word.Document wordDoc = wordApp.ActiveDocument;
wordDoc.Fields.Locked = 1; // it's apparently an Int32 rather than a bool
You can place the code in the DocumentOpen event. I'm assuming you have an add-in which subscribes to the event. If not, clarify, as that can be a battle on its own.
EDIT: In my testing, locking fields in this manner locks them across all StoryRanges, so there is no need to get the field instances in headers, footers, footnotes, textboxes, ..., etc. This is a surprising treat.

Well, I didn't find a way to do it with Interop, but my company did buy Aspose.Words and I wrote a utility to convert the Word docs to TIFF images. The Aspose tool won't update fields unless you explicitly tell it to. Here's a sample of the code I used with Aspose. Keep in mind, I had a requirement to convert the Word docs to single page TIFF images and I hard-coded many of the options because it was just a utility for myself on this project.
private static bool ConvertWordToTiff(string inputFilePath, string outputFilePath)
{
    try
    {
        Document doc = new Document(inputFilePath);
        for (int i = 0; i < doc.PageCount; i++)
        {
            ImageSaveOptions options = new ImageSaveOptions(SaveFormat.Tiff);
            options.PageIndex = i;
            options.PageCount = 1;
            options.TiffCompression = TiffCompression.Lzw;
            options.Resolution = 200;
            options.ImageColorMode = ImageColorMode.BlackAndWhite;
            var extension = Path.GetExtension(outputFilePath);
            var pageNum = String.Format("-{0:000}", (i+1));
            var outputPageFilePath = outputFilePath.Replace(extension, pageNum + extension);
            doc.Save(outputPageFilePath, options);
        }
        return true;
    }
    catch (Exception ex)
    {
        LogError(ex);
        return false;
    }
}

I think a new question on SO is appropriate then, because this will require XML processing rather than just Office Interop. If you have both .doc and .docx file types to convert, you might require two separate solutions: one for WordML (Word 2003 XML format), and another for OpenXML (Word 2007/2010/2013 XML format), since you cannot open the old file format and save as the new without the fields updating.
Inspecting the OOXML of a locked field shows us this w:fldLock="1" attribute. This can be inserted using appropriate XML processing against the document, such as through the OOXML SDK, or through a standard XSLT transform.
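As a hedged sketch of the Open XML SDK route for .docx files: the code below sets w:fldLock on simple fields and on the "begin" marker of complex fields in the main document body. The class and method names are illustrative, it assumes the DocumentFormat.OpenXml package, and it does not touch headers, footers, or other parts.

```csharp
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class FieldLocker
{
    // Marks every field in the main document part as locked so that Word
    // does not refresh it (for example a DATE field) when the file is opened.
    public static void LockBodyFields(string docxPath)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, true))
        {
            Document document = doc.MainDocumentPart.Document;

            // Simple fields: <w:fldSimple w:instr="..." w:fldLock="1">
            foreach (SimpleField field in document.Descendants<SimpleField>())
            {
                field.FieldLock = true;
            }

            // Complex fields: <w:fldChar w:fldCharType="begin" w:fldLock="1"/>
            foreach (FieldChar fieldChar in document.Descendants<FieldChar>())
            {
                if (fieldChar.FieldCharType != null &&
                    fieldChar.FieldCharType.Value == FieldCharValues.Begin)
                {
                    fieldChar.FieldLock = true;
                }
            }

            document.Save();
        }
    }
}
```

Because this edits the package directly, Word never opens the file, so the fields have no chance to update before they are locked.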
Might be helpful: this how-do-i-unlock-a-content-control-using-the-openxml-sdk-in-a-word-2010-document question might be a similar situation, but for Content Controls. You may be able to apply the same solution to Fields, if the Lock and LockingValues types apply the same way to fields. I am not certain of this, however.
To give more confidence that this is the way to do it, see example of this vendor's solution for the problem. If you need to develop this in-house, then openxmldeveloper.org is a good place to start - look for Eric White's examples for manipulating fields such as this.

How do I text wrap to next acrofield in iTextSharp?

How do I get text to wrap from one AcroField to the next? I have an Adobe PDF document our client gave us. It has AcroFields stacked one atop another (all with the same name). They want the text to wrap from one to the next when it reaches the end of the line. None of the other examples I've seen deal with filling in AcroFields that wrap. Please help!
// loop through disabilities and display them
foreach (var disability in formNature.Disabilities)
{
fields.SetField("EVALUATION", disability.PrimaryDisabilityName + "; ");
}
In theory, this should loop through all the disabilities they entered on the web form and display them one after another, wrapping the text when it reaches the end of each line. Instead, it only displays one item in the field.

This isn't a complete answer unfortunately.
First, when you call SetField() you are erasing the current contents of the field and replacing it with your new value. When done in a loop only the last value will ever be stored then. What you need to do is loop through each value and concatenate them into one big string.
string buf = "";
foreach (var disability in formNature.Disabilities)
{
    buf += disability.PrimaryDisabilityName + "; ";
}
buf = buf.Trim();
Second, the PDF standard to the best of my knowledge does not support chaining of fields for overflow which is what you are looking for. The only way that I know of to accomplish what you are trying is to actually measure the strings and compare them to the widths of the fields and truncate them as needed. To do this you will need to find the font used for the given field, create a BaseFont from it and use that to Measure the string. Then compare that with the field's rectangle and use only the characters that "fit" into that field. Repeat as needed.
That all said, I would really really recommend that you just edit the PDF and replace the multiple fields with one large field that supports multiple lines. Your life will be much, much easier.
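If editing the PDF is not an option, the measure-and-split approach described above could look roughly like this. It is a sketch under assumptions: the width measurement is passed in as a delegate (with iTextSharp this could wrap BaseFont.GetWidthPoint(text, fontSize)), the fields are assumed to share one width, and they would need distinct names (the "EVALUATION_1" style names in the usage note are hypothetical) so each chunk can be set separately.

```csharp
using System;
using System.Collections.Generic;

static class FieldTextSplitter
{
    // Breaks text at spaces into at most fieldCount chunks whose measured
    // width fits fieldWidth. Text that does not fit in the last field is dropped.
    public static List<string> Split(string text, int fieldCount, float fieldWidth, Func<string, float> measure)
    {
        var lines = new List<string>();
        string current = "";
        foreach (string word in text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string candidate = current.Length == 0 ? word : current + " " + word;
            if (current.Length == 0 || measure(candidate) <= fieldWidth)
            {
                // A single word wider than the field still gets a line of its own.
                current = candidate;
            }
            else
            {
                lines.Add(current);
                if (lines.Count == fieldCount)
                {
                    return lines; // out of fields; the rest is truncated
                }
                current = word;
            }
        }
        if (current.Length > 0 && lines.Count < fieldCount)
        {
            lines.Add(current);
        }
        return lines;
    }
}
```

Each returned chunk would then go into its own field, for example fields.SetField("EVALUATION_" + (i + 1), lines[i]) in a loop over the list.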

OpenXML: Anyway to see if a Word Document fits one page

While I doubt it, if I open up a word document using OpenXML sdk in C# and add some info, is there any way for me to see if it still fits one page?
If it doesn't, I want to reduce the font size on specific items I added until it fits.
I could write this algorithm if I had the current size in relation to page size with margins and all that.

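I ran across this example on another site. I don't know if it'll work in your case, as it requires the Office PIA...
var app = new Word.Application(); // Word = Microsoft.Office.Interop.Word
var doc = app.Documents.Open("path/to/file");
doc.Repaginate();
// BuiltInDocumentProperties is late-bound (dynamic when interop types are embedded), so cast the value.
var pageNumber = (int)doc.BuiltInDocumentProperties["Number of Pages"].Value;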
