How do I use C# to fill out a Word document?

I have a Word document, letter.docx, that is a letter I intend to mail to hundreds of people for a party. The letter is already composed and has been formatted in its own special way with varying type sizes and fonts. It's set and ready to go, with placeholders where I have to fill out variables that change like Name, Address, phone number, etc.
Now, I would like to write a C# program where a user can type in variable things like Name, Address, etc., into a form, hit a button, and produce letter.docx with the right information filled in at the right places.
I understand Word has features that allow you to do this, but I really want to do this in C#.

Of course you can do it. Add a reference to Microsoft.Office.Interop.Word to your project.
First, bookmark every field you want to update, using the Insert tab in Word (e.g. the name placeholder is bookmarked as 'name_field'). Then add the following to your C# code:
Microsoft.Office.Interop.Word.Application wordApp = new Microsoft.Office.Interop.Word.Application();
wordApp.Visible = true;
Microsoft.Office.Interop.Word.Document wordDoc = wordApp.Documents.Open(@"C:\test.docx");
Microsoft.Office.Interop.Word.Bookmark bkm = wordDoc.Bookmarks["name_field"];
Microsoft.Office.Interop.Word.Range rng = bkm.Range;
rng.Text = "Adams Laura"; // Get the value from anywhere
Remember to save and close the document properly. (You can see this)
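For a letter with several placeholders, the same bookmark approach can be wrapped in one method that also handles saving and closing. This is only a sketch under assumptions: the class name LetterFiller, the method name, both paths and the bookmark names are hypothetical, and it needs Word installed plus a reference to Microsoft.Office.Interop.Word.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class LetterFiller
{
    // Writes each value into the bookmark of the same name, then saves
    // the result under a new file name so the template is not overwritten.
    // templatePath, outputPath and the bookmark names are hypothetical.
    public static void Fill(string templatePath, string outputPath,
                            IDictionary<string, string> values)
    {
        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(templatePath);
            foreach (KeyValuePair<string, string> pair in values)
            {
                // Skip names that are not bookmarked in this document.
                if (!doc.Bookmarks.Exists(pair.Key))
                {
                    continue;
                }
                doc.Bookmarks[pair.Key].Range.Text = pair.Value;
            }
            doc.SaveAs2(outputPath);
        }
        finally
        {
            // Close without saving again, then shut down the Word instance.
            if (doc != null)
            {
                doc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
            }
            app.Quit();
        }
    }
}
```

A call might look like LetterFiller.Fill(@"C:\letter.docx", @"C:\letter-out.docx", values), where values maps "name_field" to the recipient's name. Note that assigning Range.Text replaces the bookmarked text and usually removes the bookmark itself, which is fine for a one-shot fill.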

I don't know of anything built into the language, but the example here seems to do exactly what you want.
If you can provide specific examples of what you want to do (are the placeholders Fields? specifically name bits of text?), I can probably give you a more refined answer that directly targets your problem.

Word provides COM objects that you can use from C#.
Add a reference to the Microsoft Office interop library under the COM tab in the Add Reference dialog.
Also, see this question:
Filling in FIelds in work using C#

I had a situation where I needed to fill out some MS Word forms, so I used something similar to the following code (make sure you reference Microsoft.Office.Interop.Word; I used version 14, but you should adjust it to your own scenario):
// FormData is a custom container type that holds data... you'll have your own.
public static void FillOutForm(FormData data)
{
    var app = new Microsoft.Office.Interop.Word.Application();
    Microsoft.Office.Interop.Word.Document doc = null;
    try
    {
        var filePath = "Your file path.";
        // Documents.Add creates a new document based on the file,
        // so the original form is never modified.
        doc = app.Documents.Add(filePath);
        doc.Activate();
        // Loop over the form fields and fill them out.
        foreach (Microsoft.Office.Interop.Word.FormField field in doc.FormFields)
        {
            switch (field.Name)
            {
                // Text field case.
                case "textField1":
                    field.Range.Text = data.SomeText;
                    break;
                // Check box case.
                case "checkBox1":
                    field.CheckBox.Value = data.IsSomethingTrue;
                    break;
                default:
                    // Throw an error or do nothing.
                    break;
            }
        }
        // Save a copy.
        var newFilePath = "Your new file path.";
        doc.SaveAs2(newFilePath);
    }
    catch (Exception e)
    {
        // Perform your error logging and handling here.
    }
    finally
    {
        // Make sure you close things out.
        // I tend not to save over the original form, so I wouldn't save
        // changes to it -- hence the option I chose here.
        // doc is still null if Documents.Add threw, so check before closing.
        if (doc != null)
        {
            doc.Close(
                Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges);
        }
        app.Quit();
    }
}
As you can see, it's really not that hard at all. There are some other options on forms, so you'll have to research them, but the most general ones, the check box and the text box, are the ones I demonstrated here. If you didn't create a form, I suggest going through and making sure that you know all the fields, as that's what you'll need for this.

Related

Find a range of text with specific formatting with Word interop

I have a MS Word add-in that needs to extract text from a range of text based solely on its formatting: in my case in particular, if the text is underlined or struck through, the range of characters/words that are underlined or struck through need to be found so that I can keep track of them.
My first idea was to use Range.Find, as is outlined here, but that won't work when I have no idea what the string is that I'm looking for:
var rng = doc.Range(someStartRange, someEndRange);
rng.Find.Forward = true;
rng.Find.Format = true;
// I removed this line in favor of putting it inside Execute()
//rng.Find.Text = "";
rng.Find.Font.Underline = WdUnderline.wdUnderlineSingle;
// this works
rng.Find.Execute("");
int foundNumber = 0;
while (rng.Find.Found)
{
foundNumber++;
// this needed to be added as well, as per the link above
rng.Find.Execute("");
}
MessageBox.Show("Underlined strings found: " + foundNumber.ToString());
I would happily parse the text myself, but am not sure how to do this while still knowing the formatting. Thanks in advance for any ideas.
EDIT:
I changed my code to fix the find underline issue, and with that change the while loop never terminates. More specifically, rng.Find.Found finds the underlined text, but it finds the same text over and over, and never terminates.
EDIT 2:
Once I added the additional Execute() call inside the while loop, the find functioned as needed.

You need
rng.Find.Font.Underline = WdUnderline.wdUnderlineSingle;
(At the moment you are setting the formatting for the specified rng, rather than the formatting for the Find)

C# Editing Word file (setting choice-fields)

I'm using the function below to fill data into a .doc file through bookmarks, and it works well.
public void findAndReplace(Word.Document doc, object bookmark, object replaceWith)
{
Word.Range rng = doc.Bookmarks.get_Item(ref bookmark).Range;
rng.Text = replaceWith.ToString();
object oRng = rng;
doc.Bookmarks.Add(bookmark.ToString(), ref oRng);
}
I have a problem with setting the values of choice fields in the Word file.
My question is: is it even possible to set this kind of data from my C# application? Is there a method to generate selected or unselected fields, like for example in point 21 in the link below, or can I only set their values?
http://www.fotosik.pl/pokaz_obrazek/c2409113d79afeba.html
And a last question: is it possible and reasonable to generate the whole report from a .doc file?
I'm looking for a solution that helps me generate my declaration completely from C#.

Yes that is possible by programmatically adding Content Controls.
This is how you would add a Check Box Content Control:
this.Paragraphs[1].Range.InsertParagraphBefore();
this.Paragraphs[1].Range.Select();
Microsoft.Office.Tools.Word.ContentControl checkBoxControl1 =
this.Controls.AddContentControl("checkBoxControl1", Word.WdContentControlType.wdContentControlCheckBox);
checkBoxControl1.Checked = true;
You can generate entire documents from C#. I would suggest this tutorial from MSDN about creating templates by using content controls.

Prevent Word document's fields from updating when opened

I wrote a utility for another team that recursively goes through folders and converts the Word docs found to PDF by using Word Interop with C#.
The problem we're having is that the documents were created with date fields that update to today's date before they get saved out. I found a method to disable updating fields before printing, but I need to prevent the fields from updating on open.
Is that possible? I'd like to do the fix in C#, but if I have to do a Word macro, I can.

As described in Microsoft's endless maze of documentation you can lock the field code. For example in VBA if I have a single date field in the body in the form of
{DATE \# "M/d/yyyy h:mm:ss am/pm" \* MERGEFORMAT }
I can run
ActiveDocument.Fields(1).Locked = True
Then if I make a change to the document, save, then re-open, the field code will not update.
Example using C# Office Interop:
Word.Application wordApp = new Word.Application();
// ActiveDocument is only available once a document is open in this Word instance.
Word.Document wordDoc = wordApp.ActiveDocument;
wordDoc.Fields.Locked = 1; // it's apparently an Int32 rather than a bool
You can place the code in the DocumentOpen event. I'm assuming you have an add-in which subscribes to the event. If not, clarify, as that can be a battle on its own.
EDIT: In my testing, locking fields in this manner locks them across all StoryRanges, so there is no need to get the field instances in headers, footers, footnotes, textboxes, ..., etc. This is a surprising treat.
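If only the date fields should be frozen, the per-field form from the VBA line above can be applied selectively. A minimal sketch, assuming an already opened Word.Document; the class and method names are hypothetical, and doc.Fields here is assumed to cover only the fields the collection exposes for the main document.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class DateFieldLocker
{
    // Locks every DATE field in doc.Fields and returns how many were locked.
    // Other field types (page numbers, references, ...) keep updating.
    public static int LockDateFields(Word.Document doc)
    {
        int locked = 0;
        Word.Fields fields = doc.Fields;
        // Office collections are 1-based.
        for (int i = 1; i <= fields.Count; i++)
        {
            Word.Field field = fields[i];
            if (field.Type == Word.WdFieldType.wdFieldDate)
            {
                field.Locked = true;
                locked++;
            }
        }
        return locked;
    }
}
```

Unlike Fields.Locked on the collection, Field.Locked on a single field is a bool. The document still has to be saved after locking for the change to persist.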

Well, I didn't find a way to do it with Interop, but my company did buy Aspose.Words and I wrote a utility to convert the Word docs to TIFF images. The Aspose tool won't update fields unless you explicitly tell it to. Here's a sample of the code I used with Aspose. Keep in mind, I had a requirement to convert the Word docs to single page TIFF images and I hard-coded many of the options because it was just a utility for myself on this project.
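private static bool ConvertWordToTiff(string inputFilePath, string outputFilePath)
{
    try
    {
        Document doc = new Document(inputFilePath);
        // Split the output path once; String.Replace on the whole path would also
        // hit the extension text elsewhere in the path and throws on an empty extension.
        var directory = Path.GetDirectoryName(outputFilePath) ?? String.Empty;
        var baseName = Path.GetFileNameWithoutExtension(outputFilePath);
        var extension = Path.GetExtension(outputFilePath);
        for (int i = 0; i < doc.PageCount; i++)
        {
            // One single-page TIFF per document page.
            ImageSaveOptions options = new ImageSaveOptions(SaveFormat.Tiff);
            options.PageIndex = i;
            options.PageCount = 1;
            options.TiffCompression = TiffCompression.Lzw;
            options.Resolution = 200;
            options.ImageColorMode = ImageColorMode.BlackAndWhite;
            // Page files are named like "name-001.tiff", "name-002.tiff", ...
            var pageNum = String.Format("-{0:000}", i + 1);
            var outputPageFilePath = Path.Combine(directory, baseName + pageNum + extension);
            doc.Save(outputPageFilePath, options);
        }
        return true;
    }
    catch (Exception ex)
    {
        LogError(ex);
        return false;
    }
}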

I think a new question on SO is appropriate then, because this will require XML processing rather than just Office Interop. If you have both .doc and .docx file types to convert, you might require two separate solutions: one for WordML (Word 2003 XML format), and another for OpenXML (Word 2007/2010/2013 XML format), since you cannot open the old file format and save as the new without the fields updating.
Inspecting the OOXML of a locked field shows us this w:fldLock="1" attribute. This can be inserted using appropriate XML processing against the document, such as through the OOXML SDK, or through a standard XSLT transform.
Might be helpful: this how-do-i-unlock-a-content-control-using-the-openxml-sdk-in-a-word-2010-document question might be a similar situation, but for Content Controls. You may be able to apply the same solution to Fields, if the Lock and LockingValues types apply the same way to fields. I am not certain of this, however.
To give more confidence that this is the way to do it, see example of this vendor's solution for the problem. If you need to develop this in-house, then openxmldeveloper.org is a good place to start - look for Eric White's examples for manipulating fields such as this.
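The w:fldLock idea can be sketched with plain LINQ to XML instead of the OOXML SDK. This is an illustration under assumptions, not the vendor's or the SDK's method: it takes the XML of one document part (for example word/document.xml, already extracted from the .docx package) and marks simple fields and the begin marker of complex fields as locked. The class and method names are hypothetical, and reading or writing the package itself is left out.

```csharp
using System.Linq;
using System.Xml.Linq;

static class FieldLocker
{
    // WordprocessingML main namespace (the "w:" prefix).
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Adds w:fldLock="1" to every w:fldSimple and to every w:fldChar
    // that begins a complex field. Returns the number of fields marked.
    public static int LockAllFields(XDocument part)
    {
        var fields = part.Descendants()
            .Where(e => e.Name == W + "fldSimple" ||
                        (e.Name == W + "fldChar" &&
                         (string)e.Attribute(W + "fldCharType") == "begin"))
            .ToList();

        foreach (XElement field in fields)
        {
            field.SetAttributeValue(W + "fldLock", "1");
        }
        return fields.Count;
    }
}
```

Headers, footers and footnotes live in separate parts of the package, so each of those parts would need the same treatment.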

C# Word Interop - Spell Checking in a Certain Language

For a customer of mine I need to force the spell checking in a certain language.
I have explored the MSDN documentation and found that when calling the CheckSpelling() method in the active document, it will invoke the spelling check. This method has parameters for custom dictionaries.
My problem is that I can't find anything about those dictionaries or how to use them.
Also there is still the possibility that there is of course another way to do this.
Can anybody boost me in the right direction?

Found my solution:
foreach (Range range in activeDocument.Words)
{
range.LanguageID = WdLanguageID.wdFrenchLuxembourg;
}
Edit after comment
Since my active document is held in a variable, I seem to lose the static Range property. I found a workaround by doing the following (lan is the variable where I keep my WdLanguageID):
object start = activeDocument.Content.Start;
object end = activeDocument.Content.End;
activeDocument.Range(ref start, ref end).LanguageID = lan;
Thanks @Adrianno for all the help!

The Spell Checker uses the language of the text to select rules and dictionaries (look here to check how it works).
You have to set the text language to what you need and then SC will use that language. Follow this link for more details:
http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.language.aspx

I have been working with this lately and thought I would add a bit to the already given answers.
To get a list of spelling errors in the document for a certain language, doing the following would get you going:
// Set the proofing language
myDocument.Content.LanguageID = WdLanguageID.wdDanish;
// Get the spelling errors (returns a ProofreadingErrors collection)
var errors = myDocument.SpellingErrors;
// There is no "ProofreadingError" object -> errors are accessed as Ranges
foreach (Range proofreadingError in errors)
Console.WriteLine(proofreadingError.Text);
As pointed out by Adriano, the key is to specify the language of the document content at first, and then you can access the spelling errors for the given language. I have tested this (Word Interop API version 15, Office 2013), and it works.
If you want to get suggestions for each of the misspelled words as well, I suggest you take a look at my previous answer to that issue: https://stackoverflow.com/a/14202099/700926
In that answer I provide sample code as well as links to relevant documentation for how that is done. In particular, the sample covers how to carry out spell checking of a given word in a certain language (of your choice) using Word Interop. The sample also covers how to access the suggestions returned by Word.
Finally, I have a couple of notes:
In contrast to the current accepted answer (your own), this approach is much faster since it does not have to iterate through each word. I have been working with Word Interop for reports (100+ pages) and trust me, you don't want to sit and wait for that iteration to finish.
Information regarding the SpellingErrors property can be found here.
Information regarding the non-existence of a ProofreadingError object can be found here.

Never use foreach statements when accessing Office objects. Most Office objects are COM objects, and using foreach leads to memory leaks.
The following is a piece of working code
Microsoft.Office.Interop.Word.ProofreadingErrors errorCollection = null;
try
{
errorCollection = Globals.ThisAddIn.Application.ActiveDocument.SpellingErrors;
// Indexes start at 1 in Office objects
for (int i = 1; i <= errorCollection.Count; i++)
{
int start = errorCollection[i].Start;
int end = errorCollection[i].End;
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
finally
{
// Release the COM objects here
// as finally shall be always called
if (errorCollection != null)
{
Marshal.ReleaseComObject(errorCollection);
errorCollection = null;
}
}

Copy Form Fields From One PDF to Another

I have a situation where I need to copy all of the form fields from one PDF to another. The purpose is to automate the overlaying of the fields when small edits are made to the underlying Word pages.
I've been using the trial version of Aspose.Pdf.Kit, and I'm able to copy everything but radio buttons to a new form. However, Aspose doesn't support copying the radio buttons, which completely nullifies its usefulness, not to mention their customer support has been subpar.
In any event, I'm looking for some sort of library or plug-in that does support copying all types of form fields.
Does anyone have any ideas?
Thanks,
~DJ

Yes, it is possible. No, setField() won't do the trick... madisonw's code will copy the field values, but not the fields themselves.
OTOH, it really isn't that hard.
Something like:
PdfReader currentReader = new PdfReader( CURRENT_PDF_PATH ); // throws
PdfReader pdfFromWord = new PdfReader( TWEAKED_PDF_FROM_WORD_PATH ); // throws
PdfStamper stamper = new PdfStamper( currentReader , outputFile ); //throws
for( int i = 1; i <= currentReader.getNumberOfPages(); ++i) {
    stamper.replacePage( pdfFromWord, i, i );
}
stamper.close(); // throws
I'm ignoring a bunch of exceptions, and am writing in Java, but C# should look virtually identical.
Also, this code ignores the case where someone ADDS A PAGE... which would get quite thorny. Was it added before or after the pages with fields on them? Did those pages reflow at all, requiring you to move the fields? At that point you really need a manual process with Acrobat Pro.

I agree with Oded, iTextSharp should be able to do the job. I've used code similar to the following snippet and never had problems with any field types. I'm sure there must have been a radio button in the mix.
private void CopyFields(PdfStamper targetFile, PdfReader sourceFile)
{
    // For every field in the target, copy the value of the same-named source field.
    foreach (DictionaryEntry de in targetFile.AcroFields.Fields)
    {
        string fieldName = de.Key.ToString();
        targetFile.AcroFields.SetField(fieldName, sourceFile.AcroFields.GetField(fieldName));
    }
}
