I'm trying to modify a hyperlink inside a word document. The hyperlink is originally pointing to a bookmark in a external document. What I want to do is change it to point to an internal bookmark that as the same Anchor.
Here's the code I use... It seem to work when I what the variables but when I look at the saved document it is exactly like the original.
What is the reason my chances don't persist?
// read file specified in stream
MemoryStream stream = new MemoryStream(File.ReadAllBytes("C:\\TEMPO\\smartbook\\text1.docx"));
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
MainDocumentPart mainPart = doc.MainDocumentPart;
// The first hyperlink -- it happens to be the one I want to modify
Hyperlink hLink = mainPart.Document.Body.Descendants<Hyperlink>().FirstOrDefault();
if (hLink != null)
{
// get hyperlink's relation Id (where path stores)
string relationId = hLink.Id;
if (relationId != string.Empty)
{
// get current relation
HyperlinkRelationship hr = mainPart.HyperlinkRelationships.Where(a => a.Id == relationId).FirstOrDefault();
if (hr != null)
{
// remove current relation
mainPart.DeleteReferenceRelationship(hr);
// add new relation with relation
// mainPart.AddHyperlinkRelationship(new Uri("C:\\TEMPO\\smartbook\\test.docx"), false, relationId);
}
}
// change hyperlink attributes
hLink.DocLocation = "#";
hLink.Id = "";
hLink.Anchor = "TEST";
}
// save stream to a new file
File.WriteAllBytes("C:\\TEMPO\\smartbook\\test.docx", stream.ToArray());
doc.Close();
You have not yet saved your OpenXmlPackage when you write the stream ...
// types that implement IDisposable are better wrapped in a using statement
using(var stream = new MemoryStream(File.ReadAllBytes(#"C:\TEMPO\smartbook\text1.docx")))
{
using(var doc = WordprocessingDocument.Open(stream, true))
{
// do all your changes
// call doc.Close because that SAVES your changes to the stream
doc.Close();
}
// save stream to a new file
File.WriteAllBytes(#"C:\TEMPO\smartbook\test.docx", stream.ToArray());
}
The Close method states explicitly:
Saves and closes the OpenXml package plus all underlying part streams.
You could also set the AutoSave property to true in which case the OpenXMLPackage will be saved when Dispose is called. The using statement I used above would guarantee that that will happen.
Related
I'm really having trouble in editing bookmarks in a Word template using Document.Format.OpenXML and then saving it to a new PDF file.
I cannot use Microsoft.Word.Interop as it gives a COM error on the server.
My code is this:
public static void CreateWordDoc(string templatePath, string destinationPath, Dictionary<string, dynamic> dictionary)
{
byte[] byteArray = File.ReadAllBytes(templatePath);
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(stream, true))
{
var bookmarks = (from bm in wordDoc.MainDocumentPart.Document.Body.Descendants<BookmarkStart>()
select bm).ToList();
foreach (BookmarkStart mark in bookmarks)
{
if (mark.Name != "Table" && mark.Name != "_GoBack")
{
UpdateBookmark(dictionary, mark);//Not doing anything
}
else if (mark.Name != "Table")
{
// CreateTable(dictionary, wordDoc, mark);
}
}
File.WriteAllBytes("D:\\RohitDocs\\newfile_rohitsingh.docx", stream.ToArray());
wordDoc.Close();
}
// Save the file with the new name
}
}
private static void UpdateBookmark(Dictionary<string, dynamic> dictionary, BookmarkStart mark)
{
string name = mark.Name;
string value = dictionary[name];
Run run = new Run(new Text(value));
RunProperties props = new RunProperties();
props.AppendChild(new FontSize() { Val = "20" });
run.RunProperties = props;
var paragraph = new DocumentFormat.OpenXml.Wordprocessing.Paragraph(run);
mark.Parent.InsertAfterSelf(paragraph);
paragraph.PreviousSibling().Remove();
mark.Remove();
}
I was trying to replace bookmarks with my text but the UpdateBookmark method doesn't work. I'm writing stream and saving it because I thought if bookmarks are replaced then I can save it to another file.
I think you want to make sure that when you reference mark.Parent that you are getting the correct instance that you are expecting.
Once you get a reference to the correct Paragraph element where your content should go, use the following code to add/swap the run.
// assuming you have a reference to a paragraph called "p"
p.AppendChild<Run>(new Run(new Text(content)) { RunProperties = props });
// and here is some code to remove a run
p.RemoveChild<Run>(run);
To answers the second part of your question, when I did a similar project a few years ago we used iTextSharp to create PDFs from Docx. It worked very well and the API was easy to grok. We even added password encryption and embedded watermarks to the PDFs.
I am working on a project that requires all SQL connection and query information to be stored in XML files. To make my project configurable, I am trying to create a means to let the user configure his sql connection string information (datasource, catalog, username and password) via a series of text boxes. This input will then be saved to the appropriate node within the SQL document.
I can get the current information from the XML file, and display that information within text boxes for the user's review and correction, but I'm encountering an error when it comes time to save the changes.
Here is the code I'm using to update and save the xml document.
protected void submitBtn_Click(object sender, EventArgs e)
{
SPFile file = methods.web.GetFile("MyXMLFile.xml");
myDoc = new XmlDocument();
byte[] bites = file.OpenBinary();
Stream strm1 = new MemoryStream(bites);
myDoc.Load(strm1);
XmlNode node;
node = myDoc.DocumentElement;
foreach (XmlNode node1 in node.ChildNodes)
{
foreach (XmlNode node2 in node1.ChildNodes)
{
if (node2.Name == "name1")
{
if (node2.InnerText != box1.Text)
{
}
}
if (node2.Name == "name2")
{
if (node2.InnerText != box2.Text)
{
}
}
if (node2.Name == "name3")
{
if (node2.InnerText != box3.Text)
{
node2.InnerText = box3.Text;
}
}
if (node2.Name == "name4")
{
if (node2.InnerText != box4.Text)
{
}
}
}
}
myDoc.Save(strm1);
}
Most of the conditionals are empty at this point because I'm still testing.
The code works great until the last line, as I said. At that point, I get the error "Memory Stream is not expandable." I understand that using a memory stream to update a stored file is incorrect, but I can't figure out the right way to do this.
I've tried to implement the solution given in the similar question at Memory stream is not expandable but that situation is different from mine and so the implementation makes no sense to me. Any clarification would be greatly appreciated.
Using the MemoryStream constructor that takes a byte array as an argument creates a non-resizable instance of a MemoryStream. Since you are making changes to the file (and therefore the underlying bytes), you need a resizable MemoryStream. This can be accomplished by using the parameterless constructor of the MemoryStream class and writing the byte array into the MemoryStream.
Try this:
SPFile file = methods.web.GetFile("MyXMLFile.xml");
myDoc = new XmlDocument();
byte[] bites = file.OpenBinary();
using(MemoryStream strm1 = new MemoryStream()){
strm1.Write(bites, 0, (int)bites.Length);
strm1.Position = 0;
myDoc.Load(strm1);
// all of your edits to the file here
strm1.Position = 0;
// save the file back to disk
using(var fs = new FileStream("FILEPATH",FileMode.Create,FileAccess.ReadWrite)){
myDoc.Save(fs);
}
}
To get the FILEPATH for a Sharepoint file, it'd be something along these lines (I don't have a Sharepoint development environment set up right now):
SPFile file = methods.web.GetFile("MyXMLFile.xml")
var filepath = file.ParentFolder.ServerRelativeUrl + "\\" + file.Name;
Or it might be easier to just use the SaveBinary method of the SPFile class like this:
// same code from above
// all of your edits to the file here
strm1.Position = 0;
// don't use a FileStream, just SaveBinary
file.SaveBinary(strm1);
I didn't test this code, but I've used it in Sharepoint solutions to modify XML (mainly OpenXML) documents in Sharepoint lists. Read this blogpost for more information
You could look into using the XDocument class instead of XmlDocument class.
http://msdn.microsoft.com/en-us/library/system.xml.linq.xdocument.aspx
I prefer it because of the simplicity and it eliminates having to use Memory Stream.
Edit: You can append to the file like this:
XDocument doc = XDocument.Load('filePath');
doc.Root.Add(
new XElement("An Element Name",
new XAttribute("An Attribute", "Some Value"),
new XElement("Nested Element", "Inner Text"))
);
doc.Save(filePath);
Or you can search for an element and update like this:
doc.Root.Elements("The element").First(m =>
m.Attribute("An Attribute").Value == "Some value to match").SetElementValue(
"The element to change", "Value to set element to");
doc.Save('filePath');
I have a docx Word document that contains Content Controls bound to data in a CustomXMLPart.
This document (or bookmarks therein) is then included in another Word document by using INCLUDETEXT.
When the first document is included into the second is there any way of getting the CustomXMLPart from the original document (I already have a VSTO Word Addin running in Word looking at the document)?
What I want to do is merge it with the CustomXMLParts already present in the second document so that the Content Controls are still bound to the data in the XMLPart.
Alternatively, is there another way to do this without using the INCLUDETEXT field?
I decided this probably wasn't possible using VSTO and IncludeText fields and investigated using altChunks as an alternative.
I was already doing some processing on the file using the Open XML SDK 2 before opening it so could so the extra work required to merge the document together there.
Although using the altChunk method embeds the whole second document in the first, including its own CustomXmlParts, the CustomXmlParts are discarded by Word when the document is opened and the second merged with the first.
I ended up with code similar to the following. It replaces defined Content Controls with altChunk data and merges specific CustomXmlParts together.
private static void CreateAltChunksInWordDocument(WordprocessingDocument doc, string externalDocumentPath)
{
foreach (var control in doc.ContentControls().ToList()) //Have to do .ToList() on this as when we update the Doc in the loop it stops enumerating otherwise
{
SdtProperties props = control.Elements<SdtProperties>().FirstOrDefault();
if (props == null)
continue;
SdtAlias alias = props.Elements<SdtAlias>().FirstOrDefault();
if (alias == null || !alias.Val.HasValue || alias.Val.Value != "External Template")
continue;
using (WordprocessingDocument externaldoc = WordprocessingDocument.Open(externalDocumentPath, false))
{
//Replace the Content Control with an AltChunk section, and stream in the external file
string altChunkId = "AltChunkId" + Guid.NewGuid().ToString().Replace("{", "").Replace("}", "").Replace("-", "");
AlternativeFormatImportPart chunk = doc.MainDocumentPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
chunk.FeedData(File.OpenRead(externalDocumentPath));
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
OpenXmlElement parent = control.Parent;
parent.InsertAfter(altChunk, control);
control.Remove();
XDocument xDocMain;
CustomXmlPart partMain = MyCommon.GetMyXmlPart(doc.MainDocumentPart, out xDocMain);
XDocument xDocExternal;
CustomXmlPart partExternal = MyCommon.GetMyXmlPart(externaldoc.MainDocumentPart, out xDocExternal);
if (xDocMain != null && partMain != null && xDocExternal != null && partExternal != null)
{
MyCommon.MergeXmlPartFields(xDocMain, xDocExternal);
//Save the updated part
using (Stream outputStream = partMain.GetStream())
{
using (StreamWriter ts = new StreamWriter(outputStream))
{
ts.Write(xDocMain.ToString());
}
}
}
}
}
}
I have an application that uses itextsharp to fill PDF form fields.
I got new requirement from the customer to allow underlining the fields values.
I have read many posts including answers to questions in this site but I could't figure out a way to do it.
Current my code does the following:
Creates a PDFStamper instance
Get the form fields using stamper.AcroFields property
Set the field value using the AcroFields.SetFieldRichValue() method.
But when I am opening the PDF the field is empty.
I verified that the field is set as rich text in the PDF itself.
Any idea what I am doing wrong ?
Here is a snnipest of my code:
FileStream stream = File.Open(targetFile, FileMode.Create);
var pdfStamper = new PdfStamper(new PdfReader(sourceFile), stream);
// Iterate the fields in the PDF
foreach (var fieldName in pdfStamper.AcroFields.Fields.Keys)
{
// Get the field value of the current field
var fieldValue = "<?xml version=\"1.0\"?><body xmlns=\"http://www.w3.org/1999/xtml\" xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\" xfa:contentType=\"text/html\" xfa:APIVersion=\"Acrobat:8.0.0\" xfa:spec=\"2.4\"><p style=\"text-align:left\"><b><i>Here is some bold italic text</i></b></p><p style= \"font-size:16pt\">This text uses default text state parameters but changes the font size to 16.</p></body>"
// Set the field value
if (String.IsNullOrEmpty(fieldValue) == false)
{
pdfStamper.AcroFields.SetFieldRichValue(key, fieldValue);
}
}
Edit:
I have revised my code based on Mark Storer's post to a question (http://stackoverflow.com/questions/1454701/adding-rich-text-to-an-acrofield-in-itextsharp). The new code is:
// Create reader to read the source file
var reader = new PdfReader(sourceFile);
// Create a stream for the generated file
var stream = File.Open(targetFile, FileMode.Create);
// Create stamper to generate the new file
var pdfStamper = new PdfStamper(reader, stream);
// Field name and value
var fieldName = "myfield";
var fieldValue = "<?xml version=\"1.0\"?><body xfa:APIVersion=\"Acroform:2.7.0.0\" xfa:spec=\"2.1\" xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\"><p dir=\"ltr\" style=\"margin-top:0pt;margin-bottom:0pt;font-family:Helvetica;font-size:12pt\"><b>write line1 bold</b></p</body>";
// Output stream for the temporary file that should contain the apearance of the field
var msOutput = new FileStream(#"d:\temp.pdf", FileMode.Create);
// string reader to read the field value
var textReader = new StringReader(fieldValue);
// Create new document
var document = new Document(pdfStamper.AcroFields.GetFieldPositions(fieldName)[0].position);
// writer for the new doucment
var writer = PdfWriter.GetInstance(document, msOutput);
// Open the document
document.Open();
// Get elements to append to the doucment
var list = HTMLWorker.ParseToList(textReader, null);
// Append elements to the doucment
foreach (var element in list)
{
document.Add(element);
}
// close the documnet
document.Close();
// Append push button that contains the generated content as its apearance
// this approach is based on the suggestion from Mark storer that can be found in:
// http://stackoverflow.com/questions/1454701/adding-rich-text-to-an-acrofield-in-itextsharp
var button = new PushbuttonField(pdfStamper.Writer, pdfStamper.AcroFields.GetFieldPositions(fieldName)[0].position, fieldName + "_")
{
Layout = PushbuttonField.LAYOUT_ICON_ONLY,
BackgroundColor = null,
Template = pdfStamper.Writer.GetImportedPage(new PdfReader(targetFile, 1)
};
pdfStamper.AddAnnotation(button.Field, 1);
pdfStamper.FormFlattening = true;
pdfStamper.Close();
pdfStamper.Dispose();
But the problem now is that the temporary document contains no content....
Any ideas ?
There's a lot of code above, almost 400 lines. In the future, try to distill everything down to a very simple and reproducible test case.
When using SetFieldRichValue you also need to set the PdfStamper's AcroFields.GenerateAppearances property to false:
stamper.AcroFields.GenerateAppearances = false;
You can read more about this here along with some caveats and other workarounds.
I managed to resolve it with little trick - I used Anchor (hyperling) element inside ColumnText element and position it above the form field. The anchor is displayed with underline by default. In order to avoid the hand marker when the user hover the mouse on the anchor, I set the "Reference" property of the anchor to null.
Here is the code I used:
// Create reader
var reader = new PdfReader(sourceFilePathAndName);
// Create stream for the target file
var stream = File.Open(targetFilePathAndName, FileMode.Create);
// Create the stamper
var pdfStamper = new PdfStamper(reader, stream);
const string fieldName = "MyField";
// Get the position of the field
var targetPosition = pdfStamper.AcroFields.GetFieldPositions(fieldName)[0].position;
// Set the font
var fontNormal = FontFactory.GetFont("Arial", 16, Font.UNDERLINE, BaseColor.BLACK);
// Create the anchor
var url = new Anchor("some default text", fontNormal) { Reference = null };
// Create the element that will contain the anchor and allow to position it anywhere on the document
var data = new ColumnText(pdfStamper.GetOverContent(1));
// Add the anchor to its container
data.SetSimpleColumn(url, targetPosition.Left, targetPosition.Bottom, targetPosition.Right, targetPosition.Top, 0, 0);
// Write the content to the document
data.Go();
pdfStamper.FormFlattening = true;
pdfStamper.Close();
pdfStamper.Dispose();
I'm developing a web application with asp.net and I have a file called Template.docx that works like a template to generate other reports. Inside this Template.docx I have some MergeFields (Title, CustomerName, Content, Footer, etc) to replace for some dynamic content in C#.
I would like to know, how can I put a content in a mergefield in docx ?
I don't know if MergeFields is the right way to do this or if there is another way. If you can suggest me, I appreciate!
PS: I have openxml referenced in my web application.
Edits:
private MemoryStream LoadFileIntoStream(string fileName)
{
MemoryStream memoryStream = new MemoryStream();
using (FileStream fileStream = File.OpenRead(fileName))
{
memoryStream.SetLength(fileStream.Length);
fileStream.Read(memoryStream.GetBuffer(), 0, (int) fileStream.Length);
memoryStream.Flush();
fileStream.Close();
}
return memoryStream;
}
public MemoryStream GenerateWord()
{
string templateDoc = "C:\\temp\\template.docx";
string reportFileName = "C:\\temp\\result.docx";
var reportStream = LoadFileIntoStream(templateDoc);
// Copy a new file name from template file
//File.Copy(templateDoc, reportFileName, true);
// Open the new Package
Package pkg = Package.Open(reportStream, FileMode.Open, FileAccess.ReadWrite);
// Specify the URI of the part to be read
Uri uri = new Uri("/word/document.xml", UriKind.Relative);
PackagePart part = pkg.GetPart(uri);
XmlDocument xmlMainXMLDoc = new XmlDocument();
xmlMainXMLDoc.Load(part.GetStream(FileMode.Open, FileAccess.Read));
// replace some keys inside xml (it will come from database, it's just a test)
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_customer", "My Customer Name");
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_title", "Report of Documents");
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_content", "Content of Document");
// Open the stream to write document
StreamWriter partWrt = new StreamWriter(part.GetStream(FileMode.Open, FileAccess.Write));
//doc.Save(partWrt);
xmlMainXMLDoc.Save(partWrt);
partWrt.Flush();
partWrt.Close();
reportStream.Flush();
pkg.Close();
return reportStream;
}
PS: When I convert MemoryStream to a file, I got a corrupted file. Thanks!
I know this is an old post, but I could not get the accepted answer to work for me. The project linked would not even compile (which someone has already commented in that link). Also, it seems to use other Nuget packages like WPFToolkit.
So I'm adding my answer here in case someone finds it useful. This only uses the OpenXML SDK 2.5 and also the WindowsBase v4. This works on MS Word 2010 and later.
string sourceFile = #"C:\Template.docx";
string targetFile = #"C:\Result.docx";
File.Copy(sourceFile, targetFile, true);
using (WordprocessingDocument document = WordprocessingDocument.Open(targetFile, true))
{
// If your sourceFile is a different type (e.g., .DOTX), you will need to change the target type like so:
document.ChangeDocumentType(WordprocessingDocumentType.Document);
// Get the MainPart of the document
MainDocumentPart mainPart = document.MainDocumentPart;
var mergeFields = mainPart.RootElement.Descendants<FieldCode>();
var mergeFieldName = "SenderFullName";
var replacementText = "John Smith";
ReplaceMergeFieldWithText(mergeFields, mergeFieldName, replacementText);
// Save the document
mainPart.Document.Save();
}
private void ReplaceMergeFieldWithText(IEnumerable<FieldCode> fields, string mergeFieldName, string replacementText)
{
var field = fields
.Where(f => f.InnerText.Contains(mergeFieldName))
.FirstOrDefault();
if (field != null)
{
// Get the Run that contains our FieldCode
// Then get the parent container of this Run
Run rFldCode = (Run)field.Parent;
// Get the three (3) other Runs that make up our merge field
Run rBegin = rFldCode.PreviousSibling<Run>();
Run rSep = rFldCode.NextSibling<Run>();
Run rText = rSep.NextSibling<Run>();
Run rEnd = rText.NextSibling<Run>();
// Get the Run that holds the Text element for our merge field
// Get the Text element and replace the text content
Text t = rText.GetFirstChild<Text>();
t.Text = replacementText;
// Remove all the four (4) Runs for our merge field
rFldCode.Remove();
rBegin.Remove();
rSep.Remove();
rEnd.Remove();
}
}
What the code above does is basically this:
Identify the 4 Runs that make up the merge field named "SenderFullName".
Identify the Run that contains the Text element for our merge field.
Remove the 4 Runs.
Update the text property of the Text element for our merge field.
UPDATE
For anyone interested, here is a simple static class I used to help me with replacing merge fields.
Frank Fajardo's answer was 99% of the way there for me, but it is important to note that MERGEFIELDS can be SimpleFields or FieldCodes.
In the case of SimpleFields, the text runs displayed to the user in the document are children of the SimpleField.
In the case of FieldCodes, the text runs shown to the user are between the runs containing FieldChars with the Separate and the End FieldCharValues. Occasionally, several text containing runs exist between the Separate and End Elements.
The code below deals with these problems. Further details of how to get all the MERGEFIELDS from the document, including the header and footer is available in a GitHub repository at https://github.com/mcshaz/SimPlanner/blob/master/SP.DTOs/Utilities/OpenXmlExtensions.cs
private static Run CreateSimpleTextRun(string text)
{
Run returnVar = new Run();
RunProperties runProp = new RunProperties();
runProp.Append(new NoProof());
returnVar.Append(runProp);
returnVar.Append(new Text() { Text = text });
return returnVar;
}
private static void InsertMergeFieldText(OpenXmlElement field, string replacementText)
{
var sf = field as SimpleField;
if (sf != null)
{
var textChildren = sf.Descendants<Text>();
textChildren.First().Text = replacementText;
foreach (var others in textChildren.Skip(1))
{
others.Remove();
}
}
else
{
var runs = GetAssociatedRuns((FieldCode)field);
var rEnd = runs[runs.Count - 1];
foreach (var r in runs
.SkipWhile(r => !r.ContainsCharType(FieldCharValues.Separate))
.Skip(1)
.TakeWhile(r=>r!= rEnd))
{
r.Remove();
}
rEnd.InsertBeforeSelf(CreateSimpleTextRun(replacementText));
}
}
private static IList<Run> GetAssociatedRuns(FieldCode fieldCode)
{
Run rFieldCode = (Run)fieldCode.Parent;
Run rBegin = rFieldCode.PreviousSibling<Run>();
Run rCurrent = rFieldCode.NextSibling<Run>();
var runs = new List<Run>(new[] { rBegin, rCurrent });
while (!rCurrent.ContainsCharType(FieldCharValues.End))
{
rCurrent = rCurrent.NextSibling<Run>();
runs.Add(rCurrent);
};
return runs;
}
private static bool ContainsCharType(this Run run, FieldCharValues fieldCharType)
{
var fc = run.GetFirstChild<FieldChar>();
return fc == null
? false
: fc.FieldCharType.Value == fieldCharType;
}
You could try http://www.codeproject.com/KB/office/Fill_Mergefields.aspx which uses the Open XML SDK to do this.