I'm developing a web application with ASP.NET, and I have a file called Template.docx that serves as a template for generating other reports. Inside Template.docx I have some MergeFields (Title, CustomerName, Content, Footer, etc.) that I need to replace with dynamic content from C#.
How can I put content into a MergeField in a .docx file?
I don't know whether MergeFields are the right way to do this or whether there is a better approach. Any suggestions are appreciated!
PS: I have Open XML referenced in my web application.
Edits:
private MemoryStream LoadFileIntoStream(string fileName)
{
MemoryStream memoryStream = new MemoryStream();
using (FileStream fileStream = File.OpenRead(fileName))
{
memoryStream.SetLength(fileStream.Length);
fileStream.Read(memoryStream.GetBuffer(), 0, (int) fileStream.Length);
memoryStream.Flush();
fileStream.Close();
}
return memoryStream;
}
public MemoryStream GenerateWord()
{
string templateDoc = "C:\\temp\\template.docx";
string reportFileName = "C:\\temp\\result.docx";
var reportStream = LoadFileIntoStream(templateDoc);
// Copy a new file name from template file
//File.Copy(templateDoc, reportFileName, true);
// Open the new Package
Package pkg = Package.Open(reportStream, FileMode.Open, FileAccess.ReadWrite);
// Specify the URI of the part to be read
Uri uri = new Uri("/word/document.xml", UriKind.Relative);
PackagePart part = pkg.GetPart(uri);
XmlDocument xmlMainXMLDoc = new XmlDocument();
xmlMainXMLDoc.Load(part.GetStream(FileMode.Open, FileAccess.Read));
// replace some keys inside xml (it will come from database, it's just a test)
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_customer", "My Customer Name");
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_title", "Report of Documents");
xmlMainXMLDoc.InnerXml = xmlMainXMLDoc.InnerXml.Replace("field_content", "Content of Document");
// Open the stream to write document
StreamWriter partWrt = new StreamWriter(part.GetStream(FileMode.Open, FileAccess.Write));
//doc.Save(partWrt);
xmlMainXMLDoc.Save(partWrt);
partWrt.Flush();
partWrt.Close();
reportStream.Flush();
pkg.Close();
return reportStream;
}
PS: When I convert the MemoryStream to a file, I get a corrupted file. Thanks!
I know this is an old post, but I could not get the accepted answer to work for me. The linked project would not even compile (as someone has already commented at that link). It also seems to depend on other NuGet packages such as WPFToolkit.
So I'm adding my answer here in case someone finds it useful. This only uses the OpenXML SDK 2.5 and also the WindowsBase v4. This works on MS Word 2010 and later.
string sourceFile = @"C:\Template.docx";
string targetFile = @"C:\Result.docx";
File.Copy(sourceFile, targetFile, true);
using (WordprocessingDocument document = WordprocessingDocument.Open(targetFile, true))
{
// If your sourceFile is a different type (e.g., .DOTX), you will need to change the target type like so:
document.ChangeDocumentType(WordprocessingDocumentType.Document);
// Get the MainPart of the document
MainDocumentPart mainPart = document.MainDocumentPart;
var mergeFields = mainPart.RootElement.Descendants<FieldCode>();
var mergeFieldName = "SenderFullName";
var replacementText = "John Smith";
ReplaceMergeFieldWithText(mergeFields, mergeFieldName, replacementText);
// Save the document
mainPart.Document.Save();
}
private void ReplaceMergeFieldWithText(IEnumerable<FieldCode> fields, string mergeFieldName, string replacementText)
{
var field = fields
.Where(f => f.InnerText.Contains(mergeFieldName))
.FirstOrDefault();
if (field != null)
{
// Get the Run that contains our FieldCode
// Then get the parent container of this Run
Run rFldCode = (Run)field.Parent;
// Get the three (3) other Runs that make up our merge field
Run rBegin = rFldCode.PreviousSibling<Run>();
Run rSep = rFldCode.NextSibling<Run>();
Run rText = rSep.NextSibling<Run>();
Run rEnd = rText.NextSibling<Run>();
// Get the Run that holds the Text element for our merge field
// Get the Text element and replace the text content
Text t = rText.GetFirstChild<Text>();
t.Text = replacementText;
// Remove all the four (4) Runs for our merge field
rFldCode.Remove();
rBegin.Remove();
rSep.Remove();
rEnd.Remove();
}
}
What the code above does is basically this:
Identify the 4 Runs that make up the merge field named "SenderFullName".
Identify the Run that contains the Text element for our merge field.
Remove the 4 Runs.
Update the text property of the Text element for our merge field.
UPDATE
For anyone interested, here is a simple static class I used to help me with replacing merge fields.
Frank Fajardo's answer was 99% of the way there for me, but it is important to note that MERGEFIELDS can be SimpleFields or FieldCodes.
In the case of SimpleFields, the text runs displayed to the user in the document are children of the SimpleField.
In the case of FieldCodes, the text runs shown to the user sit between the runs containing FieldChars with the Separate and the End FieldCharValues. Occasionally, several text-containing runs exist between the Separate and End elements.
The code below deals with these problems. Further details of how to get all the MERGEFIELDS from the document, including the header and footer, are available in a GitHub repository at https://github.com/mcshaz/SimPlanner/blob/master/SP.DTOs/Utilities/OpenXmlExtensions.cs
private static Run CreateSimpleTextRun(string text)
{
Run returnVar = new Run();
RunProperties runProp = new RunProperties();
runProp.Append(new NoProof());
returnVar.Append(runProp);
returnVar.Append(new Text() { Text = text });
return returnVar;
}
private static void InsertMergeFieldText(OpenXmlElement field, string replacementText)
{
var sf = field as SimpleField;
if (sf != null)
{
var textChildren = sf.Descendants<Text>();
textChildren.First().Text = replacementText;
foreach (var others in textChildren.Skip(1))
{
others.Remove();
}
}
else
{
var runs = GetAssociatedRuns((FieldCode)field);
var rEnd = runs[runs.Count - 1];
foreach (var r in runs
.SkipWhile(r => !r.ContainsCharType(FieldCharValues.Separate))
.Skip(1)
.TakeWhile(r=>r!= rEnd))
{
r.Remove();
}
rEnd.InsertBeforeSelf(CreateSimpleTextRun(replacementText));
}
}
private static IList<Run> GetAssociatedRuns(FieldCode fieldCode)
{
Run rFieldCode = (Run)fieldCode.Parent;
Run rBegin = rFieldCode.PreviousSibling<Run>();
Run rCurrent = rFieldCode.NextSibling<Run>();
var runs = new List<Run>(new[] { rBegin, rCurrent });
while (!rCurrent.ContainsCharType(FieldCharValues.End))
{
rCurrent = rCurrent.NextSibling<Run>();
runs.Add(rCurrent);
}
return runs;
}
private static bool ContainsCharType(this Run run, FieldCharValues fieldCharType)
{
var fc = run.GetFirstChild<FieldChar>();
return fc == null
? false
: fc.FieldCharType.Value == fieldCharType;
}
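The helpers above expect a field element to be handed in. A minimal sketch of how both kinds of field could be collected from the main document part is shown here. The class name MergeFieldFinder and the method FindMergeFields are hypothetical names introduced for illustration, and the matching by simple substring is an assumption: an instruction that Word has split across several FieldCode runs would not be found this way.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
public static class MergeFieldFinder
{
    // Returns every SimpleField or FieldCode in the main document part whose
    // instruction text mentions MERGEFIELD and the given field name.
    public static IList<OpenXmlElement> FindMergeFields(MainDocumentPart mainPart, string mergeFieldName)
    {
        OpenXmlElement root = mainPart.Document;

        // Simple fields keep their instruction in the w:instr attribute.
        var simpleFields = root.Descendants<SimpleField>()
            .Where(f => IsMatch(f.Instruction?.Value, mergeFieldName))
            .Cast<OpenXmlElement>();

        // Complex fields keep their instruction in a w:instrText element.
        var fieldCodes = root.Descendants<FieldCode>()
            .Where(f => IsMatch(f.Text, mergeFieldName))
            .Cast<OpenXmlElement>();

        // Materialize the result so the caller can modify the tree while iterating.
        return simpleFields.Concat(fieldCodes).ToList();
    }

    private static bool IsMatch(string instruction, string mergeFieldName)
    {
        return instruction != null
            && instruction.Contains("MERGEFIELD")
            && instruction.Contains(mergeFieldName);
    }
}
```

Each returned element can then be passed to a replacement routine such as InsertMergeFieldText. Headers and footers live in separate parts, so they would need the same search applied to their own root elements.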
You could try http://www.codeproject.com/KB/office/Fill_Mergefields.aspx which uses the Open XML SDK to do this.
Related
I built a small application using .NET 6 that replaces values inside a Word document and saves a copy.
Some keys are replaced with the content of other files using an AltChunk.
Using file A, into which I merge AltChunk1, the output works fine.
Using file B with the same AltChunk1, the output produces the error "found unreadable content" when opened with Word.
Using file B with a different AltChunk file (or even the same one after I trimmed it) can work in some cases.
I don't have any clue what the issue might be.
I tried comparing the files using the Open XML SDK Productivity Tool; however:
File A and File B have a lot of differences, so it is really hard to find anything that would explain this behavior.
They are identical in the place the AltChunk is put.
I tried comparing the non-working result with what Word creates after a repair, but Word does not keep the AltChunk; it completely merges the content of the AltChunk into File B, which makes any comparison with my non-working result almost impossible.
Here is the code I use:
The first method creates the AltChunk from the file, then calls the methods used to replace "keys" with the wanted value (including the case where the key is split across several runs).
internal static void MergeOutSideDocument(string key, string filePath, IEnumerable<string> outsideDocs)
{
if (string.IsNullOrEmpty(key)) throw new ArgumentException("Cannot replace empty key.");
if (!File.Exists(filePath) || outsideDocs.Any(path => !File.Exists(path))) throw new FileNotFoundException();
using WordprocessingDocument doc = WordprocessingDocument.Open(filePath, true);
List<OpenXmlElement> altChunks = new();
foreach (var outsideDoc in outsideDocs)
{
var existingIds = doc.MainDocumentPart.Document.Body.Descendants<AltChunk>();
string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString();
MainDocumentPart mainPart = doc.MainDocumentPart;
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.WordprocessingML, altChunkId);
using (FileStream fileStream = File.Open(outsideDoc, FileMode.Open))
chunk.FeedData(fileStream);
altChunks.Add(new AltChunk()
{
Id = altChunkId
});
inMemoryAltChunkIds.Add(altChunkId);
}
var body = doc.MainDocumentPart.Document.Body;
SetElementForKey(key, altChunks,
body.Descendants<Paragraph>().First(par => par.Contains(key)),
body);
}
private static void SetElementForKey(string key, List<OpenXmlElement> replacements, OpenXmlElement el, Body body)
{
List<Run> previousRuns = new();
if (el?.InnerText.Contains(key) != true) return;
for (int i = 0; i <= el.Descendants<Run>().Count(); i++)
{
var innerText = string.Join("", previousRuns.Select(r => r.InnerText));
if (innerText.Contains(key))
{
var usedRuns = GetRequiredRunsForText(previousRuns, key);
var firstRun = usedRuns.First();
MergeRunsWithKey(key, usedRuns, firstRun);
var usedRun = usedRuns.First();
var firstPart = usedRun.InnerText.IndexOf(key) != -1 ? usedRun.InnerText[..usedRun.InnerText.IndexOf(key)] : "";
ReplaceText(key, "", usedRun);
foreach (var replacement in replacements)
el.Parent.InsertAfter(replacement, el);
if (string.IsNullOrEmpty(usedRun.InnerText)) usedRun.Remove();
if (string.IsNullOrEmpty(el.InnerText)) el.Remove();
break;
}
else
{
previousRuns.Add(el.Descendants<Run>().ElementAt(i));
}
}
}
private static void MergeRunsWithKey(string key, List<Run> usedRuns, Run firstRun)
{
while (!usedRuns.First().InnerText.Contains(key))
{
AddText(usedRuns.Skip(1).First().InnerText, firstRun);
usedRuns.Skip(1).First().Remove();
usedRuns.RemoveAt(1);
}
}
private static void AddText(string newText, Run run)
{
Text text = run.Elements<Text>().LastOrDefault();
if (text == null)
{
run.Append(new Text());
text = run.Elements<Text>().Last();
}
text.Text += newText;
if (text.Text.StartsWith(" ") || text.Text.EndsWith(" "))
text.Space = SpaceProcessingModeValues.Preserve;
}
What can I do to understand where the problem lies?
I tried replacing some values I don't understand in File B with the ones from File A (the header and footer have rectangles with different gfxdata values; the version "recovered" by Word was setting the same values as File A).
I tried a different way of generating the AltChunkIds and storing a global list for the file.
I tried comparing various parts of the documents (File A and B, or File B's result and its recovered version). There are differences, but too many, and none seems to be relevant.
I have a folder full of *.msg files saved from Outlook and I'm trying to convert them to Word.
There is a loop that loads each *.msg as a MailItem and saves it.
public static ConversionResult ConvertEmailsToWord(this Outlook.Application app, string source, string target)
{
var word = new Word.Application();
var emailCounter = 0;
var otherCounter = 0;
var directoryTree = new PhysicalDirectoryTree();
foreach (var node in directoryTree.Walk(source))
{
foreach (var fileName in node.FileNames)
{
var currentFile = Path.Combine(node.DirectoryName, fileName);
var branch = Regex.Replace(node.DirectoryName, $"^{Regex.Escape(source)}", string.Empty).Trim('\\');
Debug.Print($"Processing file: {currentFile}");
// This is an email. Convert it to Word.
if (Regex.IsMatch(fileName, @"\.msg$"))
{
if (app.Session.OpenSharedItem(currentFile) is MailItem item)
{
if (item.SaveAs(word, Path.Combine(target, branch), fileName))
{
emailCounter++;
}
item.Close(SaveMode: OlInspectorClose.olDiscard);
}
}
// This is some other file. Copy it as is.
else
{
Directory.CreateDirectory(Path.Combine(target, branch));
File.Copy(currentFile, Path.Combine(target, branch, fileName), true);
otherCounter++;
}
}
}
word.Quit(SaveChanges: false);
return new ConversionResult
{
EmailCount = emailCounter,
OtherCount = otherCounter
};
}
The save method looks like this:
public static bool SaveAs(this MailItem mail, Word.Application word, string path, string name)
{
Directory.CreateDirectory(path);
name = Path.Combine(path, $"{Path.GetFileNameWithoutExtension(name)}.docx");
if (File.Exists(name))
{
return false;
}
var copy = mail.GetInspector.WordEditor as Word.Document;
copy.Content.Copy();
var doc = word.Documents.Add();
doc.Content.Paste();
doc.SaveAs2(FileName: name);
doc.Close();
return true;
}
It works for most *.msg files, but some of them crash Outlook when I call copy.Content on the Word.Document.
I know you cannot tell me what is wrong with it (or maybe you can?), so I'd like to find out myself, but the problem is that I am not able to catch the exception. Since a simple try/catch didn't work, I tried AppDomain.CurrentDomain.UnhandledException, but this didn't catch it either.
Are there any other ways to debug it?
The mail whose content I cannot get inside the loop doesn't cause any trouble when I open it in a new Outlook window and save it with the same method.
It makes sense to add some delays between Word calls, because IO operations take some time to finish. Also, there is no need to create another document in Word for copying the content:
var copy = mail.GetInspector.WordEditor as Word.Document;
copy.Content.Copy();
var doc = word.Documents.Add();
doc.Content.Paste();
doc.SaveAs2(FileName: name);
doc.Close();
Instead, make the required modifications on the original document instance and then save it to disk. The original mail item will remain unchanged until you call the Save method from the Outlook object model. You may call the Close method, passing olDiscard, which discards any changes to the document.
Also consider using the Open XML SDK if you deal with open XML documents only, see Welcome to the Open XML SDK 2.5 for Office for more information.
Do you actually need to use Inspector.WordEditor? You can save the message in a format supported by Word (such as MHTML) using the OOM alone by calling MailItem.SaveAs(..., olMHTML), and then open the file in Word programmatically to save it in the DOCX format.
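A minimal sketch of the MHTML route follows, assuming the Outlook and Word primary interop assemblies are referenced and that both applications are already running. The class name MailItemMhtmlExtensions, the method name SaveAsDocxViaMhtml, and the temporary .mht file placed next to the target are hypothetical choices made for illustration. It has not been tested against the messages that crash in the question.

```csharp
using System.IO;
using Outlook = Microsoft.Office.Interop.Outlook;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of either object model.
public static class MailItemMhtmlExtensions
{
    // Saves the mail as MHTML through the Outlook object model,
    // then lets Word convert that file to DOCX.
    public static void SaveAsDocxViaMhtml(this Outlook.MailItem mail, Word.Application word, string targetDocx)
    {
        // Assumption: a temporary .mht file next to the target is acceptable.
        string tempMht = Path.ChangeExtension(targetDocx, ".mht");
        mail.SaveAs(tempMht, Outlook.OlSaveAsType.olMHTML);

        Word.Document doc = word.Documents.Open(FileName: tempMht, ReadOnly: true);
        try
        {
            // wdFormatXMLDocument is the DOCX format.
            doc.SaveAs2(FileName: targetDocx, FileFormat: Word.WdSaveFormat.wdFormatXMLDocument);
        }
        finally
        {
            doc.Close(SaveChanges: false);
            File.Delete(tempMht);
        }
    }
}
```

This avoids the inspector and the clipboard entirely, which may sidestep the crash, although that is not guaranteed.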
I am trying to figure out how to read and write files in a UWA application. I understand that I need to open a FileStream, but I can't figure out how to do that.
I started with this code:
FileStream fs = new FileStream(@"C:\XML\test.txt", FileMode.Create, FileAccess.Write);
This seems to work; there are no red lines.
At the end of all of that I am told to put in Flush and Close, like this:
FileStream fs = new FileStream(@"C:\XML\test.txt", FileMode.Create,
...
fs.Flush();
fs.Close();
Now, this is where I hit a snag, because fs.Close(); is not even on the list of functions on fs. I just get a red line in my IDE if I try to hardcode it.
Can someone please take the time to help me understand how to do this with UWA? For some reason it seems like there is a different approach in Windows 10 apps, and I have a VERY hard time finding anything that shows me how to do it right. All the tutorials and SOF forum input are about older versions (non-UWA).
When I do this in a console application it all works as expected.
My end goal is to be able to read and write to an XML file in this kind of fashion:
XDocument doc = XDocument.Load(input);
XElement person = doc.Element("Person");
person.Add(new XElement("Employee",
new XElement("Name", "David"),
new XElement("Dept", "Chef")));
doc.Save(output);
I'm going down this path because an answer to my previous question told me to use a FileStream, but I simply cannot make that work in UWA.
You cannot just access any file from a Universal Windows App. Access to the file system is restricted.
See the documentation for details.
To help you further we need to know more about your application. What kind of files do you want to access for what reason?
Here is an example of how to read an XML file, modify it, and store it in a Universal app. You need a button with the following Click handler and a TextBox named "TextBoxLog".
private async void ButtonDemo_Click(object sender, RoutedEventArgs e)
{
// Get our local storage folder
var localFolder = ApplicationData.Current.LocalFolder;
XmlDocument xmlDocument;
// Try to get file
var file = await localFolder.TryGetItemAsync("MyData.xml") as IStorageFile;
if(file != null)
{
// File exists -> Load into XML document
xmlDocument = await XmlDocument.LoadFromFileAsync(file);
}
else
{
// File does not exist, create new document in memory
xmlDocument = new XmlDocument();
xmlDocument.LoadXml(@"<?xml version=""1.0"" encoding=""UTF-8""?>" + Environment.NewLine + "<root></root>");
}
// Now show the current contents
TextBoxLog.Text = "";
var lEntries = xmlDocument.GetElementsByTagName("Entry");
foreach(var lEntry in lEntries)
{
TextBoxLog.Text += lEntry.InnerText + Environment.NewLine;
}
// Now add a new entry
var node = xmlDocument.CreateElement("Entry");
node.InnerText = DateTime.Now.ToString();
xmlDocument.DocumentElement.AppendChild(node);
// If the file does not exist yet, create it
if(file == null)
{
file = await localFolder.CreateFileAsync("MyData.xml");
}
// Now save the document
await xmlDocument.SaveToFileAsync(file);
}
Okay, the (simple) solution is to put the XML file in PROJECTFOLDER/bin/x86/debug/appX and then read the data into a list this way:
public class dataRaw
{
public string data { get; set; }
public string firstName { get; set; }
public string lastName { get; set; }
}
//You can call this class with x = collectionGenerator.getList() (it returns a list<T>)
public class collectionGenerator
{
public static List<dataRaw> getList()
{
//This is the xml file in the folder
var doc = XDocument.Load("Data.xml");
//This parses the XML and adds each Person to the list "dataList"
var dataList = doc.Root
.Descendants("Person")
.Select(node => new dataRaw
{
//data, firstName and lastName are the dataRaw properties that are filled for each entry in dataList.
//Number, FirstName and LastName are the nodes in the XML file.
data = node.Element("Number").Value,
firstName = node.Element("FirstName").Value,
lastName = node.Element("LastName").Value,
})
.ToList();
return dataList;
}
}
I'm having real trouble editing bookmarks in a Word template using DocumentFormat.OpenXml and then saving the result to a new PDF file.
I cannot use Microsoft.Office.Interop.Word, as it gives a COM error on the server.
My code is this:
public static void CreateWordDoc(string templatePath, string destinationPath, Dictionary<string, dynamic> dictionary)
{
byte[] byteArray = File.ReadAllBytes(templatePath);
using (MemoryStream stream = new MemoryStream())
{
stream.Write(byteArray, 0, (int)byteArray.Length);
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(stream, true))
{
var bookmarks = (from bm in wordDoc.MainDocumentPart.Document.Body.Descendants<BookmarkStart>()
select bm).ToList();
foreach (BookmarkStart mark in bookmarks)
{
if (mark.Name != "Table" && mark.Name != "_GoBack")
{
UpdateBookmark(dictionary, mark);//Not doing anything
}
else if (mark.Name != "Table")
{
// CreateTable(dictionary, wordDoc, mark);
}
}
File.WriteAllBytes("D:\\RohitDocs\\newfile_rohitsingh.docx", stream.ToArray());
wordDoc.Close();
}
// Save the file with the new name
}
}
private static void UpdateBookmark(Dictionary<string, dynamic> dictionary, BookmarkStart mark)
{
string name = mark.Name;
string value = dictionary[name];
Run run = new Run(new Text(value));
RunProperties props = new RunProperties();
props.AppendChild(new FontSize() { Val = "20" });
run.RunProperties = props;
var paragraph = new DocumentFormat.OpenXml.Wordprocessing.Paragraph(run);
mark.Parent.InsertAfterSelf(paragraph);
paragraph.PreviousSibling().Remove();
mark.Remove();
}
I was trying to replace bookmarks with my text, but the UpdateBookmark method doesn't work. I'm writing the stream and saving it because I thought that once the bookmarks are replaced I could save it to another file.
I think you want to make sure that when you reference mark.Parent that you are getting the correct instance that you are expecting.
Once you get a reference to the correct Paragraph element where your content should go, use the following code to add/swap the run.
// assuming you have a reference to a paragraph called "p"
p.AppendChild<Run>(new Run(new Text(content)) { RunProperties = props });
// and here is some code to remove a run
p.RemoveChild<Run>(run);
To answer the second part of your question: when I did a similar project a few years ago, we used iTextSharp to create PDFs from Docx. It worked very well and the API was easy to grok. We even added password encryption and embedded watermarks to the PDFs.
I am working on a project that requires all SQL connection and query information to be stored in XML files. To make my project configurable, I am trying to create a means to let the user configure their SQL connection string information (data source, catalog, username and password) via a series of text boxes. This input will then be saved to the appropriate node within the XML document.
I can get the current information from the XML file, and display that information within text boxes for the user's review and correction, but I'm encountering an error when it comes time to save the changes.
Here is the code I'm using to update and save the xml document.
protected void submitBtn_Click(object sender, EventArgs e)
{
SPFile file = methods.web.GetFile("MyXMLFile.xml");
myDoc = new XmlDocument();
byte[] bites = file.OpenBinary();
Stream strm1 = new MemoryStream(bites);
myDoc.Load(strm1);
XmlNode node;
node = myDoc.DocumentElement;
foreach (XmlNode node1 in node.ChildNodes)
{
foreach (XmlNode node2 in node1.ChildNodes)
{
if (node2.Name == "name1")
{
if (node2.InnerText != box1.Text)
{
}
}
if (node2.Name == "name2")
{
if (node2.InnerText != box2.Text)
{
}
}
if (node2.Name == "name3")
{
if (node2.InnerText != box3.Text)
{
node2.InnerText = box3.Text;
}
}
if (node2.Name == "name4")
{
if (node2.InnerText != box4.Text)
{
}
}
}
}
myDoc.Save(strm1);
}
Most of the conditionals are empty at this point because I'm still testing.
The code works great until the last line, as I said. At that point, I get the error "Memory Stream is not expandable." I understand that using a memory stream to update a stored file is incorrect, but I can't figure out the right way to do this.
I've tried to implement the solution given in the similar question at Memory stream is not expandable but that situation is different from mine and so the implementation makes no sense to me. Any clarification would be greatly appreciated.
Using the MemoryStream constructor that takes a byte array as an argument creates a non-resizable instance of a MemoryStream. Since you are making changes to the file (and therefore the underlying bytes), you need a resizable MemoryStream. This can be accomplished by using the parameterless constructor of the MemoryStream class and writing the byte array into the MemoryStream.
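The difference between the two constructors can be seen in isolation with a small console sketch. The class name MemoryStreamDemo and the helper CanAppend are hypothetical names introduced for illustration. The sketch only demonstrates the behaviour described above and does not touch SharePoint.

```csharp
using System;
using System.IO;

public static class MemoryStreamDemo
{
    // Returns true if one extra byte can be appended at the end of the stream.
    public static bool CanAppend(MemoryStream stream)
    {
        try
        {
            stream.Position = stream.Length;
            stream.WriteByte(0x21);
            return true;
        }
        catch (NotSupportedException)
        {
            // Thrown with the message "Memory stream is not expandable."
            return false;
        }
    }

    public static void Main()
    {
        byte[] bytes = { 1, 2, 3 };

        // Wraps the array directly: the capacity is fixed at bytes.Length.
        using (var fixedStream = new MemoryStream(bytes))
        // Starts empty and owns its buffer, so it can grow as needed.
        using (var growable = new MemoryStream())
        {
            growable.Write(bytes, 0, bytes.Length);

            Console.WriteLine(CanAppend(fixedStream)); // False
            Console.WriteLine(CanAppend(growable));    // True
        }
    }
}
```

Saving an edited XmlDocument usually changes the byte count, which is why the fixed-size stream fails as soon as the new content is longer than the original.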
Try this:
SPFile file = methods.web.GetFile("MyXMLFile.xml");
myDoc = new XmlDocument();
byte[] bites = file.OpenBinary();
using(MemoryStream strm1 = new MemoryStream()){
strm1.Write(bites, 0, (int)bites.Length);
strm1.Position = 0;
myDoc.Load(strm1);
// all of your edits to the file here
strm1.Position = 0;
// save the file back to disk
using(var fs = new FileStream("FILEPATH",FileMode.Create,FileAccess.ReadWrite)){
myDoc.Save(fs);
}
}
To get the FILEPATH for a Sharepoint file, it'd be something along these lines (I don't have a Sharepoint development environment set up right now):
SPFile file = methods.web.GetFile("MyXMLFile.xml");
var filepath = file.ParentFolder.ServerRelativeUrl + "\\" + file.Name;
Or it might be easier to just use the SaveBinary method of the SPFile class like this:
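// same code from above
// all of your edits to the file here
// write the edited document back into the stream, replacing the original bytes
strm1.SetLength(0);
myDoc.Save(strm1);
strm1.Position = 0;
// don't use a FileStream, just SaveBinary
file.SaveBinary(strm1);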
I didn't test this code, but I've used it in Sharepoint solutions to modify XML (mainly OpenXML) documents in Sharepoint lists. Read this blogpost for more information
You could look into using the XDocument class instead of XmlDocument class.
http://msdn.microsoft.com/en-us/library/system.xml.linq.xdocument.aspx
I prefer it because of the simplicity and it eliminates having to use Memory Stream.
Edit: You can append to the file like this:
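// filePath is a string holding the path to the XML file
// (placeholder names must be valid XML names, so they cannot contain spaces)
XDocument doc = XDocument.Load(filePath);
doc.Root.Add(
new XElement("AnElementName",
new XAttribute("AnAttribute", "Some Value"),
new XElement("NestedElement", "Inner Text"))
);
doc.Save(filePath);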
Or you can search for an element and update like this:
doc.Root.Elements("TheElement").First(m =>
m.Attribute("AnAttribute").Value == "Some value to match").SetElementValue(
"TheElementToChange", "Value to set element to");
doc.Save(filePath);