I am looking for alternatives to using Open XML for a server-side Word automation project. Does anyone know of other libraries that let me manipulate Word bookmarks and tables?
I am currently developing a Word automation project for my company using DocX, which has a very simple and straightforward API. Whenever I need to work with the XML directly, the Paragraph class exposes an Xml property that gives access to the underlying XML. The best part is that editing it this way neither breaks the XML nor corrupts the resulting document. Hope this helps!
Example code using DocX:
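// Requires System.Linq and System.Xml.Linq.
XNamespace ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
using (DocX doc = DocX.Load(@"c:\temp\yourdoc.docx"))
{
    foreach (Paragraph para in doc.Paragraphs)
    {
        // Element and attribute names are case-sensitive and namespace-qualified:
        // the markup is <w:bookmarkStart w:name="..."/>.
        XElement bookmark = para.Xml.Elements(ns + "bookmarkStart")
            .FirstOrDefault(b => (string)b.Attribute(ns + "name") == "yourbookmarkname");
        if (bookmark != null)
        {
            // You found your bookmark; if you want to change the text, then:
            // (w:t sits inside a run, so search descendants, not direct children)
            XElement text = para.Xml.Descendants(ns + "t").FirstOrDefault();
            if (text != null)
                text.SetValue("Text to replace..");
        }
    }
    doc.Save(); // persist the change
}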
An alternative API, exclusively for working with bookmarks, is http://simpleooxml.codeplex.com/
Example of how to delete the text between bookmarkStart and bookmarkEnd using this API:
MemoryStream stream = DocumentReader.Copy(string.Format("{0}\\template.docx", TestContext.TestDeploymentDir));
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
MainDocumentPart mainPart = doc.MainDocumentPart;
DocumentWriter writer = new DocumentWriter(mainPart);
//Simply Clears all text between bookmarkstart and end
writer.PasteText("", "YourBookMarkName");
//Save to the memory stream, and then to a file
writer.Save();
DocumentWriter.StreamToFile(string.Format("{0}\\templatetest.docx", GetOutputFolder()), stream);
Loading the Word document into different APIs from a memory stream:
//Loading a document file into a MemoryStream using the SimpleOOXML API
MemoryStream stream = DocumentReader.Copy(@"c:\template.docx");
//Opening it from the memory stream as OpenXML document
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
//Opening it as DocX document for working with DocX Api
DocX document = DocX.Load(stream);
I am trying to write code that fills a template's content controls and then saves the result as a new file.
I found the very helpful entry "Word OpenXml Word Found Unreadable Content" and used the code from it.
I copied the code from that post, as shown below:
public static MemoryStream ReadAllBytesToMemoryStream(string path)
{
byte[] buffer = File.ReadAllBytes(path);
var destStream = new MemoryStream(buffer.Length);
destStream.Write(buffer, 0, buffer.Length);
destStream.Seek(0, SeekOrigin.Begin);
return destStream;
}
public static void Generate()
{
using MemoryStream stream = ReadAllBytesToMemoryStream(@"c:\Templates\TemplateTest.dotx");
using (WordprocessingDocument wpd = WordprocessingDocument.Open(stream, true))
{
wpd.ChangeDocumentType(WordprocessingDocumentType.Document);
}
File.WriteAllBytes(@"c:\Templates\TemplateTestOutput.docx", stream.GetBuffer());
return;
}
It successfully creates the file, but whenever I open the new docx file, Word gives the "Word Found Unreadable Content" error. The template I made isn't complex: it just has 3 content controls with regular text for labels. I also tried copying a regular docx with just some lines of text and got the same error.
When I click OK on the "Word Found Unreadable Content" error, Word shows the document just fine. I'm not sure what I'm doing wrong; I'm not even editing anything at this point.
Figured out a solution.
Instead of using File.WriteAllBytes:
File.WriteAllBytes(@"C:\Templates\TemplateTestOutput.docx", stream.GetBuffer());
I used the following code:
using (FileStream fileStream = new FileStream(@"C:\Templates\TemplateTestOutput.docx", System.IO.FileMode.CreateNew))
{
stream.WriteTo(fileStream);
}
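A likely reason the first version fails: GetBuffer() returns the stream's whole internal array, which can be larger than the data written to it, so the output file may end with unused bytes that Word then reports as unreadable content. A minimal sketch using only System.IO (the class and method names are made up for illustration):

```csharp
using System.IO;

static class BufferSketch
{
    // Returns (bytes written, length of GetBuffer(), length of ToArray())
    // for an expandable MemoryStream that received `data`.
    public static (long Written, int Raw, int Exact) Measure(byte[] data)
    {
        using var stream = new MemoryStream();   // capacity grows in steps as data is written
        stream.Write(data, 0, data.Length);

        // GetBuffer exposes the internal array, including any unused capacity.
        // ToArray copies only the bytes that were actually written.
        return (stream.Length, stream.GetBuffer().Length, stream.ToArray().Length);
    }
}
```

Both stream.WriteTo(fileStream) and stream.ToArray() emit only the bytes actually in the stream, so File.WriteAllBytes(path, stream.ToArray()) should work as well.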
I have created a program that reads a file as an array of bytes. The program consumes Word files using the DocX library from Xceed. What I want to do is recreate the parsed docx file from the array of bytes.
To bytes:
var doc = DocX.Load("afile.docx");
...
return Encoding.Unicode.GetBytes(doc.Xml.Document.ToString());
Parse:
var doc = DocX.Create("anotherFile.docx");
var document = Encoding.Unicode.GetString(returnedBytes); // document is a string with the XML
How can I save the document like the original?
I'm only getting a blank file without any content.
using (var doc = DocX.Load("afile.docx"))
{
//here modify
doc.SaveAs("anotherFile.docx");
}
See the documentation for BinaryWriter:
bWriter.Write(byteArray);
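If the goal is only to move the file around as a byte array, one option is to skip the XML string entirely. A .docx is a ZIP package and the document XML is just one part of it, so bytes built from that XML string alone cannot be turned back into a complete document. A rough sketch with plain System.IO (the helper names are hypothetical):

```csharp
using System.IO;

static class DocxBytes
{
    // Reads the whole package (the ZIP container), not just its main XML part.
    public static byte[] ToBytes(string path) => File.ReadAllBytes(path);

    // Writes the bytes back unchanged; the result is a byte-for-byte copy.
    public static void FromBytes(byte[] bytes, string path) => File.WriteAllBytes(path, bytes);

    // ZIP archives start with the ASCII signature "PK" (0x50 0x4B).
    public static bool LooksLikeZip(byte[] bytes) =>
        bytes.Length >= 2 && bytes[0] == 0x50 && bytes[1] == 0x4B;
}
```

Since DocX.Load also accepts a stream, the received bytes can be wrapped in a MemoryStream and loaded from there when the document needs to be modified before saving.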
My ASP.NET C# web application creates Word documents by filling an existing Word template with data. Now I need to append another existing document to that document as the next page.
For example: my template has two pages and the document I need to append has one page. As a result, I want to get one Word document with 3 pages.
How do I append documents to an existing Word document in ASP.NET/C# with the Microsoft Open XML SDK 2.0?
Use this code to merge two documents:
using System.Linq;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
namespace altChunk
{
class Program
{
static void Main(string[] args)
{
string fileName1 = @"c:\Users\Public\Documents\Destination.docx";
string fileName2 = @"c:\Users\Public\Documents\Source.docx";
string testFile = @"c:\Users\Public\Documents\Test.docx";
File.Delete(fileName1);
File.Copy(testFile, fileName1);
using (WordprocessingDocument myDoc =
WordprocessingDocument.Open(fileName1, true))
{
string altChunkId = "AltChunkId1";
MainDocumentPart mainPart = myDoc.MainDocumentPart;
AlternativeFormatImportPart chunk =
mainPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.WordprocessingML, altChunkId);
using (FileStream fileStream = File.Open(fileName2, FileMode.Open))
chunk.FeedData(fileStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document
.Body
.InsertAfter(altChunk, mainPart.Document.Body
.Elements<Paragraph>().Last());
mainPart.Document.Save();
}
}
}
}
This works flawlessly and the same code is also available here.
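For orientation, this is roughly what the inserted element looks like in the main document XML. The sketch below uses only System.Xml.Linq rather than the Open XML SDK; the class name is made up, and the relationship id is whatever was passed to AddAlternativeFormatImportPart:

```csharp
using System.Xml.Linq;

static class AltChunkMarkup
{
    static readonly XNamespace W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Builds <w:altChunk r:id="..."/>, the anchor that points at the imported part.
    public static XElement Build(string relationshipId) =>
        new XElement(W + "altChunk",
            new XAttribute(XNamespace.Xmlns + "w", W),   // declare the w: prefix
            new XAttribute(XNamespace.Xmlns + "r", R),   // declare the r: prefix
            new XAttribute(R + "id", relationshipId));
}
```

Calling AltChunkMarkup.Build("AltChunkId1").ToString() shows the element with both namespace declarations attached.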
There is another approach that uses Open XML PowerTools
Can anybody show me how to apply a font family/size to an AltChunk of type AlternativeFormatImportPartType.TextPlain?
This is my code, but I can't figure out how to do this at all (even Google doesn't help):
MainDocumentPart main = doc.MainDocumentPart;
string altChunkId = "AltChunkId" + Guid.NewGuid().ToString().Replace("-", "");
var chunk = main.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.TextPlain, altChunkId);
using (var mStream = new MemoryStream())
{
using (var writer = new StreamWriter(mStream))
{
writer.Write(value);
writer.Flush();
mStream.Position = 0;
chunk.FeedData(mStream);
}
}
var altChunk = new AltChunk();
altChunk.Id = altChunkId;
OpenXmlElement afterThat = null;
foreach (var para in main.Document.Body.Descendants<Paragraph>())
{
if (para.InnerText.Equals("Notizen:"))
{
afterThat = para;
}
}
main.Document.Body.InsertAfter(altChunk, afterThat);
If I do it this way, I get "Courier New" with a size of 10.5.
UPDATE
This is the working solution I came up with:
Convert the plain text to RTF, change the font family/size, and apply it to the WordprocessingDocument!
public static string PlainToRtf(string value)
{
using (var rtf = new System.Windows.Forms.RichTextBox())
{
rtf.Text = value;
rtf.SelectAll();
rtf.SelectionFont = new System.Drawing.Font("Calibri", 10);
return rtf.Rtf;
}
}
var chunk = main.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.Rtf, altChunkId);
using (var mStream = new MemoryStream())
{
using (var writer = new StreamWriter(mStream))
{
var rtf = PlainToRtf(value);
writer.Write(rtf);
writer.Flush();
mStream.Position = 0;
chunk.FeedData(mStream);
}
}
//proceed with creating AltChunk and inserting it to the Document...
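On a server, pulling in System.Windows.Forms just for RichTextBox may be undesirable. As an alternative to the approach above, a hand-rolled converter along these lines could produce the RTF directly. This is a sketch, not a full RTF writer: the RtfSketch name and its parameters are made up here, and it only handles escaping, line breaks, and a single font.

```csharp
using System.Text;

static class RtfSketch
{
    // Builds a minimal RTF document that renders `text` in one font.
    // fontName is assumed to be a plain font name without RTF control characters.
    public static string PlainToRtf(string text, string fontName, int sizeInPoints)
    {
        var sb = new StringBuilder();
        sb.Append(@"{\rtf1\ansi\deff0{\fonttbl{\f0 ").Append(fontName).Append(";}}");
        // \fs takes half-points, so 10 pt becomes \fs20.
        sb.Append(@"\f0\fs").Append(sizeInPoints * 2).Append(' ');
        foreach (char c in text)
        {
            switch (c)
            {
                case '\\':
                case '{':
                case '}':
                    sb.Append('\\').Append(c);      // escape RTF control characters
                    break;
                case '\r':
                    break;                          // drop CR; LF produces the paragraph break
                case '\n':
                    sb.Append(@"\par ");
                    break;
                default:
                    if (c < 128)
                        sb.Append(c);
                    else                            // \uN? = Unicode escape with '?' as fallback
                        sb.Append(@"\u").Append(unchecked((short)c)).Append('?');
                    break;
            }
        }
        return sb.Append('}').ToString();
    }
}
```

The result of RtfSketch.PlainToRtf(value, "Calibri", 10) can be written to the stream in place of the RichTextBox output.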
How to apply FontFamily/Size to AltChunk of Type [TextPlain]
I am afraid this is NOT possible, at least not with the Open XML SDK.
Why?
An altChunk (Anchor for Imported External Content) object is designed for importing content into the document. It is a 'temporary' object: just a reference to external content that is incorporated "as is" in the document. When the document is later opened and saved with Word, Word converts this external content into valid Open XML content.
So, for a newly created document, you can't loop over the paragraphs to retrieve the imported text and apply a style.
If you import RTF content, for example, the style must be applied to the RTF before importing it.
In the case of plain text (TextPlain, i.e. a .txt file), there is no style conversion: no style is attached to a text file. You can change the font in Notepad, but it applies to all documents because it is an application-level setting.
And I can confirm that Word by default creates a style with "Courier New 10.5" to display the content of the file. I just tested.
What can I do?
Apply the style after the document has been opened and saved with Word. Note that you will have to retrieve the paragraph(s), or you could try to retrieve the style created in the document and change the font there. This link could help:
How to: Apply a style to a paragraph in a word processing document (Open XML SDK).
Or maybe there is a registry key that changes Word's default behavior on your computer. Even if there is, it doesn't solve the problem for a newly created document that is opened for the first time on the client.
Note from the OP:
I think a possible solution to the problem could be converting the plain text to RTF, applying style information, and then appending it to the WordprocessingDocument as an AltChunk.
I totally agree. Just note that "applying style information" here means at the RTF level.
I've created a Word document (docx) and am now in the process of "reverse engineering" it with the Open XML SDK Productivity Tool. I'm using .FeedData() to feed the styles, theme, etc. into the document from .xml files that I saved from the reflected code, and all was going well until I came to the footer.
Here's what I'm doing for the styles (this works perfectly fine):
StyleDefinitionsPart styleDefinitionsPart = mainPart.AddNewPart<StyleDefinitionsPart>();
using (FileStream fs = new FileStream(Server.MapPath("styles.xml"), FileMode.Open, FileAccess.Read))
{
styleDefinitionsPart.FeedData(fs);
}
Looking at my document, everything is there. Now I reflect the footer part, save the XML to footer.xml, and add the part like this:
FooterPart footerPart = mainPart.AddNewPart<FooterPart>();
using (FileStream fs = new FileStream(Server.MapPath("footer.xml"), FileMode.Open, FileAccess.Read))
{
footerPart.FeedData(fs);
}
Everything else looks fine and I can see the part in my document, but the footer just isn't appearing ON the document. What am I doing wrong here? Do I need to tell the document which footer part to use or something?
Fixed. It seems there has to be some content in the body before you can add a header and footer to it; then you need to add a reference to them for each section, like this:
foreach (var section in mainPart.Document.Body.Elements<WP.SectionProperties>())
{
section.PrependChild<WP.HeaderReference>(new WP.HeaderReference() { Id = mainPart.GetIdOfPart(headerPart) });
section.PrependChild<WP.FooterReference>(new WP.FooterReference() { Id = mainPart.GetIdOfPart(footerPart) });
}
Remember to add your header and footer last, so there is content in the document.
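In markup terms, the loop above adds reference elements to each w:sectPr. Here is a sketch of the resulting XML, built with System.Xml.Linq rather than the SDK. The class name and relationship ids are placeholders, and the w:type attribute is included on the assumption that the schema expects it:

```csharp
using System.Xml.Linq;

static class SectionMarkup
{
    static readonly XNamespace W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Builds a w:sectPr whose header and footer references point at the
    // relationship ids of the header and footer parts.
    public static XElement Build(string headerId, string footerId) =>
        new XElement(W + "sectPr",
            new XAttribute(XNamespace.Xmlns + "w", W),
            new XAttribute(XNamespace.Xmlns + "r", R),
            new XElement(W + "headerReference",
                new XAttribute(W + "type", "default"),   // assumption: applies to ordinary pages
                new XAttribute(R + "id", headerId)),
            new XElement(W + "footerReference",
                new XAttribute(W + "type", "default"),
                new XAttribute(R + "id", footerId)));
}
```

Without such references the parts exist in the package but no section points at them, which matches the symptom described in the question.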
HUGE thanks to this answer: Add Header and Footer to an existing empty word document with OpenXML SDK 2.0