I've created a Word document (.docx) and am now "reverse engineering" it with the Open XML SDK Productivity Tool. I'm using .FeedData() to feed the styles, theme, etc. into the document from .xml files that I saved from the reflected parts. All was going well until I came to the footer.
Here's what I'm doing for the styles (this works perfectly fine):
StyleDefinitionsPart styleDefinitionsPart = mainPart.AddNewPart<StyleDefinitionsPart>();
using (FileStream fs = new FileStream(Server.MapPath("styles.xml"), FileMode.Open, FileAccess.Read))
{
styleDefinitionsPart.FeedData(fs);
}
Looking at my document, everything is there - now I reflect on the footer part, save the xml to footer.xml and add the part like this:
FooterPart footerPart = mainPart.AddNewPart<FooterPart>();
using (FileStream fs = new FileStream(Server.MapPath("footer.xml"), FileMode.Open, FileAccess.Read))
{
footerPart.FeedData(fs);
}
Everything else looks fine and I can see the part in my document, but the footer just isn't appearing ON the document. What am I doing wrong here? Do I need to tell the document which footer part to use or something?
Fixed. It seems there has to be some content in the body before you can add a footer and header to it, then you need to add a reference to them for each section like this:
foreach (var section in mainPart.Document.Body.Elements<WP.SectionProperties>())
{
section.PrependChild<WP.HeaderReference>(new WP.HeaderReference() { Id = mainPart.GetIdOfPart(headerPart) });
section.PrependChild<WP.FooterReference>(new WP.FooterReference() { Id = mainPart.GetIdOfPart(footerPart) });
}
Remember to add your header and footer last, so there is content in the document.
HUGE thanks to this answer: Add Header and Footer to an existing empty word document with OpenXML SDK 2.0
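The steps above can be sketched as a single helper. This is a minimal sketch, not the poster's exact code: `FooterHelper` and `AttachFooter` are hypothetical names, and the sketch assumes that only the final section (the `sectPr` that is a direct child of the body) needs the footer. It also sets `Type` on the reference explicitly, which the snippet above leaves out.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class FooterHelper
{
    // Hypothetical helper: feeds footer XML into a new FooterPart and
    // references it from the document's final section.
    public static void AttachFooter(WordprocessingDocument doc, string footerXmlPath)
    {
        MainDocumentPart mainPart = doc.MainDocumentPart;
        Body body = mainPart.Document.Body;

        // Create the part and load the saved footer XML into it.
        FooterPart footerPart = mainPart.AddNewPart<FooterPart>();
        using (FileStream fs = new FileStream(footerXmlPath, FileMode.Open, FileAccess.Read))
        {
            footerPart.FeedData(fs);
        }
        string footerId = mainPart.GetIdOfPart(footerPart);

        // Use the body-level section properties, creating them if the body has none.
        SectionProperties section = body.Elements<SectionProperties>().LastOrDefault()
            ?? body.AppendChild(new SectionProperties());

        // Drop any existing footer references so the new one is not a duplicate.
        section.RemoveAllChildren<FooterReference>();
        section.PrependChild(new FooterReference
        {
            Type = HeaderFooterValues.Default,
            Id = footerId
        });

        mainPart.Document.Save();
    }
}
```

Sections that end in the middle of the document keep their properties inside paragraph properties, so a document with several sections would need those handled as well.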
I am trying to write code to fill a template's content controls then save it as a new file.
I found this very helpful entry, "Word OpenXml Word Found Unreadable Content", and used the code there.
I copied the code from that post, as shown below:
public static MemoryStream ReadAllBytesToMemoryStream(string path)
{
byte[] buffer = File.ReadAllBytes(path);
var destStream = new MemoryStream(buffer.Length);
destStream.Write(buffer, 0, buffer.Length);
destStream.Seek(0, SeekOrigin.Begin);
return destStream;
}
public static void Generate()
{
using MemoryStream stream = ReadAllBytesToMemoryStream(@"c:\Templates\TemplateTest.dotx");
using (WordprocessingDocument wpd = WordprocessingDocument.Open(stream, true))
{
wpd.ChangeDocumentType(WordprocessingDocumentType.Document);
}
File.WriteAllBytes(@"c:\Templates\TemplateTestOutput.docx", stream.GetBuffer());
return;
}
It successfully creates the file, but the problem is, whenever I open the new docx file, it gives the "Word Found Unreadable Content" error. The template I made isn't complex, it just has 3 content controls with regular text for labels. I also tried copying a regular docx with just some lines of text, same error.
Whenever I click ok, on the Word Found Unreadable Content error, it shows the document just fine. I'm not sure what I'm doing wrong, I'm not even editing anything at this point.
Figured out a solution.
Instead of using File.WriteAllBytes
File.WriteAllBytes(@"C:\Templates\TemplateTestOutput.docx", stream.GetBuffer());
I used the following code:
using (FileStream fileStream = new FileStream(@"C:\Templates\TemplateTestOutput.docx", System.IO.FileMode.CreateNew))
{
stream.WriteTo(fileStream);
}
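A likely reason the fix works: `MemoryStream.GetBuffer()` returns the whole underlying array, including any unused capacity past the end of the written data, while `WriteTo()` and `ToArray()` emit only the bytes actually written. Once the stream has grown during editing, the extra trailing bytes end up in the file. The sketch below uses a hypothetical `BufferDemo` class to show the difference with a plain `MemoryStream`.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Writes three bytes and reports the size of the raw buffer
    // versus the size of the data actually written.
    public static (int bufferLength, int writtenLength) Measure()
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(new byte[] { 1, 2, 3 }, 0, 3);

            // GetBuffer() exposes the backing array, unused capacity included.
            int bufferLength = stream.GetBuffer().Length;

            // ToArray() copies only the bytes that were written.
            int writtenLength = stream.ToArray().Length;

            return (bufferLength, writtenLength);
        }
    }
}
```

If an array is needed rather than a stream copy, `File.WriteAllBytes(path, stream.ToArray())` should behave the same as the `WriteTo` version.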
I need to remove the first few pages of a PDF file. Apparently, the easiest way to do that is to create a copy of it and not duplicate the unwanted pages. This works, but they look a lot smaller than they should. Any ideas?
[Image: how it should look]
[Image: how it actually looks]
private static void ClipSpecificPDF(string input, string output, int pagesToCut)
{
PdfReader myReader = new PdfReader(input);
using (FileStream fs = new FileStream(output, FileMode.Create, FileAccess.Write, FileShare.None))
{
using (Document doc = new Document())
{
using (PdfWriter myWriter = PdfWriter.GetInstance(doc, fs))
{
//Open the destination for writing
doc.Open();
//Loop through each page that we want to keep
for (int i = pagesToCut; i < myReader.NumberOfPages; i++)
{
//Add a new blank page to destination document
var PS = myReader.GetPageSizeWithRotation(i);
myWriter.SetPageSize(PS);
doc.NewPage();
//Extract the given page from our reader and add it directly to the destination PDF
myWriter.DirectContent.AddTemplate(myWriter.GetImportedPage(myReader, i + 1), 0, 0);
}
//Close our document
doc.Close();
}
}
}
}
The problem you describe is explained in the FAQ. For instance in the answer to the questions:
How to merge documents correctly?
Why does the function to concatenate / merge PDFs cause issues in some cases?
Using PdfWriter to manipulate PDF documents is a very bad idea. Read chapter 6 of my book to discover why this is a bad idea, and take a look at Table 6.1 to find out which class is a better fit.
In the same chapter, you'll find the SelectPages example. Suppose that you want to create a new PDF containing only pages 4 to 8. In that case, you simply use the SelectPages() method and PdfStamper:
PdfReader reader = new PdfReader(src);
reader.SelectPages("4-8");
PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create, FileAccess.Write));
stamper.Close();
reader.Close();
By using PdfStamper, the page size is preserved, as well as any of the interactive features that may be present.
Your approach is bad because you do not respect the original page size: you copy a document with letter (?) format to a document with A4 pages. If the origin of the page doesn't correspond with the lower-left corner, parts of your document will be invisible. If there are interactive features in your PDF, they will be lost. Of all the possible examples you could have followed, you picked the worst one...
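Applied to the original question (dropping the first few pages), the same approach might look like the sketch below. `PdfTrimmer` and `DropLeadingPages` are hypothetical names; the page range passed to `SelectPages()` is built from `pagesToCut`, assuming that it is smaller than the number of pages in the file.

```csharp
using System.IO;
using iTextSharp.text.pdf;

public static class PdfTrimmer
{
    // Hypothetical helper: keeps pages (pagesToCut + 1) through the last page.
    public static void DropLeadingPages(string input, string output, int pagesToCut)
    {
        PdfReader reader = new PdfReader(input);
        try
        {
            // Page numbers are 1-based, so cutting N pages keeps "N+1-last".
            string range = string.Format("{0}-{1}", pagesToCut + 1, reader.NumberOfPages);
            reader.SelectPages(range);

            using (FileStream fs = new FileStream(output, FileMode.Create, FileAccess.Write))
            {
                // The stamper copies the selected pages as they are,
                // without re-laying them out on new pages.
                PdfStamper stamper = new PdfStamper(reader, fs);
                stamper.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```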
Can anybody show me how to apply Fontfamily/size to an AltChunk of Type
AlternativeFormatImportPartType.TextPlain?
This is my code, but I can't figure out how to do this at all (even Google doesn't help):
MainDocumentPart main = doc.MainDocumentPart;
string altChunkId = "AltChunkId" + Guid.NewGuid().ToString().Replace("-", "");
var chunk = main.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.TextPlain, altChunkId);
using (var mStream = new MemoryStream())
{
using (var writer = new StreamWriter(mStream))
{
writer.Write(value);
writer.Flush();
mStream.Position = 0;
chunk.FeedData(mStream);
}
}
var altChunk = new AltChunk();
altChunk.Id = altChunkId;
OpenXmlElement afterThat = null;
foreach (var para in main.Document.Body.Descendants<Paragraph>())
{
if (para.InnerText.Equals("Notizen:"))
{
afterThat = para;
}
}
main.Document.Body.InsertAfter(altChunk, afterThat);
If I do it this way, I get "Courier New" with a size of "10,5".
UPDATE
This is the working Solution I came up with:
Convert Plaintext to RTF, change the Fontfamily/size and apply it to the WordProcessingDocument!
public static string PlainToRtf(string value)
{
using (var rtf = new System.Windows.Forms.RichTextBox())
{
rtf.Text = value;
rtf.SelectAll();
rtf.SelectionFont = new System.Drawing.Font("Calibri", 10);
return rtf.Rtf;
}
}
var chunk = main.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.Rtf, altChunkId);
using (var mStream = new MemoryStream())
{
using (var writer = new StreamWriter(mStream))
{
var rtf = PlainToRtf(value);
writer.Write(rtf);
writer.Flush();
mStream.Position = 0;
chunk.FeedData(mStream);
}
}
//proceed with creating AltChunk and inserting it to the Document...
How to apply FontFamily/Size to AltChunk of Type [TextPlain]
I am afraid this is NOT possible, in any case, not with OpenXml SDK.
Why?
The altChunk (Anchor for Imported External Content) object is designed for importing content into the document. It is a 'temporary' object: just a reference to external content that is incorporated "as is" in the document. When the document is later opened and saved with Word, Word converts this external content into valid OpenXml content.
So, for a newly created document, you can't loop over the paragraphs to retrieve the imported content and apply a style to it.
If you import RTF content, for example, the style must be applied to the RTF before importing it.
In the case of plain text, TextPlain (= a .txt text file), there is no style conversion, because no style is attached to a text file. You can change the font in Notepad, but that applies to every document you open; it is an application-level setting.
And I can confirm that Word creates by default a style with "Courier New 10,5" to display the content of the file. I just tested.
What can I do?
Apply the style after the document has been opened and saved with Word. Note that you will have to retrieve the paragraph(s), or you could try to retrieve the style created in the document and change the font there. This link could help to achieve this:
How to: Apply a style to a paragraph in a word processing document (Open XML SDK).
Or maybe there is (?) a registry key, or something like it, that you can change to alter Word's default behavior on your computer. Even if there is, it doesn't solve the problem for a newly created document that is opened for the first time on the client.
Note from the OP:
I think a possible Solution to the Problem could be, converting the PlainText to RTF apply StyleInformation and then append it to WordProcessingDocument as AltChunk.
I totally agree. Just note that when he says apply StyleInformation, it means at the RTF level.
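For server-side code that should not depend on System.Windows.Forms, the RTF could also be assembled by hand. The sketch below is an assumption-laden alternative to the RichTextBox approach, not part of the Open XML SDK: `RtfBuilder.PlainToRtf` is a hypothetical helper that emits a one-font RTF document, escapes the RTF control characters, and writes non-ASCII characters as `\u` escapes. RTF font sizes are given in half-points, so 10 pt becomes `\fs20`.

```csharp
using System.Text;

public static class RtfBuilder
{
    // Hypothetical helper: wraps plain text in minimal RTF using a single font.
    public static string PlainToRtf(string text, string fontName, int pointSize)
    {
        var sb = new StringBuilder();

        // Header with a one-entry font table; \fs takes half-points.
        sb.Append(@"{\rtf1\ansi\deff0{\fonttbl{\f0 ").Append(fontName).Append(";}}");
        sb.Append(@"\f0\fs").Append(pointSize * 2).Append(' ');

        foreach (char ch in text.Replace("\r\n", "\n"))
        {
            if (ch == '\\' || ch == '{' || ch == '}')
            {
                // Backslash and braces are RTF syntax and must be escaped.
                sb.Append('\\').Append(ch);
            }
            else if (ch == '\n')
            {
                sb.Append(@"\par ");
            }
            else if (ch > 127)
            {
                // Unicode escape: signed 16-bit value followed by a fallback character.
                sb.Append(@"\u").Append(unchecked((short)ch)).Append('?');
            }
            else
            {
                sb.Append(ch);
            }
        }

        return sb.Append('}').ToString();
    }
}
```

The result of `RtfBuilder.PlainToRtf(value, "Calibri", 10)` would then be written to the stream in place of `rtf.Rtf` in the code above.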
I am looking for alternatives to using openxml for a server-side word automation project. Does anyone know any other ways that have features to let me manipulate word bookmarks and tables?
I am currently developing a Word automation project for my company and I am using DocX. It is a very simple and straightforward API to work with. Whenever I need to work with the XML directly, the Paragraph class has a property named "Xml" that gives me access to the underlying XML. The best part is that it doesn't break the XML or corrupt the resulting document. Hope this helps!
Example code using DocX:
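XNamespace ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
using (DocX doc = DocX.Load(@"c:\temp\yourdoc.docx"))
{
    foreach (Paragraph para in doc.Paragraphs)
    {
        // WordprocessingML names are case-sensitive: the element is w:bookmarkStart
        // and its attribute is w:name (requires System.Linq and System.Xml.Linq)
        XElement bookmark = para.Xml.Descendants(ns + "bookmarkStart")
            .FirstOrDefault(b => (string)b.Attribute(ns + "name") == "yourbookmarkname");
        if (bookmark != null)
        {
            // you got to your bookmark, if you want to change the text..then
            // (w:t sits inside a run, so search descendants rather than direct children)
            XElement text = para.Xml.Descendants(ns + "t").FirstOrDefault();
            if (text != null) text.SetValue("Text to replace..");
        }
    }
    // write the change back to the file
    doc.Save();
}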
Alternative API exclusively to work with bookmarks is .. http://simpleooxml.codeplex.com/
Example on how to delete text from bookmarkstart to bookmarkend using this API..
MemoryStream stream = DocumentReader.Copy(string.Format("{0}\\template.docx", TestContext.TestDeploymentDir));
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
MainDocumentPart mainPart = doc.MainDocumentPart;
DocumentWriter writer = new DocumentWriter(mainPart);
//Simply Clears all text between bookmarkstart and end
writer.PasteText("", "YourBookMarkName");
//Save to the memory stream, and then to a file
writer.Save();
DocumentWriter.StreamToFile(string.Format("{0}\\templatetest.docx", GetOutputFolder()), stream);
Loading the word document into different API's from memory stream.
//Loading a document file into memorystream using SimpleOOXML API
MemoryStream stream = DocumentReader.Copy(@"c\template.docx");
//Opening it from the memory stream as OpenXML document
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
//Opening it as DocX document for working with DocX Api
DocX document = DocX.Load(stream);
I have an existing PDF document named aa.pdf. This PDF document has 3 pages. I'd like to add a PDF form field (or some text) at the bottom of the first page of aa.pdf using iTextSharp.
I would also like the added form field (or text) to link to another page of aa.pdf. For example, when I click the form field (or text) on the first page of aa.pdf, the document should jump to the second page.
How can I implement this using iTextSharp?
Thanks.
To create links within a PDF you use a PdfAction, which can be set on a Chunk, which can optionally be added to a Paragraph. There are several different types of actions to choose from. The two that you are probably interested in are the NEXTPAGE action and the GotoLocalPage action. The first does what it says and goes to the next page. It is nice because you don't have to figure out which page number you are on. The second lets you specify the page number to go to. In its simplest form you can do:
Chunk ch = new Chunk("Go to next page").SetAction(new PdfAction(PdfAction.NEXTPAGE));
This creates a Chunk that you can add in whatever way you want. When working with an existing PDF there are several different ways to add text to a page. One way is to use a ColumnText object, which has a method called SetSimpleColumn that lets you define a simple rectangle to add elements to.
Lastly, PDF readers don't automatically treat links differently within a PDF except to give a different cursor when hovering. More specifically, unlike a webpage where hyperlinks are turned a different color, PDFs don't change the color of links unless you tell them to, so keep this in mind when creating them. Also, when modifying a PDF you generally never want to overwrite the existing PDF during the process, because that would be writing to something that you're reading from. Sometimes it works, more often than not it breaks, sometimes subtly. Instead, write to a second file and, when you are completely done, erase the first file and rename the second file.
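The "write to a second file, then swap" advice can be sketched independently of iTextSharp. In this minimal sketch, `SafeRewrite`, `Rewrite`, and the `.tmp` suffix are hypothetical choices; `transform` stands in for whatever code reads the original file and writes the modified copy (for example, the PdfStamper code).

```csharp
using System;
using System.IO;

public static class SafeRewrite
{
    // Hypothetical helper: runs transform(sourcePath, tempPath) and then
    // replaces the original file with the finished temporary file.
    public static void Rewrite(string path, Action<string, string> transform)
    {
        string temp = path + ".tmp";

        // The transform reads from the original and writes only to the temp file.
        transform(path, temp);

        // Swap only after the transform has completed without throwing.
        File.Delete(path);
        File.Move(temp, path);
    }
}
```

If the transform throws, the original file is left untouched, although a partial temporary file may remain and need cleaning up.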
The code below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.2.0. The first part of the code creates a sample PDF called "aa.pdf" on the desktop. If you already have that file you can comment this section out, but it's in here so others can reproduce this example. The second part creates a new file called "bb.pdf" based on "aa.pdf". It adds two text links to the bottom of the first page. The first link advances the PDF to the next page, while the second link advances the PDF to a specific page number. See the comments in the code for specific implementation details.
using System;
using System.IO;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.pdf;
namespace WindowsFormsApplication1 {
public partial class Form1 : Form {
public Form1() {
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e) {
//Files that we'll be working with
string inputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "aa.pdf");
string outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "bb.pdf");
//Create a standard PDF to test with, nothing special here
using (FileStream fs = new FileStream(inputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.LETTER)) {
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//Create 10 pages with labels on each page
for (int i = 1; i <= 10; i++) {
doc.NewPage();
doc.Add(new Paragraph(String.Format("This is page {0}", i)));
}
doc.Close();
}
}
}
//For the OP, this is where you would start
//Declare some variables to be used later
ColumnText ct;
Chunk c;
//Bind a reader to the input file
PdfReader reader = new PdfReader(inputFile);
//PDFs don't automatically make hyperlinks a special color so we're specifically creating a blue font to use here
iTextSharp.text.Font BlueFont = FontFactory.GetFont("Arial", 12, iTextSharp.text.Font.NORMAL, iTextSharp.text.BaseColor.BLUE);
//Create our new file
using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
//Bind a stamper to our reader and output file
using (PdfStamper stamper = new PdfStamper(reader, fs)) {
//Get the "over" content for page 1
PdfContentByte cb = stamper.GetOverContent(1);
//This example adds a link that goes to the next page
//Create a ColumnText object
ct = new ColumnText(cb);
//Set the rectangle to write to
ct.SetSimpleColumn(0, 0, 200, 20);
//Add some text and make it blue so that it looks like a hyperlink
c = new Chunk("Go to next page", BlueFont);
//Set the action to go to the next page
c.SetAction(new PdfAction(PdfAction.NEXTPAGE));
//Add the chunk to the ColumnText
ct.AddElement(c);
//Tell the system to process the above commands
ct.Go();
//This example add a link that goes to a specific page number
//Create a ColumnText object
ct = new ColumnText(cb);
//Set the rectangle to write to
ct.SetSimpleColumn(200, 0, 400, 20);
//Add some text and make it blue so that it looks like a hyperlink
c = new Chunk("Go to page 3", BlueFont);
//Set the action to go to a specific page number. This option is a little more complex, you also have to specify how you want to "fit" the document
c.SetAction(PdfAction.GotoLocalPage(3, new PdfDestination(PdfDestination.FIT), stamper.Writer));
//Add the chunk to the ColumnText
ct.AddElement(c);
//Tell the system to process the above commands
ct.Go();
}
}
this.Close();
}
}
}