I am using iTextSharp (5.5.5.90) to generate PDF files. I am using paragraphs and importing pages from readers and such. Here is how I create my document, from there I just append what I need:
FileStream fs = new FileStream("filename.pdf", FileMode.Create, FileAccess.Write, FileShare.None);
Document doc = new Document(new Rectangle(PageSize.LETTER), 58, 58, 100, 50);
PdfWriter writer = PdfWriter.GetInstance(doc, fs);
Once the file is created, I add paragraphs like this:
doc.Add(new Paragraph("Paragraph text"));
And import pages from readers like this:
writer.DirectContent.AddTemplate(writer.GetImportedPage(reader, page), 0, 0);
My question is how would I go back to page one after generating the entire document and add an element to page one? I will be adding a barcode (I know how to add barcodes, tables and such where I want them on the current page), but I don't know how to "go back" to page one to add an element.
Here is the full code, but you won't be able to compile it because of dependencies. Also, don't get caught up in the details of the full code, as this is a large project to create dynamically generated documents. https://pastebin.com/kABi7fzW
I can't attest as to the exactly calls to iTextSharp's as we approached our documentation quite differently; open a Word template, at the data as DataTables, etc., do a MailMerge, close and reopen and save as PDF. Sounds more involved but doesn't require the granular level of detail you're doing of creating the document paragraph by paragraph but it does allow the document generator to worry about content and not style placement (handled via Word, manually and external to the application).
From experience with iTextSharp, you'll have a lot of trouble trying to float an element on top of a section to insert the barcode. The document generation tool has an annoying tendency to not quite work in this scenario. We endured many weeks of back and forth with iTextSharp support and a version upgrade and still couldn't get it to behave properly in all scenarios.
As discussed in the comments and given how you've already written your code (I doubt you'll scrap all that code and start with a MailMerge unless you really, really have to), you'll need to insert a placeholder block that you can locate via iTextSharp's PdfBuilder api. I'd imagine that setting a bookmark location would likely be the easiest way.
If it's possible (preferable?) to have the barcode on a page of its own, then you already have the code needed to do this (circa line 324 in your pastebin link) with;
// create doc...
// reopen doc and get page count
doc.NewPage();
// add barcode with page count + 1
// save
Related
I've been attempting to find an easy solution to exporting a Canvas in my WPF Application to a PDF Document.
So far, the best solution has been to use the PrintDialog and set it up to automatically use the Microsoft Print the PDF 'printer'. The only problem I have had with this is that although the PrintDialog is skipped, there is a FileDialog to choose where the file should be saved.
Sadly, this is a deal-breaker because I would like to run this over a large number of canvases with automatically generated PDF names (well, programitically provided anyway).
Other solutions I have looked at include:
Using PrintDocument, but from my experimentation I would have to manually iterate through all my Canveses children and manually invoke the correct Draw method (of which a lot of my custom elements with transformation would be rather time consuming to do)
Exporting as a PNG image and then embedding that in a PDF. Although this works, TextBlocks within my canvas are no longer text. So this isn't an ideal situation.
Using the 3rd party library PDFSharp has the same downfall as the PrintDocument. A lot of custom logic for each element.
With PDFSharp. I did find a method fir generating the XGraphics from a Canvas but no way of then consuming that object to make a PDF Page
So does anybody know how I can skip or automate the PDF PrintDialog, or consume PDFSharp XGraphics to make
A page. Or any other ideas for directions to take this besides writing a whole library to convert each of my Canvas elements to PDF elements.
If you look at the output port of a recent windows installation of Microsoft Print To PDF
You may note it is set to PORTPROMP: and that is exactly what causes the request for a filename.
You might note lower down, I have several ports set to a filename, and the fourth one down is called "My Print to PDF"
So very last century methodology; when I print with a duplicate printer but give it a different name I can use different page ratios etc., without altering the built in standard one. The output for a file will naturally be built:-
A) Exactly in one repeatable location, that I can file monitor and rename it, based on the source calling the print sequence, such that if it is my current default printer I can right click files to print to a known \folder\file.pdf
B) The same port can be used via certain /pt (printto) command combinations to output, not just to that default port location, but to a given folder\name such as
"%ProgramFiles%\Windows NT\Accessories\WORDPAD.EXE" /pt listIN.doc "My Print to PDF" "My Print to PDF" "listOUT.pdf"
Other drivers usually charge for the convenience of WPF programmable renaming, but I will leave you that PrintVisual challenge for another of your three wishes.
MS suggest XPS is best But then they would be promoting it as a PDF competitor.
It does not need to be Doc[X]2PDF it could be [O]XPS2PDF or aPNG2PDF or many pages TIFF2PDF etc. etc. Any of those are Native to Win 10 also other 3rd party apps such as [Free]Office with a PrintTo verb will do XLS[X]2PDF. Imagination becomes pagination.
I had a great success in generating PDFs using PDFSharp in combination with SkiaSharp (for more advanced graphics).
Let me begin from the very end:
you save the PdfDocument object in the following way:
PdfDocument yourDocument = ...;
string filename = #"your\file\path\document.pdf"
yourDocument.Save(filename);
creating the PdfDocument with a page can be achieved the following way (adjust the parameters to fit your needs):
PdfDocument yourDocument = new PdfDocument();
yourDocument.PageLayout = PdfPageLayout.SinglePage;
yourDocument.Info.Title = "Your document title";
PdfPage yourPage = yourDocument.AddPage();
yourDocument.Orientation = PageOrientation.Landscape;
yourDocument.Size = PageSize.A4;
the PdfPage object's content (as an example I'm putting a string and an image) is filled in the following way:
using (XGraphics gfx = XGraphics.FromPdfPage(yourPage))
{
XFont yourFont = new XFont("Helvetica", 20, XFontStyle.Bold);
gfx.DrawString(
"Your string in the page",
yourFont,
XBrushes.Black,
new XRect(0, XUnit.FromMillimeter(10), page.Width, yourFont.GetHeight()),
XStringFormats.Center);
using (Stream s = new FileStream(#"path\to\your\image.png", FileMode.Open))
{
XImage image = XImage.FromStream(s);
var imageRect = new XRect()
{
Location = new XPoint() { X = XUnit.FromMillimeter(42), Y = XUnit.FromMillimeter(42) },
Size = new XSize() { Width = XUnit.FromMillimeter(42), Height = XUnit.FromMillimeter(42.0 * image.PixelHeight / image.PixelWidth) }
};
gfx.DrawImage(image, imageRect);
}
}
Of course, the font objects can be created as static members of your class.
And this is, in short to answer your question, how you consume the XGraphics object to create a PDF page.
Let me know if you need more assistance.
I have a .net Core 3.1 webapi.
I want to get the number of pages are in docs and doc file.
Am using the Syncfusion.DocIO.Net.Core package to perform operation on docs/doc file. But it does't provide feature to update stat of file and display only the PageCount: document.BuiltinDocumentProperties.PageCount;
This is not updated by files. Would you someone suggest me how i can calculate arithmetically.
You can get the page count from BuiltInProperties of the document using DocIO as below. It shows the page count in the document while creating the document using Microsoft Word application and also it still returns the same page count if we manipulate the document using DoIO.
int count = document.BuiltinDocumentProperties.PageCount;
If you want to get the page count, after manipulating the Word document using DocIO, we suggest you to convert the word document to PDF, and then you can retrieve the page count as like below.
WordDocument wordDocument = new WordDocument(fileStream, FormatType.Docx);
DocIORenderer render = new DocIORenderer();
//Sets Chart rendering Options.
render.Settings.ChartRenderingOptions.ImageFormat = ExportImageFormat.Jpeg;
//Converts Word document into PDF document
PdfDocument pdfDocument = render.ConvertToPDF(wordDocument);
int pageCount = pdfDocument.PageCount;
Since Word document is a flow document in which contents will not be preserved page by page; instead the contents will be preserved sequentially section by section. Each section may extend to various pages based on its contents like table, text, images etc.
Whereas Essential DocIO is a non-UI component that provides a full-fledged document object model to manipulate the Word document contents. Hence it is not feasible to get the page count directly from Word document using DocIO.
We have prepared the sample application to get the page counts from the word document and it can be downloaded from the below link.
https://www.syncfusion.com/downloads/support/directtrac/general/ze/GetPageCount16494613
I am using WordprocessingDocument to Read and write content to a word document but when I am opening the document using MemoryStream, it is not showing me the images and header/footer which is already in the word document. Below is the code for the same.
private void AddReport(MainDocumentPart parent, MemoryStream report)
{
using (MemoryStream editingMemoryStream = new MemoryStream())
{
report.Position = 0;
report.CopyTo(editingMemoryStream);
editingMemoryStream.Position = 0;
using (WordprocessingDocument newDoc = WordprocessingDocument.Open(editingMemoryStream, true))
{
WP.Body Template = newDoc.MainDocumentPart.Document.Body;
var Main = newDoc.MainDocumentPart;
var cloneTemplate = Template.CloneNode(true);
parent.Document.Body.PrependChild(new WP.Paragraph(new WP.Run(cloneTemplate)));
parent.Document.Save();
}
}
}
Screenshot for the word document:
enter image description here
In this, the Parent document is the document where I am pre-pending the above document. Any help will be appreciated. Thanks in advance.
The headers, footers and images are not part of the document body, so won't be carried over to another document in the described scenario.
All this information is stored in separate "xml parts" contained within the Word file's "zip package". The Body part contains only refereces (relationship IDs listed in a "rel" part that point/link to the relevant xml part contained in the package).
This can be seen by opening the document in the Open XML SDK Productivity Tool and inspecting the underlying Word Open XML.
In order to copy such content to another document it's necessary to clone not only the body, but also each and every relevant xml part with content you want to have, while dynamically generating the necessary relationships - not a trivial undertaking. There are posts here and elsewhere in the Internet (including my blog, the WordMeister) that demonstrate the basics of how this is done which you could use as starting points for understanding the required approach.
Or, depending on what the Parent document is, it might make more sense to start with a copy of the "new" document and edit it with the other content.
FWIW and mentioned here for the sake of completeness: The COM object model will do what is described - copy the body of the document and paste to another document does carry over all this information. But the Word application is doing all the "heavy lifting" that the developer needs to code when using the Open XML SDK.
I'm having an issue where PdfWriter from iText 7.0.4.0 (.NET 4.5.1) produces corrupted PDF documents for certain input PDF files.
To elaborate, PDF files with well-formed paragraphs have no issues. However, if the input PDF contains irregular contents (for a lack of better words; please refer to the samples in Google drive), PdfWriter produces corrupted PDF files; by corrupted, I mean that the file can be opened, but it shows a blank page with extremely high zoom (in Adobe Reader XI). Corrupted samples have also been provided in the aforementioned Google drive link.
Sample code:
using (var pdfReader = new PdfReader("sample1_input.pdf"))
{
PdfDocument pdfDoc = new PdfDocument(pdfReader, new PdfWriter("sample1_corrupted_output.pdf"));
// Trying to highlight a part of PDF by referencing this example:
// https://developers.itextpdf.com/examples/stamping-content-existing-pdfs/clone-highlighting-text
// Commented out for now because PdfWriter is producing corrupted PDF documents for the samples and similar PDF files.
//PdfCanvas canvas = new PdfCanvas(pdfDoc.GetFirstPage());
//canvas.SetExtGState(new PdfExtGState().SetFillOpacity(0.1f));
//canvas.SaveState();
//canvas.SetFillColor(Color.YELLOW);
//canvas.Rectangle(100, 100, 200, 200);
//canvas.Fill();
//canvas.RestoreState();
pdfDoc.Close(); // Corrupted PDF file is produced, even without highlighting.
}
One "interesting" thing I noticed is that if I provide "new StampingProperties().UseAppendMode()" as the third parameter of PdfDocument (without the highlighting code), PdfWriter spits out the original file (although a few kb larger than the original for some reason). However, PdfWriter goes back to producing corrupted PDFs when the highlighting code is un-commented.
Link to sample files:
https://drive.google.com/open?id=0B3NPOZswWocQV09KMW5fbFVyUm8
sample1_input.pdf (input sample #1) -> sample1_corrupted_output.pdf (corrupted output)
sample2_input.pdf (input sample #2) -> sample2_corrupted_output.pdf (corrupted output)
Please kindly give some advice.
The cause of this corruption is an unusual structure of the page tree of the PDFs in question:
It is unusual in two ways:
It has a subtree without any page objects (dictionaries 17 an 21).
It has a node with mixed child node types (dictionary 10 has a Page child 3 and a Pages child 17)
If one removes the page-less subtree (by removing object 17 from the Kids of object 10), both quirks are removed and the code does not fail anymore.
While both quirks are weird, I don't see anything in ISO 32000-1 (unfortunately I don't have a copy of ISO 32000-2 yet) indicating that these unusual structures are explicitly forbidden. Thus, I would assume this is an iText bug.
I could reproduce the problem with iText 7.0.4 for Java but not the current development SNAPSHOT of 7.0.5.
Indeed, there is a commit dated 2017-09-19 10:03:37 [c0b35f0] described as "Fix bugs in pages tree rebuilding" with differences in the PdfPagesTree class in a code block described as "handle mix of PdfPage and PdfPages". Thus, the issue appears to be known and already fixed.
You may either wait for the 7.0.5 release or look for hotfixes 7.0.4.x.
I have a request to create a word document on the fly based on a template provided to me. I have done some research and everything seems to point at OpenXML. I have looked into that, but the cs file that gets created is over 15k lines and is breaking my VS 2010 (causing it to not respond every time I make a change).
I have been looking at this tutorial series on Open XML
http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2011/10/13/getting-started-with-open-xml-development.aspx
I have done things in the past with text files and Regular Expressions, but since Word encrypts everything, that does not work. Are there any other options that are fairly lightweight for creating word documents from templates.
//Hi, It is quite simple.
//First, you should copy your Template file into another location.
string SourcePath = "C:\\MyTemplate.dotx";
string DestPath = "C:\\MyDocument.docx";
System.IO.File.Copy(SourcePath, DestPath);
//After copying the file, you can open a WordprocessingDocument using your Destination Path.
WordprocessingDocument Mydoc = WordprocessingDocument.Open(DestPath, true);
//After openning your document, you can change type of your document after adding additional parts into your document.
mydoc.ChangeDocumentType(WordprocessingDocumentType.Document);
//If you wish, you can edit your document
AttachedTemplate attachedTemplate1 = new AttachedTemplate() { Id = "MyRelationID" };
MainDocumentPart mainPart = mydoc.MainDocumentPart;
MySettingsPart = mainPart.DocumentSettingsPart;
MySettingsPart.Settings.Append(attachedTemplate1);
MySettingsPart.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate", new Uri(CopyPath, UriKind.Absolute), "MyRelationID");
//Finally you can save your document.
mainPart.Document.Save();
I am currently working on something along these lines and I have been making use of the Open XML SDK and the OpenXmlPowerTools The approach been taken is taking the actual template file opening it up and putting text into various place holders within the template document. I have been using content controls as the place markers.
The SDK tool to open up a document has been invaluable in being able to compare documents and see how it is constructed. However the code generated from the tool I have been refactoring heavily and removing sections that are not being used at all.
I can't talk about doc files but with docx files they are not encrypted they are just zip files that contain xml files
Eric White's blog has a large number of examples and code samples which have been very useful