I'm using iTextSharp to load an existing PDF and add text to it using the PdfStamper. I want full control over the text, meaning I want to be able to control the font (TrueType only), font size and coordinates. Right now, I'm using ShowTextAligned to add text at certain coordinates and SetFontAndSize to set the font and font size. This is my code to add text:
private void AddText(BaseFont font, string text, int x, int y, int size)
{
    pdf.BeginText();
    pdf.SetFontAndSize(font, size);
    pdf.ShowTextAligned(PdfContentByte.ALIGN_LEFT, text, x, y, 0);
    pdf.EndText();
}
The following function is used to load the TrueType font:
public BaseFont GetFont(string font, string encoding)
{
    if (!(font.EndsWith(".ttf") || font.EndsWith(".TTF")))
        font += ".ttf";
    BaseFont basefont;
    basefont = BaseFont.CreateFont(ConfigurationManager.AppSettings["fontdir"] + font, encoding, BaseFont.NOT_EMBEDDED);
    if (basefont == null)
        throw new Exception("Could not load font '" + font + "' with encoding '" + encoding + "'");
    return basefont;
}
The following code is used to load the existing PDF:
Stream outputPdfStream = Response.OutputStream;
PdfReader pdfReader = new PdfReader(new RandomAccessFileOrArray(HttpContext.Current.Request.MapPath("PdfTemplates/" + ConfigurationManager.AppSettings["pdf_template"])), null);
PdfStamper pdfStamper = new PdfStamper(pdfReader, outputPdfStream);
pdf = pdfStamper.GetOverContent(1);
This all works perfectly, except when I try to use different fonts. When AddText is called multiple times with different fonts, the PDF displays a generic error when opened. Is it possible to use different fonts with the ShowTextAligned function, and if so, how?
Not really, no. It'll only handle one font at a time. Out of curiosity what are you doing to get bad pdf output? I'd like to see your code.
Have a look at ColumnText instead. There are quite a few examples floating around and it's well covered in "iText in Action 2nd edition". All the samples from the book are available online.
Thanks for your answer Mark, however I already solved the issue. There was a problem with the Content-Length header I use to tell the browser how large the PDF is. This caused the browser to stop downloading before the entire PDF was actually downloaded. When adding a new font, the PDF size would just exceed the size specified in the Content-Length header, thus resulting in a bad PDF. It's solved now, multiple fonts work just fine :-).
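One way to avoid this kind of size mismatch is to build the PDF in memory first and derive the Content-Length value from the finished bytes. The following is only a sketch using the base class library; `PdfBuffering` and `BufferOutput` are hypothetical names, and the ASP.NET usage in the trailing comment assumes the `pdfReader`, `pdf` and `AddText` members from the question.

```csharp
using System;
using System.IO;

static class PdfBuffering
{
    // Runs the PDF-writing callback against an in-memory buffer and returns
    // the finished bytes, so the exact size is known before anything is sent.
    public static byte[] BufferOutput(Action<Stream> writePdf)
    {
        if (writePdf == null) throw new ArgumentNullException(nameof(writePdf));
        using (var buffer = new MemoryStream())
        {
            writePdf(buffer);
            // MemoryStream.ToArray still works if the writer closed the stream.
            return buffer.ToArray();
        }
    }
}

// Hypothetical use in the page code (identifiers taken from the question):
//   byte[] pdfBytes = PdfBuffering.BufferOutput(output =>
//   {
//       PdfStamper pdfStamper = new PdfStamper(pdfReader, output);
//       pdf = pdfStamper.GetOverContent(1);
//       // ... AddText calls ...
//       pdfStamper.Close();
//   });
//   Response.AddHeader("Content-Length", pdfBytes.Length.ToString());
//   Response.BinaryWrite(pdfBytes);
```

The trade-off is that the whole document is held in memory before it is sent, which is usually acceptable for template-sized PDFs.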
I've been attempting to find an easy solution to exporting a Canvas in my WPF Application to a PDF Document.
So far, the best solution has been to use the PrintDialog and set it up to automatically use the 'Microsoft Print to PDF' printer. The only problem I have had with this is that although the PrintDialog is skipped, there is a FileDialog to choose where the file should be saved.
Sadly, this is a deal-breaker because I would like to run this over a large number of canvases with automatically generated PDF names (well, programmatically provided anyway).
Other solutions I have looked at include:
Using PrintDocument, but from my experimentation I would have to manually iterate through all my Canvas's children and manually invoke the correct Draw method (which would be rather time-consuming for my many custom elements with transformations).
Exporting as a PNG image and then embedding that in a PDF. Although this works, TextBlocks within my canvas are no longer text, so this isn't ideal.
Using the 3rd party library PDFSharp, which has the same downside as PrintDocument: a lot of custom logic for each element.
With PDFSharp, I did find a method for generating the XGraphics from a Canvas, but no way of then consuming that object to make a PDF page.
So does anybody know how I can skip or automate the PDF PrintDialog, or consume PDFSharp XGraphics to make a page? Or are there any other ideas for directions to take this, besides writing a whole library to convert each of my Canvas elements to PDF elements?
If you look at the output port of a recent Windows installation of Microsoft Print to PDF, you may note it is set to PORTPROMPT: and that is exactly what causes the request for a filename.
You might note lower down, I have several ports set to a filename, and the fourth one down is called "My Print to PDF".
It is very last-century methodology: when I print with a duplicate printer under a different name, I can use different page ratios etc. without altering the built-in standard one. The output file will naturally be built:
A) Exactly in one repeatable location, which I can monitor so the file can be renamed based on the source calling the print sequence; if it is my current default printer, I can right-click files to print to a known \folder\file.pdf
B) The same port can be used via certain /pt (printto) command combinations to output not just to that default port location, but to a given folder\name, such as
"%ProgramFiles%\Windows NT\Accessories\WORDPAD.EXE" /pt listIN.doc "My Print to PDF" "My Print to PDF" "listOUT.pdf"
Other drivers usually charge for the convenience of WPF programmable renaming, but I will leave you that PrintVisual challenge for another of your three wishes.
MS suggest XPS is best, but then they would be promoting it as a PDF competitor.
It does not need to be Doc[X]2PDF; it could be [O]XPS2PDF or aPNG2PDF or many-page TIFF2PDF etc. Any of those are native to Win 10, and other 3rd party apps such as [Free]Office with a PrintTo verb will do XLS[X]2PDF. Imagination becomes pagination.
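Option A (a fixed output location that is monitored and renamed) could be sketched roughly as follows. This uses only the base class library; `PrintOutputCollector` and `TryCollect` are hypothetical names, and the retry loop assumes the print spooler keeps the file locked while it is still writing, which is worth verifying on your own system.

```csharp
using System;
using System.IO;
using System.Threading;

static class PrintOutputCollector
{
    // Tries to move the fixed-name file produced by the printer port to
    // targetPath. Returns false if the file never appears, stays locked, or
    // the target already exists, once all attempts are used up.
    public static bool TryCollect(string fixedOutputPath, string targetPath,
                                  int attempts = 20, int delayMs = 250)
    {
        for (int i = 0; i < attempts; i++)
        {
            try
            {
                if (File.Exists(fixedOutputPath))
                {
                    File.Move(fixedOutputPath, targetPath);
                    return true;
                }
            }
            catch (IOException)
            {
                // Assumed case: the file is still being written or is locked.
                // Fall through and retry after the delay.
            }
            Thread.Sleep(delayMs);
        }
        return false;
    }
}
```

Calling `TryCollect` after each print job, with a target name generated from the canvas being exported, keeps the port's fixed file free for the next job.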
I had great success generating PDFs using PDFSharp in combination with SkiaSharp (for more advanced graphics).
Let me begin from the very end:
you save the PdfDocument object in the following way:
PdfDocument yourDocument = ...;
string filename = @"your\file\path\document.pdf";
yourDocument.Save(filename);
creating the PdfDocument with a page can be achieved the following way (adjust the parameters to fit your needs):
PdfDocument yourDocument = new PdfDocument();
yourDocument.PageLayout = PdfPageLayout.SinglePage;
yourDocument.Info.Title = "Your document title";
PdfPage yourPage = yourDocument.AddPage();
// orientation and size are properties of the page, not of the document
yourPage.Orientation = PageOrientation.Landscape;
yourPage.Size = PageSize.A4;
the PdfPage object's content (as an example I'm putting a string and an image) is filled in the following way:
using (XGraphics gfx = XGraphics.FromPdfPage(yourPage))
{
    XFont yourFont = new XFont("Helvetica", 20, XFontStyle.Bold);
    // draw a centered string 10 mm below the top of the page
    gfx.DrawString(
        "Your string in the page",
        yourFont,
        XBrushes.Black,
        new XRect(0, XUnit.FromMillimeter(10), yourPage.Width, yourFont.GetHeight()),
        XStringFormats.Center);
    using (Stream s = new FileStream(@"path\to\your\image.png", FileMode.Open))
    {
        XImage image = XImage.FromStream(s);
        // 42 mm wide; the height keeps the image's aspect ratio
        var imageRect = new XRect()
        {
            Location = new XPoint() { X = XUnit.FromMillimeter(42), Y = XUnit.FromMillimeter(42) },
            Size = new XSize() { Width = XUnit.FromMillimeter(42), Height = XUnit.FromMillimeter(42.0 * image.PixelHeight / image.PixelWidth) }
        };
        gfx.DrawImage(image, imageRect);
    }
}
Of course, the font objects can be created as static members of your class.
And this, in short, answers your question: it is how you consume the XGraphics object to create a PDF page.
Let me know if you need more assistance.
I am using iTextSharp for PDF processing, and I need to extract all text from an existing PDF that is written in a certain font.
A way to do that is to inherit from a RenderFilter and only allow text that has a certain PostscriptFontName. The problem is that when I do this, I see the following font names in the PDF:
CIDFont+F1
CIDFont+F2
CIDFont+F3
CIDFont+F4
CIDFont+F5
which is nothing like the actual font names I am looking for.
I have tried enumerating the font resources, and it shows the same result.
I have tried opening the PDF in the full Adobe Acrobat. It also shows the mangled font names.
I have tried analysing the file with iText RUPS. Same result.
That is, I have not been able to see the actual font names anywhere in the document structure.
Yet, Adobe Acrobat DC does show the correct font names in the Format pane when I select various text boxes on the document canvas (e.g. Arial, Courier New, Roboto), so that information must be stored somewhere.
How do I get those real font names when parsing PDFs with iTextSharp?
As determined in the course of the comments to the question, the font names are anonymized in all PDF metadata for the font but the embedded font program itself contains the actual font name.
(So the PDF strictly speaking is broken, even though in a way hardly any software will ever complain about.)
If we want to retrieve those names, therefore, we have to look inside these font programs.
Here is a proof of concept following the architecture used in the answer you referenced, i.e. using a RenderFilter:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Windows.Media; // GlyphTypeface
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

class FontProgramRenderFilter : RenderFilter
{
    public override bool AllowText(TextRenderInfo renderInfo)
    {
        DocumentFont font = renderInfo.GetFont();
        PdfDictionary fontDict = font.FontDictionary;
        PdfName subType = fontDict.GetAsName(PdfName.SUBTYPE);
        if (PdfName.TYPE0.Equals(subType))
        {
            // composite font: the font program hangs off the descendant font's descriptor
            PdfArray descendantFonts = fontDict.GetAsArray(PdfName.DESCENDANTFONTS);
            PdfDictionary descendantFont = descendantFonts[0] as PdfDictionary;
            PdfDictionary fontDescriptor = descendantFont.GetAsDict(PdfName.FONTDESCRIPTOR);
            PdfStream fontStream = fontDescriptor.GetAsStream(PdfName.FONTFILE2);
            byte[] fontData = PdfReader.GetStreamBytes((PRStream)fontStream);
            MemoryStream dataStream = new MemoryStream(fontData);
            dataStream.Position = 0;
            // load the embedded TrueType program and read its family names
            MemoryPackage memoryPackage = new MemoryPackage();
            Uri uri = memoryPackage.CreatePart(dataStream);
            GlyphTypeface glyphTypeface = new GlyphTypeface(uri);
            memoryPackage.DeletePart(uri);
            ICollection<string> names = glyphTypeface.FamilyNames.Values;
            return names.Any(name => name.Contains("Arial"));
        }
        else
        {
            // analogous code for other font subtypes
            return false;
        }
    }
}
The MemoryPackage class is from this answer which was my first find searching for how to read information from a font in memory using .Net.
Applied to your PDF file like this:
using (PdfReader pdfReader = new PdfReader(SOURCE))
{
FontProgramRenderFilter fontFilter = new FontProgramRenderFilter();
ITextExtractionStrategy strategy = new FilteredTextRenderListener(
new LocationTextExtractionStrategy(), fontFilter);
Console.WriteLine(PdfTextExtractor.GetTextFromPage(pdfReader, 1, strategy));
}
the result is
This is Arial.
Beware: This is a mere proof of concept.
On one hand you will surely also need to implement the part commented as analogous code for other font subtypes above; and even the TYPE0 part is not ready for production use as it only considers FONTFILE2 and does not handle null values gracefully.
On the other hand you will want to cache names for fonts already inspected.
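The caching mentioned above could look roughly like this. It is a generic sketch using only the base class library: `FontDecisionCache` is a hypothetical helper, and what serves as the key (for example the object number of the font dictionary) is an assumption left to the caller.

```csharp
using System;
using System.Collections.Generic;

// Remembers the allow/deny decision per font so that each embedded font
// program is inspected only once.
sealed class FontDecisionCache<TKey>
{
    private readonly Dictionary<TKey, bool> decisions = new Dictionary<TKey, bool>();
    private readonly Func<TKey, bool> inspect;

    public FontDecisionCache(Func<TKey, bool> inspect)
    {
        if (inspect == null) throw new ArgumentNullException(nameof(inspect));
        this.inspect = inspect;
    }

    // Returns the cached decision, running the expensive inspection on first use.
    public bool Allow(TKey fontKey)
    {
        bool allowed;
        if (!decisions.TryGetValue(fontKey, out allowed))
        {
            allowed = inspect(fontKey);
            decisions[fontKey] = allowed;
        }
        return allowed;
    }
}
```

Inside AllowText, the font program parsing would then move into the `inspect` delegate, and the filter would only call `Allow` with the chosen key.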
I am using PdfStamper to create a PDF at run time. My problem is that I am not able to increase the font size of a PDF field. I have tried this, but with no luck:
stamper.AcroFields.SetFieldProperty("names", "textsize", 4f, null);
Font font = FontFactory.GetFont(FontFactory.COURIER, 2f, iTextSharp.text.Font.BOLD);
stamper.AcroFields.AddSubstitutionFont(font.BaseFont);
I got it working using
stamper.AcroFields.SetFieldProperty("names", "textsize", 4f, null);
but it has to be set before the field is filled.
Which version of iTextSharp are you using? I have 5.0.6.0 and the following line of code works for me:
stamper.AcroFields.SetFieldProperty("SomeDateField", "textsize", 8f, null);
However, I encountered an oddity... the above line only works for me if that field's font size is set to Auto. When it is set to a fixed font-size, I can't seem to change it through code (I tried several different ways that I had come across).
I'd be curious if you experience the same if you set that field to Auto font-size in Acrobat.
From what I've been able to ascertain, it looks like the font size is completely relative to the width and height of the text field. I have played around with a few approaches to try to "re-size" the text at run time, but none have yielded any results. The only "false positive" I was able to produce was when I re-sized the text field manually. Sorry this wasn't more helpful in solving your problem; I just figured I would share my experience with the same problem. I'll keep an eye out for any solutions though, and if you manage to come up with one, please post it, because it would be very valuable knowledge.
To change font size for every form field you could do it like this:
using (PdfReader pdfReader = new PdfReader(fileInfo.FullName))
{
    using (var ms = new MemoryStream())
    {
        using (var pdfStamper = new PdfStamper(pdfReader, ms))
        {
            SetAcroFields(pdfStamper, myModel);
            // flatten the form to remove editing options, set it to false
            // to leave the form open to subsequent manual edits
            pdfStamper.FormFlattening = true;
            var pdfFormFields = pdfStamper.AcroFields;
            foreach (var f in pdfReader.AcroFields.Fields)
            {
                // Change font size here if auto should not be used
                pdfFormFields.SetFieldProperty(f.Key.ToString(), "textsize", 8f, null);
            }
        }
        return ms.ToArray();
    }
}
I have a complex application producing PDFs via PDFSharp. I'm running into a problem which is proving very difficult to solve.
When rendering rotated images (text is an image as well), the PDF produced looks fine on screen, but when printed it has jagged edges and is generally messed up -- see attachment.
Here is the relevant code:
// determine how big the image should be
double destinationWidth = Math.Round(pageWidth * imageInfo.WidthFactor);
double destinationHeight = destinationWidth;
// rescale the image to needed size
imageInfo.Image = ImageHelper.ResizeImage(imageInfo.Image, (int)(destinationWidth * 3), (int)(destinationHeight * 3));
// get image
XImage xImage = XImage.FromGdiPlusImage(imageInfo.Image);
// define fill area
XRect destination = new XRect();
destination.X = imageInfo.XFactor * pageWidth;
destination.Y = imageInfo.YFactor * pageHeight;
destination.Width = destinationWidth; //pageWidth * imageInfo.WidthFactor;
destination.Height = destinationHeight; //destination.Width; // shouldn't this use the page height and height factor?
// save state before rotate
XGraphicsState previousState = gfx.Save();
// rotate canvas
gfx.RotateAtTransform(imageInfo.RotationAngle, new XPoint(destination.X + destination.Width / 2, destination.Y + destination.Height / 2));
// render image
gfx.DrawImage(xImage, destination);
// undo transforms
gfx.Restore(previousState);
Please, please, help. It prints fine from Chrome's PDF viewer, for what it's worth.
I attempted converting the images to SVG (pixel by pixel) and rendering, which worked fine, but performance made it not feasible. I need to find a more elegant solution.
Thanks so much!
PDF:
https://dl.dropbox.com/u/49564994/PDF.pdf
Print-out:
https://dl.dropbox.com/u/49564994/Print.jpg
Almost two years ago I had a similar problem. A generated PDF was all garbled up when I printed it. It was just a report, did not contain any images, but several sentences or words were missing.
I used a Word template, replaced some placeholders to generate a report and then saved the Word document to PDF using the Office Save As PDF add-in.
There is a difference between printing the PDF with a PCL printer driver and with a PostScript one. Check whether you get different results between the two. It might be a font problem, so check that the font encoding is set correctly.
At the time I did not find a solution. Finally resorted to converting the PDF to an image and sending that to the printer. Worked fine.
This should also be possible using PDFSharp by invoking GhostScript to create images from PDF pages.
I'm not sure that this is possible but I figured it would be worth asking. I have figured out how to set the font of a formfield using the pdfstamper and acrofields methods but I would really like to be able to set the font of different parts of the text in the same field. Here's how I'm setting the font of the form fields currently:
// Use iTextSharp PDF Reader, to get the fields and send to the
//Stamper to set the fields in the document
PdfReader pdfReader = new PdfReader(fileName);
// Initialize Stamper (ms is a MemoryStream object)
PdfStamper pdfStamper = new PdfStamper(pdfReader, ms);
// Get Reference to PDF Document Fields
AcroFields pdfFormFields = pdfStamper.AcroFields;
//create a bold font
iTextSharp.text.Font bold = FontFactory.GetFont(FontFactory.COURIER, 8f, iTextSharp.text.Font.BOLD);
//set the field to bold
pdfFormFields.SetFieldProperty(nameOfField, "textfont", bold.BaseFont, null);
//set the text of the form field
pdfFormFields.SetField(nameOfField, "This: Will Be Displayed In The Field");
// Flatten the form; set this to false instead if the document should stay editable
pdfStamper.FormFlattening = true;
// close the pdf stamper
pdfStamper.Close();
What I'd like to be able to do where I set the text above is set the "This: " to bold and leave the "Will Be Displayed In The Field" non-bolded. I'm not sure this is actually possible but I figured it was worth asking because it would really be helpful in what I'm currently working on.
Thanks in advance!
Yes, kinda. PDF fields can have a rich text value (since Acrobat 6 / PDF 1.5) along with a regular value.
The regular value uses the font defined in the default appearances... a single font.
The rich value format (which iText doesn't support directly, at least not yet) is described in chapter 12.7.3.4 of the PDF Reference. It supports <b>, <i>, <p>, and quite a few CSS2 text attributes, and it requires a <body> element with various attributes.
To enable rich values, you have to set bit 26 of the field flags (PdfName.FF) for a text field. PdfFormField doesn't have a "setRichValue", but they're dictionaries, so you can just:
myPdfFormField.put(PdfName.RV, new PdfString( richTextValue ) );
If you're trying to add rich text to an existing field that doesn't already support it:
AcroFields fields = stamper.getAcroFields();
AcroFields.Item fldItem = fields.getFieldItem(fldName);
PdfDictionary mergedDict = fldItem.getMerged(0);
// FF is optional; a missing entry means no flags are set
PdfNumber ffNumber = mergedDict.getAsNumber(PdfName.FF);
int flagVal = (ffNumber == null) ? 0 : ffNumber.intValue();
// bit positions in the PDF spec are 1-based, so bit 26 is 1 << 25
flagVal |= (1 << 25);
int writeFlags = AcroFields.Item.WRITE_MERGED | AcroFields.Item.WRITE_VALUE;
fldItem.writeToAll(PdfName.FF, new PdfNumber(flagVal), writeFlags);
fldItem.writeToAll(PdfName.RV, new PdfString(richTextValue), writeFlags);
I'm actually adding rich text support to iText (not sharp) as I type this message. Hurray for contributors on SO. Paulo's been good about keeping iTextSharp in synch lately, so that shouldn't be an issue. The next trunk release should have this feature... so you'd be able to write:
myPdfFormField.setFieldFlags( PdfFormField.FF_RICHTEXT );
myPdfFormField.setRichValue( richTextValue );
or
// note that this will fail unless the rich flag is set
acroFields.setFieldRichValue( richTextValue );
NOTE: iText's appearance generation hasn't been updated, just the value side of things. That would take considerably more work. So you'll want to call acroFields.setGenerateAppearances(false), or have JS that resets the field value when the form is opened, to force Acrobat/Reader to build the appearance[s] for you.
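Because the PDF specification numbers flag bits starting at 1, bit position 26 corresponds to the mask `1 << 25` (33554432), not `1 << 26`. The following sketch shows the conversion in C# without any iText dependency; `FieldFlags` and `MaskForBitPosition` are hypothetical names used only for illustration.

```csharp
using System;

static class FieldFlags
{
    // Converts a 1-based bit position, as used in the PDF specification's
    // field flag tables, into the integer mask to OR into the /Ff value.
    public static int MaskForBitPosition(int position)
    {
        if (position < 1 || position > 32)
            throw new ArgumentOutOfRangeException(nameof(position));
        return 1 << (position - 1);
    }

    // Returns the flags with the RichText bit (position 26) switched on.
    public static int WithRichText(int existingFlags)
    {
        return existingFlags | MaskForBitPosition(26);
    }
}
```

For example, a multiline text field (bit position 13, mask 4096) that gains the rich text flag ends up with a flag value of 4096 + 33554432 = 33558528.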
It took me some time to figure this out, after the rich text field did not work the way it was supposed to with AcroFields; in most cases the PDF was not editable with different fonts at runtime.
I worked out a way of setting different fonts in AcroFields, passing in values and keeping the field editable at runtime with iTextSharp, and I thought it would be useful for others.
Create a PDF with a text field in PDF1.pdf (I hope you know how to create a field in a PDF),
e.g., txtComments
Go to the properties section and set the field to rich text and multiline.
Format the text content in Word or in the PDF by adding the fonts and colors.
If done in Word, copy and paste the content into the txtComments field in the PDF.
Note:
If you want to add dynamic content, put the placeholder "{0}" in the txtComments field in the PDF.
Using the string.Format method you can set values for it. This is shown in the code below.
e.g., "Set different parts of a form field to have different fonts using {0}"
Add the following code to a button event (this is not specific), after adding a reference to itextsharp.dll 5.4.2
Response.ContentType = "application/pdf";
Response.AddHeader("Content-disposition", "attachment; filename=your.pdf");
PdfReader reader = new PdfReader(@"C:\test\pdf1.pdf");
PdfStamper stamp = new PdfStamper(reader, Response.OutputStream);
AcroFields field = stamp.AcroFields;
// read the rich text template (containing "{0}") from the field
string comments = field.GetFieldRichValue("txtComments");
string Name = "Test1";
string value = string.Format(comments, Name);
field.SetField("txtComments", value);
field.GenerateAppearances = false; // the PDF viewer builds the appearance
stamp.FormFlattening = false; // keep the field editable at run time
stamp.FreeTextFlattening = true;
stamp.Close();
reader.Close();
Hope this helps.