iTextSharp: generate Arabic font - C#

I have a solution that contains ASP.NET Web API projects.
One method uses iTextSharp to create a PDF document.
The document contains French and Arabic text.
I use this code to get the Arabic font:
    public static BaseFont GetArabicFont()
    {
        var appDomain = System.AppDomain.CurrentDomain;
        var basePath = appDomain.BaseDirectory;
        var fontPath = Path.Combine(basePath, "fonts", "pdf", "ARIALUNI.TTF");
        try
        {
            BaseFont bf = BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
            return bf;
        }
        catch (Exception ex)
        {
            return BaseFont.CreateFont(BaseFont.TIMES_ROMAN, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        }
    }
When I run the application on my machine, the PDF document is generated correctly (including the French and Arabic text).
After deploying the Web API project to IIS 7, calling the method that generates the PDF returns no response.
When I use Postman to call the API directly, I see this message:
    "Message": "An error has occurred.",
    "ExceptionMessage": "Identity-H is not a supported encoding name"
I do not know whether the problem is in IIS or whether I must change IDENTITY_H.
Can someone help?

This doesn't make sense:
    BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
If you use BaseFont.IDENTITY_H, iText will always embed the font. If it didn't, iText would create PDFs that aren't compliant with ISO-32000-1. It's more correct to do this:
    BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
This isn't really important, as iText will ignore your error and embed the font anyway, even if you tell iText not to embed it. That is why your code works correctly on your machine.
You say that the same code doesn't work on IIS. I am assuming that the fontPath to ARIALUNI.TTF doesn't result in a font on IIS. Maybe the font is missing; maybe IIS doesn't have access to that font. In that case, an exception is thrown and the following line is executed:
    return BaseFont.CreateFont(BaseFont.TIMES_ROMAN, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
This line doesn't make sense either, as BaseFont.IDENTITY_H can't be used in combination with BaseFont.TIMES_ROMAN. Moreover, Times-Roman doesn't contain any Arabic glyphs, and you can't embed Times-Roman unless you provide the PFB file along with the AFM file.
The solution to your problem is to make sure that ArialUni.ttf is present on your server, or that you provide another font that supports Arabic.
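One way to make a missing font visible on the server, rather than hiding it behind the Times-Roman fallback, is to check the path before handing it to iText. The following is only a minimal sketch using the .NET base library; the names `FontLocator` and `ResolveFontPath` are hypothetical, and the folder layout is the one from the question.

```csharp
using System;
using System.IO;

public static class FontLocator
{
    // Hypothetical helper: returns the full path to the font file, or throws
    // with the resolved path so a missing file on the server is easy to spot.
    public static string ResolveFontPath(string baseDirectory, string fileName)
    {
        // Same layout as in the question: <base>/fonts/pdf/<file>
        string fontPath = Path.Combine(baseDirectory, "fonts", "pdf", fileName);
        if (!File.Exists(fontPath))
        {
            throw new FileNotFoundException(
                "Font file not found or not readable; deploy it with the application.",
                fontPath);
        }
        return fontPath;
    }
}
```

The returned path could then be passed to BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED), with AppDomain.CurrentDomain.BaseDirectory as the base directory. File.Exists also returns false when the process lacks permission to read the file, so the exception covers both cases mentioned above.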

Related

How do I extract actual font names from a PDF with iTextSharp?

I am using iTextSharp for PDF processing, and I need to extract all text from an existing PDF that is written in a certain font.
A way to do that is to inherit from a RenderFilter and only allow text that has a certain PostscriptFontName. The problem is that when I do this, I see the following font names in the PDF:
CIDFont+F1
CIDFont+F2
CIDFont+F3
CIDFont+F4
CIDFont+F5
which is nothing like the actual font names I am looking for.
I have tried enumerating the font resources, and it shows the same result.
I have tried opening the PDF in the full Adobe Acrobat. It also shows the mangled font names.
I have tried analysing the file with iText RUPS. Same result.
That is, I have not been able to see the actual font names anywhere in the document structure.
Yet, Adobe Acrobat DC does show the correct font names in the Format pane when I select various text boxes on the document canvas (e.g. Arial, Courier New, Roboto), so that information must be stored somewhere.
How do I get those real font names when parsing PDFs with iTextSharp?

As determined in the course of the comments to the question, the font names are anonymized in all PDF metadata for the font, but the embedded font program itself contains the actual font name.
(So the PDF strictly speaking is broken, even though in a way hardly any software will ever complain about.)
If we want to retrieve those names, therefore, we have to look inside these font programs.
Here is a proof of concept following the architecture used in this answer you referenced, i.e. using a RenderFilter:
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Windows.Media;   // GlyphTypeface (WPF)
    using iTextSharp.text.pdf;
    using iTextSharp.text.pdf.parser;

    class FontProgramRenderFilter : RenderFilter
    {
        public override bool AllowText(TextRenderInfo renderInfo)
        {
            DocumentFont font = renderInfo.GetFont();
            PdfDictionary fontDict = font.FontDictionary;
            PdfName subType = fontDict.GetAsName(PdfName.SUBTYPE);
            if (PdfName.TYPE0.Equals(subType))
            {
                // Composite font: the font program hangs off the descendant font's descriptor.
                PdfArray descendantFonts = fontDict.GetAsArray(PdfName.DESCENDANTFONTS);
                PdfDictionary descendantFont = descendantFonts[0] as PdfDictionary;
                PdfDictionary fontDescriptor = descendantFont.GetAsDict(PdfName.FONTDESCRIPTOR);
                PdfStream fontStream = fontDescriptor.GetAsStream(PdfName.FONTFILE2);
                byte[] fontData = PdfReader.GetStreamBytes((PRStream)fontStream);

                // Load the embedded TrueType program from memory and read its family names.
                MemoryStream dataStream = new MemoryStream(fontData);
                dataStream.Position = 0;
                MemoryPackage memoryPackage = new MemoryPackage();
                Uri uri = memoryPackage.CreatePart(dataStream);
                GlyphTypeface glyphTypeface = new GlyphTypeface(uri);
                memoryPackage.DeletePart(uri);
                ICollection<string> names = glyphTypeface.FamilyNames.Values;

                // Only let text through whose font family name contains "Arial".
                return names.Any(name => name.Contains("Arial"));
            }
            else
            {
                // analogous code for other font subtypes
                return false;
            }
        }
    }
The MemoryPackage class is from this answer which was my first find searching for how to read information from a font in memory using .Net.
Applied to your PDF file like this:
    using (PdfReader pdfReader = new PdfReader(SOURCE))
    {
        FontProgramRenderFilter fontFilter = new FontProgramRenderFilter();
        ITextExtractionStrategy strategy = new FilteredTextRenderListener(
            new LocationTextExtractionStrategy(), fontFilter);
        Console.WriteLine(PdfTextExtractor.GetTextFromPage(pdfReader, 1, strategy));
    }
the result is
This is Arial.
Beware: This is a mere proof of concept.
On one hand you will surely also need to implement the part commented as analogous code for other font subtypes above; and even the TYPE0 part is not ready for production use as it only considers FONTFILE2 and does not handle null values gracefully.
On the other hand you will want to cache names for fonts already inspected.
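The caching mentioned in the last point could look roughly like the following. This is only a sketch and is independent of iTextSharp: `FontNameCache`, `GetOrLoad` and the string key are hypothetical, and which value identifies a font uniquely within one PDF is left open here.

```csharp
using System;
using System.Collections.Generic;

public sealed class FontNameCache
{
    // Maps a caller-chosen font key to the family names read from its font program.
    private readonly Dictionary<string, string[]> cache =
        new Dictionary<string, string[]>();

    // Number of times the expensive loader actually ran (for illustration only).
    public int Loads { get; private set; }

    public string[] GetOrLoad(string fontKey, Func<string[]> loadFamilyNames)
    {
        string[] names;
        if (!cache.TryGetValue(fontKey, out names))
        {
            // First time this font is seen: parse the font program once and remember it.
            names = loadFamilyNames();
            cache[fontKey] = names;
            Loads++;
        }
        return names;
    }
}
```

In the filter, the loader would wrap the font-program parsing, so that AllowText does the expensive work once per font rather than once per text chunk. The sketch is not thread-safe.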

MigraDoc C# Checkboxes - Wingdings Not Working

I need to simulate a checkbox in a PDF I am generating using the MigraDoc library. I stumbled across two sources that offer essentially the same solution (here and here).
However, I am not getting the expected results. Instead I am getting þ for boxes that are supposed to be checked, and ¨ for those that are to be unchecked. What might the issue be?
Snippet of my code:
    para = section.AddParagraph();
    para.Style = "ListLevelOne";
    para.AddFormattedText("1 ", "Bold");
    para.AddFormattedText(IsQ1Checked ? "\u00fe" : "\u00A8", new Font("Wingdings"));

MigraDoc does not use the font "Wingdings"; instead it uses a default font (could be MS Sans or so), and therefore you see the characters from a standard font, not the Wingdings symbols.
The problem is somewhere outside the code snippet you are showing here. Make sure the font Wingdings is installed on the computer.
You may want to embed the font and ensure that MigraDoc uses Unicode encoding instead of ANSI:
    private const bool unicode = true;
    private const PdfFontEmbedding embedding = PdfFontEmbedding.Always;
    //...
    var pdfRenderer = new PdfDocumentRenderer(unicode, embedding);
http://www.pdfsharp.net/wiki/migradochelloworld-sample.ashx

Fonts don't load in the pdf viewer

I currently have a problem with PdfSharp/MigraDoc and a PDF viewer. I have used the EZFontResolver made by Thomas to be able to generate PDFs with custom fonts. Unfortunately the PDF viewer is unable to render the font, and I have no idea why. I have seen a bug described by Travis on Thomas' blog, which noted that if EZFontResolver doesn't recognize multiple bold/italic markers (for example "fontname|b|b"), then PdfDocumentRenderer.RenderDocument() fails. The point is, when I try something like this:
    Document document = DdlReader.DocumentFromString(ddl);
    _renderer = new DocumentRenderer(document);
    _renderer.PrepareDocument();
then the EZFontResolver is asked for fonts with names like "customfont|b|b" instead of "customfont" (this doesn't happen when I use only PdfDocument.Save(...)).
My pdf viewer overrides DocumentViewer and views FixedDocument class instances. The funny thing is that the saved pdf file has all the fonts set, but the preview is unable to do that (and that is my big problem). All of this happens even though I return the right font with the resolver.
EDIT:
The ddl is a string which looks something like this:
    "\\document
    [
      Info
      {
        Title = \"My file\"
        Subject = \"My pdf file\"
        Author = \"mikes\"
      }
    ]
    {
      \\styles
      {
        Heading1 : Normal
        {
          Font
          {
            Name = \"My custom font\"
            Bold = true
          }
          ParagraphFormat
          {
            Alignment = Center
            SpaceBefore = \"0.5cm\"
            SpaceAfter = \"0.5cm\"
          }
        }
        header : Normal
        {
          Font
          {
            Name = \"My custom font\"
            Size = 6
          }
          ParagraphFormat
          {
            Alignment = Center
          }
        }
When I removed the bug fix by Travis, the exception was thrown in _renderer.PrepareDocument() (with the fix in place, the stack trace showed that the multiple "|b" also came from there).

Simulated bold and simulated italics use the regular font, but a transformation is applied.
Therefore the simulation will not work if the PDF viewer does not support those transformations.
The DocumentViewer that comes with MigraDoc does not display PDF files, it displays MigraDoc documents. For technical reasons it cannot use fonts supplied via the IFontResolver interface. EZFontResolver is an implementation of IFontResolver.
With respect to "customfont|b|b": I cannot say whether this is a bug or the regular behaviour. Please provide an MCVE (complete sample) if you think it is a bug.

Saving PDF with cyrillic text in Unity3D using iTextSharp

I create a BaseFont using this code:
    string combineStr = Path.Combine(Application.dataPath+"/Resources/Fonts", "Calibri Regular.ttf");
    BaseFont bf = BaseFont.CreateFont(combineStr, BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
    iTextSharp.text.Font titleFont = new iTextSharp.text.Font(bf, 10f, iTextSharp.text.Font.NORMAL, BaseColor.BLACK);
When I test it in the Unity editor everything works fine, and my Cyrillic glyphs render well, but in a WindowsPlayer build I get an error:
    ArgumentException: Encoding name 'windows-1252' not supported
    Parameter name: name.
I can't find any problem here. I have checked everything, and as far as I can see my BaseFont code is correct. What is wrong in my case?
P.S. I have also tried other fonts with Cyrillic support, but nothing helps.

The OP added the solution to the question. I'm removing it from the question and adding it as a real answer:
The issue here is that I18N.dll and I18N.West.dll are missing in the standalone player. They are available in the editor, though. That's why it works in the editor but not in the standalone player.
Solution: Put those DLLs into your project (probably best next to System.Data.dll); that way, they will also be available in the standalone player.
See also CodePage 1252 not supported - works in editor but not in standalone player on Unity Answers.
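To confirm whether an encoding is actually available in a given build, a small check like the following can help. It is a sketch using only System.Text; `EncodingCheck` and `IsAvailable` are hypothetical names, not part of Unity or iTextSharp.

```csharp
using System;
using System.Text;

public static class EncodingCheck
{
    // Returns true if the runtime can supply the named encoding, false otherwise.
    public static bool IsAvailable(string encodingName)
    {
        try
        {
            Encoding.GetEncoding(encodingName);
            return true;
        }
        catch (ArgumentException)
        {
            // Thrown when the name is unknown or unsupported, as in the error above.
            return false;
        }
        catch (NotSupportedException)
        {
            return false;
        }
    }
}
```

Logging the result of EncodingCheck.IsAvailable("windows-1252") at startup, in both the editor and the standalone player, should show whether the encoding assemblies made it into the build.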

Using different fonts in PDF using iTextSharp and PDFStamper

I'm using iTextSharp to load an existing PDF and add text using the PdfStamper. I want full control over the text, meaning I want to be able to control the font (only TrueType), font size and coordinates. Right now, I'm using ShowTextAligned to add text at certain coordinates and SetFontAndSize to set the font and font size. This is my code to add text:
    private void AddText(BaseFont font, string text, int x, int y, int size)
    {
        pdf.BeginText();
        pdf.SetFontAndSize(font, size);
        pdf.ShowTextAligned(PdfContentByte.ALIGN_LEFT, text, x, y, 0);
        pdf.EndText();
    }
The following function is used to load the TrueType font:
    public BaseFont GetFont(string font, string encoding)
    {
        if (!(font.EndsWith(".ttf") || font.EndsWith(".TTF")))
            font += ".ttf";
        BaseFont basefont;
        basefont = BaseFont.CreateFont(ConfigurationManager.AppSettings["fontdir"] + font, encoding, BaseFont.NOT_EMBEDDED);
        if (basefont == null)
            throw new Exception("Could not load font '" + font + "' with encoding '" + encoding + "'");
        return basefont;
    }
The following code is used to load the existing PDF:
    Stream outputPdfStream = Response.OutputStream;
    PdfReader pdfReader = new PdfReader(new RandomAccessFileOrArray(HttpContext.Current.Request.MapPath("PdfTemplates/" + ConfigurationManager.AppSettings["pdf_template"])), null);
    PdfStamper pdfStamper = new PdfStamper(pdfReader, outputPdfStream);
    pdf = pdfStamper.GetOverContent(1);
This all works perfectly, except when I try to use different fonts. When AddText is called multiple times with different fonts, the PDF displays a generic error when opened. I wonder whether it is possible to use different fonts with the ShowTextAligned function and, if it is, how?

Not really, no. It'll only handle one font at a time. Out of curiosity, what are you doing to get bad PDF output? I'd like to see your code.
Have a look at ColumnText instead. There are quite a few examples floating around, and it's well covered in "iText in Action 2nd edition". All the samples from the book are available online.
Thanks for your answer Mark, however I already solved the issue. There was a problem with the Content-Length header I use to tell the browser how large the PDF is. This caused the browser to stop downloading before the entire PDF was actually downloaded. When adding a new font, the PDF size would just exceed the size specified in the header, thus resulting in a bad PDF. It's solved now, multiple fonts work just fine :-).
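A size header that is computed before the PDF is complete can easily be wrong, as described above. One way to avoid that is to build the PDF in memory first and take the size from the finished bytes. The following is a minimal sketch using only the base library; `PdfBuffer` and `Render` are hypothetical names, and the actual PDF writing is passed in as a delegate.

```csharp
using System;
using System.IO;

public static class PdfBuffer
{
    // Runs the supplied writer against an in-memory stream and returns the
    // finished bytes. ToArray() still works after the stream has been closed,
    // which matters if the writer closes the stream when it is done.
    public static byte[] Render(Action<Stream> writePdf)
    {
        var buffer = new MemoryStream();
        writePdf(buffer);
        return buffer.ToArray();
    }
}
```

With the code from the question, the delegate would create the PdfStamper over the supplied stream, add the text and close the stamper. The length of the returned array is then the exact size to announce before writing the bytes to the response; how the header is set depends on the hosting API and is not shown here.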
