GdiCharSet & iTextSharp

Is there a way to create an iTextSharp font using additional information (such as GdiCharSet) from a System.Drawing.Font object?

Short answer: Yes.
Long answer: Each attribute is a bit different, but all of that information can be expressed in PDF in general, and in iText[Sharp] specifically.
You can specify a font's encoding when you create it, but you must do so in a way iText understands. Specifically, encoding values are strings within iText[Sharp]. BaseFont has a number of public static string members that list many of the available encodings, including several code pages that map nicely to some of the GdiCharSet values. Others, not so much.
I generally suggest using "Identity-H" and subsetting your fonts (which happens automatically with Identity-H; you can't avoid it, which is a Good Thing) unless you need to keep the file size to a bare minimum. There are several single-byte encodings, the most common of which is what PDF calls "WinAnsiEncoding", exposed as BaseFont.WINANSI (the default, IIRC). The string can also be the name of a "CMap" (such as Identity-H).
CMaps are generally language-specific and encoding-specific: UTF and Japanese, or Big5 (a Chinese encoding, as I recall), or what have you. Identity-H (and Identity-V) are font-specific instead. They simply map values in the content stream to glyph indexes in the font (which can vary wildly from one font to the next, or between versions of a given font; that's why you're required to embed subsets of Identity-* fonts).
In PDF (and therefore iText[Sharp]), "bold" and "italic" are part of the font's identity, not a property: "Arial-Bold", "Arial-Italic", etc.
Strikeout and underline are decorations added after the fact (though I believe iText will let you set a flag at the font level for such things in the Font constructor: com.itextpdf.text.Font in iText, iTextSharp.text.Font in iTextSharp).
iText won't give you direct access to the height, though a font's "descriptor" will let you define it.
The point size isn't a property either: you set it, along with the font and the color (black by default), before you draw some text.
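As a rough illustration of the encoding point above, the sketch below maps a System.Drawing.Font.GdiCharSet byte to an encoding-name string of the kind iText[Sharp] expects. The GdiCharSetMapper class and its method name are hypothetical, and the fallback to "Identity-H" for unmapped charsets is an assumption based on the advice above. Whether a given "CpNNNN" string is accepted also depends on the code pages available to your runtime.

```csharp
using System;

public static class GdiCharSetMapper
{
    // Hypothetical helper: translates a GDI charset value (as found in
    // System.Drawing.Font.GdiCharSet) into an encoding name string.
    // Charsets without an obvious single-byte Windows code page fall
    // back to "Identity-H" (an assumption, not an iText rule).
    public static string ToEncodingName(byte gdiCharSet)
    {
        switch (gdiCharSet)
        {
            case 0:   return "Cp1252"; // ANSI_CHARSET
            case 161: return "Cp1253"; // GREEK_CHARSET
            case 162: return "Cp1254"; // TURKISH_CHARSET
            case 163: return "Cp1258"; // VIETNAMESE_CHARSET
            case 177: return "Cp1255"; // HEBREW_CHARSET
            case 178: return "Cp1256"; // ARABIC_CHARSET
            case 186: return "Cp1257"; // BALTIC_CHARSET
            case 204: return "Cp1251"; // RUSSIAN_CHARSET
            case 238: return "Cp1250"; // EASTEUROPE_CHARSET
            default:  return "Identity-H"; // DEFAULT, SYMBOL, CJK, etc.
        }
    }
}
```

The returned string would then be passed as the encoding argument of BaseFont.CreateFont, together with the path to the font file and BaseFont.EMBEDDED. Bold, italic, underline and strikeout still have to be handled separately, as described above.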

Related

Formatting text with padding does not line up in C#

I am fairly new to programming and I just wrote a simple application in C# .NET to retrieve information about system drive space. The program functions fine but I'm struggling with formatting the output.
See output:
I'm trying to use padding to get the text to line up in a column format within a rich text box, but the output doesn't line up: if there are multiple drives, the drive names have different lengths, which throws off the padding. Even if one drive letter comes back as M: and the other as I:, the difference in the width of the letters is enough to throw off the alignment.
I am wondering if there is a way to force each string value to a specific length so the padding is applied evenly or if maybe there's an even better way to format my output. Thank you in advance for your time and let me know if any further information would be helpful!

Note: One of the comments asked an important question, regarding whether the question refers to the System.Windows.Forms.RichTextBox (WinForms) or the System.Windows.Controls.RichTextBox (WPF) control. This answer applies only to the WinForms version of RichTextBox, so if you're using WPF, this doesn't apply.
The most important thing, and this was mentioned in the comments, is that you'll need to use a monospaced font.
Since you stated you're using a RichTextBox, you'll need to know how to set it to use whatever monospaced font you've chosen.
To do that, you can use the RichTextBox.SelectionFont property.
For more general instructions, refer to this MSDN article: Setting Font Attributes for the Windows Forms RichTextBox Control
Once you set the RichTextBox.SelectionFont property, only text added to the control afterwards will use the specified font. To apply the font to existing text (i.e. you populate the RichTextBox and then change the font to an appropriate monospaced font), take a look at this answer, which tells you precisely what to do.
Once that's done, there remains the simple matter of adding the appropriate amount of whitespace to the end of each string, such that the next piece of data appears at the appropriate position. You'll probably be using String.PadRight, but for more general information about padding strings, check out this MSDN article: Padding Strings in the .NET Framework
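A minimal sketch of the padding step follows, assuming the drive data has already been collected into name/size pairs. The DriveTable class, the "GB" unit and the column widths are illustrative choices, not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DriveTable
{
    // Builds one fixed-width row per drive: the name is left-aligned in a
    // column of nameWidth characters, the size right-aligned in sizeWidth.
    public static string Build(IEnumerable<KeyValuePair<string, long>> drives,
                               int nameWidth, int sizeWidth)
    {
        var sb = new StringBuilder();
        foreach (var drive in drives)
        {
            string name = drive.Key.PadRight(nameWidth);
            string size = (drive.Value + " GB").PadLeft(sizeWidth);
            sb.Append(name).Append(size).Append('\n');
        }
        return sb.ToString();
    }
}
```

Note that PadRight and PadLeft never truncate, so the widths must be at least as large as the longest value, and the columns only line up visually when the control uses a monospaced font.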

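Here is a string formatting example:
string varOne = "Line One";
double varTwo = 15.0 / 100; // use a floating-point literal; 15/100 is integer division and yields 0
string output = String.Format("{0,-10} {1,5:P1}", varOne, varTwo);
// varOne is left-aligned in a 10-character column; varTwo is formatted as a
// percentage with one decimal place (e.g. 15.0 %; the exact form depends on the culture)
where the formatting properties in curly brackets are:
{index[,alignment][:formatString]}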

unicode conversion in tamil from Latha font to Kal fonts

I'm using the Latha font for Tamil typing, and I need to change from Latha to Kal (another font family). It doesn't work for me. I'm also trying to convert Latha to Unicode and then to Kal.
I need some code in C#.

If you only need to convert text, use < http://software.nhm.in/products/converter >. It handles many encodings and it is free.
Some manufacturers also provide a Unicode version of their font in addition to the original encoding.
If you need to type in the Latha font using a different keyboard interface, get the free layout driver from < http://www.thamizha.com/project/ekalappai >
My own driver is at < http://www.mediafire.com/download/jjaqdbuk99062ki/Tamil_Driver_1.03.zip >. With it you can type in InScript keyboard mode.
Please give more details about the Kal fonts.

Why new FontFamily("Invalid font") doesn't throw an Exception?

Why does the following code not throw an exception?
FontFamily font = new FontFamily("bla bla bla");
I need to know whether a specific font (as a combination of FontFamily, FontStyle, FontWeight, ...) exists in my current OS. How can I do that?

This is by design. Programs frequently ask for fonts that are not present on the machine, especially in a country far flung from the programmer's domicile. The font mapper produces an alternative. Font substitution is in general very common. You are looking at Arial right now if you are on a Windows machine. But I can paste 你好世界 into this post and you'll see it render accurately, even though Arial doesn't have glyphs for Chinese characters.
So hint number one is to not actually worry about what fonts are available. The Windows API has EnumFontFamiliesEx() to enumerate the available font families, but that's not exposed in WPF; there is some friction with OpenType there, a font standard that's rather poorly integrated with Windows. Another shadow cast when Adobe gets involved with anything Microsoft does, it seems.
There was some confusion in the comments about WinForms' FontFamily class, which is actually usable in this case: its GetFamilies() method returns an array of the available families, but only TrueType fonts, not OpenType ones.

You can use the System.Drawing.Text.InstalledFontCollection class:
http://msdn.microsoft.com/en-us/library/system.drawing.text.installedfontcollection.aspx
WPF has a framework-specific property, Fonts.SystemFontFamilies:
http://msdn.microsoft.com/en-us/library/system.windows.media.fonts.systemfontfamilies.aspx

To answer the question of why it isn't throwing an exception: according to the FontFamily constructor page on MSDN, the exception wasn't added until framework version 3.5.
I suspect that you are targeting version 3.0 or below.
Cheers!

You can browse the available fonts on the system using the Fonts.SystemFontFamilies collection, and use some LINQ to match on whatever conditions you need:
// true
bool exists = (from f in Fonts.SystemFontFamilies where f.Source.Equals("Arial") select f).Any();
// false
exists = (from f in Fonts.SystemFontFamilies where f.Source.Equals("blahblah") select f).Any();

Get text position in Microsoft Word from VBA or C# Interop

I want to access the position and size for each indivisible unit in Microsoft Word. Examples of such units include individual characters, images, etc.
The purpose is to apply a visual overlay based on unit position and size. I will have no knowledge of the content in target documents.
Imagine the text of this question in a word document. I need to be able to iterate each character including white-space and carriage returns and get the size and position.
EDIT
It doesn't matter whether your answer considers macros, interop, add-ins or OLE embedding.

The method which retrieves displayed coordinates of an object is Window.GetPoint (link for the office interop version, same thing in VBA).
As for the "indivisible unit," you can put any meaning you want into that, using the available collections.
For instance, if you want it to be characters, you can use Document.Range.Characters, which is a collection of characters, each of which is a Range.
Or you can use Document.Range.InlineShapes for the pictures that are part of text.
Or Document.Range.ShapeRange to enumerate "floating" shapes.
At which point you might be thinking about Window.RangeFromPoint to figure an object from its window coordinates.

Font unicode glyph mapping to actual characters

I'm trying to display all glyphs in a font. I'm using GetFontUnicodeRanges to get the available characters, then I create a bitmap with all the available characters and their index next to each one.
I used the font "Wingdings 2" as a test case and compared it to what I see in Windows' charmap.exe. While all the characters appear, some appear more than once (480 glyphs in total in that non-Unicode font), and the positions are not the same as in charmap. For instance, the medium-sized circle glyph is located at 0x97 in charmap, while in the font it is glyph 0xF097, and I think it is also the one at 0x2014.
I want to use the font as the "regular" way, meaning, I want to see the same data as in charmap.exe (and in a side note I would also like to know if a font is a unicode font or ascii font, as charmap shows). Basically, you can say I am trying to write my own charmap from scratch.
How can I fill in that missing data? I was looking through the Windows' fonts and text APIs, but couldn't find anything to help me, so I must be missing some relevant APIs. What are they?

After struggling a lot with GetFontData and the lack of documentation (well, not exactly a lack, but it is really not well organized, and some data is indeed missing), I found a way to write my own CharMap. Here's what I found during development:
The documentation will tell you to use a "trick" that is possible because the glyph location data comes right after the arrays in the cmap table. That doesn't mean the data is IN the cmap table. It is actually in the loca table.
You also need to read the head table for the location format flag (indexToLocFormat, at byte offset 50), and the maxp table for the number-of-glyphs field (offset 4).
It seems that in symbol fonts (you can tell a font is a symbol font if the cmap header encoding ID is 0, at least in TTF format 4, which is the Microsoft format) 0xF000 is added to each character's actual index, so instead of the regular ASCII codes you get a Unicode value at the far end of the Unicode table. I subtracted 0xF000 from each character code, tested on the Wingdings[2,3] and Webdings fonts, and it worked just fine.
I used the official documentation a lot: www.microsoft.com/typography/tt/ttf_spec/ttch02.doc, and the reference code: http://support.microsoft.com/kb/241020.
The reference code is written in C, so in order to write it in C# I read all the data into byte[] buffers and "manually" read each element from them.
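The 0xF000 adjustment described above can be sketched as a small helper. The SymbolCodes class and ToCharmapCode name are hypothetical, and the sketch assumes you have already decided that the font is a symbol font, which is not always easy to determine.

```csharp
using System;

public static class SymbolCodes
{
    // Maps a code point read from a symbol font's cmap (0xF000..0xF0FF)
    // back to the 0x00..0xFF value that charmap-style tools display.
    // Code points outside that private-use block are returned unchanged.
    public static int ToCharmapCode(int cmapCode)
    {
        bool inSymbolBlock = cmapCode >= 0xF000 && cmapCode <= 0xF0FF;
        return inSymbolBlock ? cmapCode - 0xF000 : cmapCode;
    }
}
```

For the circle glyph from the question, ToCharmapCode(0xF097) gives 0x97, the position shown in charmap.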

I went through this nightmare years ago too and I know a lot about all this stuff now. I figured I should pitch in and provide some answers.
1) You cannot assume that 'loca' follows 'cmap'. The order can vary by font. The location of each block is defined by the OffsetTable, which generally begins at byte 0 of the font file. (http://www.microsoft.com/typography/otspec/otff.htm)
2) You cannot assume that "cmap header encoding id is 0, at least in TTF format 4" means a symbol font. I know for a fact that certain old Arabic fonts also use that encoding. To this day, I still do not know how to differentiate them. Windows does it, but I do not know how. I do not know how to tell for sure that a font is a symbol font. Even checking the OS/2 table for the code page bit 32 isn't enough in many cases.
3) You cannot simply take the magic 0xF000 number and add it to your small 0-255 number to get the character that will give you the glyph mapping you are after. That is because those small 0 to 255 "ASCII" codes vary depending on your system locale.
Symbol fonts are special in the way that Windows processes them.
Unlike a normal font, where the mapping between glyphs and characters is static, a symbol font's mapping varies with the system default code page for non-Unicode applications, aka CP_ACP.
For example, pretend your symbol font has this glyph: '%'. If your system uses CP 1252 by default, then to render this glyph you have to render, for example, the character value '0xC2'.
If your system uses CP 1251 by default, then to render this glyph you have to render, for example, the character value '0x416', which is entirely different.
Said otherwise, the font's Unicode ranges vary based on the default non-Unicode code page!
After investigation, we discovered that the valid character values for such fonts are the values obtained by converting 0 through 255, treated as CP_ACP values, to Unicode.
What does this mean? It means that you want to use MultiByteToWideChar with CP_ACP to get the mapping from the values 0 to 255 to their localized Unicode values based on your system locale (CP_ACP).
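To illustrate looking tables up by tag instead of relying on their order, here is a minimal sketch that reads the table directory from a font already loaded into a byte[]. The SfntDirectory class is hypothetical. It assumes a single-font file (not a .ttc collection) and does no bounds checking.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SfntDirectory
{
    // Font files store multi-byte integers in big-endian order.
    static ushort ReadU16(byte[] d, int p)
    {
        return (ushort)((d[p] << 8) | d[p + 1]);
    }

    static uint ReadU32(byte[] d, int p)
    {
        return ((uint)d[p] << 24) | ((uint)d[p + 1] << 16)
             | ((uint)d[p + 2] << 8) | (uint)d[p + 3];
    }

    // Returns a map from table tag (e.g. "cmap", "loca", "head") to
    // (offset, length), both in bytes from the start of the file.
    public static Dictionary<string, KeyValuePair<uint, uint>> Read(byte[] font)
    {
        int numTables = ReadU16(font, 4);   // follows the 4-byte sfnt version
        var tables = new Dictionary<string, KeyValuePair<uint, uint>>();
        for (int i = 0; i < numTables; i++)
        {
            int rec = 12 + i * 16;          // 12-byte header, 16-byte records
            string tag = Encoding.ASCII.GetString(font, rec, 4);
            uint offset = ReadU32(font, rec + 8);   // bytes 4..7 hold the checksum
            uint length = ReadU32(font, rec + 12);
            tables[tag] = new KeyValuePair<uint, uint>(offset, length);
        }
        return tables;
    }
}
```

With the directory in hand, each table can be found by its tag, whatever order the font happens to store them in.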
So, doing that will give you a map like :
ASCII -> localized non-static UNICODE
0x00 -> 0x00
0x01 -> 0x01
0x02 -> 0x02
...
0xC2 -> 0x416 <----- This is correct : the value will be different in some cases.
...
0xE3 -> 0xE3
The 0xF000 to 0xF0FF values are the static Unicode values: they never change.
So to get the glyph ID for a "localized non-static UNICODE" value, you first use the map above to find the corresponding ASCII value, then add 0xF000 to it, and then look up the glyph ID for the result.
Of course, none of this nonsense is documented by MS... or at least I could never find it.

I've never looked at "WingDings 2" in detail, but it's very common for glyphs to be reused for different characters. For example, uppercase Roman A and uppercase Greek alpha are frequently the same glyph.
However, I guess the equality of 0x97, 0xF097 and 0x2014 is some kind of hack to deal with windows-1252. In the windows-1252 codepage, 0x97 is an em-dash, which is 0x2014 in Unicode. 0xF097 is in the private use area; I guess it is providing a Unicode-compatible (and reversible) way of encoding the windows-1252 0x97.
In my experience, the most reliable way to get an unambiguous list of the unicode characters supported by a font is to parse the cmap table from the ttf file. This is a bit of a chore (cmap supports something like six different encodings) but it is documented online. You can use the GetFontData function to get the raw data, or parse the ttf directly.
charmap uses the GetFontData function and the code includes the string "cmap", suggesting that charmap is also doing this.
The Windows SDK Debugging Tools include logger.exe, which records all the APIs used by an app. You can use this if you want to be really sure what charmap is doing.
