iTextSharp Read Text from Layer

I'm using iTextSharp to work with multi-layered PDF files from C# (VS2012). The PDF file is a kind of map, with plots on one layer and text on another.
What I want to do is give the application the PDF path and the name of the layer that holds the text; it should then search for text on that layer and create external hyperlinks on it (I have a Dictionary of words and links). I can open the PDF, read how many pages it has, and find the required layer. But I'm stuck on the next steps: reading the text on the found layer and creating the hyperlinks.
Please guide me.
Thanks.

Related

How to plot strings to corresponding coordinates for printing?

My company gave me a project to automate the distribution of ORs (Official Receipts).
My task is to create an application where the user encodes the OR information and the application then prints the OR on a printer. My problem is that the paper it will be printed on already has the layout. All I need to do is place the encoded values from my application into that layout at the correct coordinates.
How can I achieve this using the .NET Framework?
I have already looked at Graphics.DrawString, but my main problem is working out the correct coordinates for each value.
Thank You.

Is the layout the users have the same for all users or can it change?
i.e.
If "Receipt#" is at location 20,20 on one client's receipt, will it be at the same location on the rest?
If it is at the same location, use trial and error as Tomek suggests. If it is not, you will need a program that scans the receipt to an image and then parses the image for the words you are after.
Are the data headings the same for all users or can that change too?
i.e.
Do all clients have Receipt# on their receipts, or can another client have ReceiptNo instead?
If the headings can differ, you not only need to find the headings in the image, you also need to interpret the text and map each variant to the field it stands for.
Unless you are really good with image manipulation and parsing, you will need to load the receipts into an OCR tool, or better yet convert them into PDFs. The PDF will hold the text as well as the locations of the text as metadata, which can be parsed into your mapping file.
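
For the fixed-layout case, one way to cut down on the trial and error is to measure each field's position on the pre-printed form with a ruler and convert that measurement to drawing units. The sketch below assumes the drawing happens in a PrintDocument.PrintPage handler with the default page unit, where one unit is 1/100 inch. The class name, method name and sample measurements are made up for illustration.

```csharp
using System;
using System.Drawing;

static class ReceiptLayout
{
    // Assumption: the Graphics object uses the default printer unit of 1/100 inch.
    const float UnitsPerMillimetre = 100f / 25.4f;

    // Converts a ruler measurement in millimetres into a point that can be
    // passed to Graphics.DrawString.
    public static PointF FromMillimetres(float xMm, float yMm)
    {
        return new PointF(xMm * UnitsPerMillimetre, yMm * UnitsPerMillimetre);
    }
}
```

Inside the PrintPage handler this could be used as e.Graphics.DrawString(receiptNo, font, Brushes.Black, ReceiptLayout.FromMillimetres(20f, 35f)). The printer's unprintable margin can shift the origin, so expect to add a small fixed offset after one test print.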

C#: Cover Images are not viewed in ePub document

I am creating an application that generates books as ePub. I allow the user to upload front and back covers. They are placed in the same folder as the HTML documents.
I am using the ePubSharp library to convert to an eBook, but the cover pages are not displayed in the ePub document. I need the user to be able to define the front cover and back cover of the document. I don't know what the issue is. Does anyone have a solution?

layering PDFs together and have ability to set the position

I've been tasked with improving the way we create custom brochures. In the old approach, a legacy system created the needed PDFs, and I would then download those and "glue" them into one big PDF.
In the new approach we want to skip the legacy system and build all of these things from our new system.
The biggest hurdle is the cover, which consists of a background layer and a logo layer that holds the company logo, a shadowbox and an emblem. All of these objects are PDF documents.
My problem is: after I build the logo portion, how can I position that PDF exactly where I need it on the background layer?
This is all being done on the fly, so I can't save anything to disk.
Any help will be greatly appreciated.

There are several PDFsharp samples that show how to do it:
http://pdfsharp.net/wiki/XForms-sample.ashx
http://pdfsharp.net/wiki/Graphics-sample.ashx#Draw_a_form_XObject_a_page_from_an_external_PDF_file_27
http://pdfsharp.net/wiki/TwoPagesOnOne-sample.ashx
You can draw pages from other PDF files like images on a newly created PDF page. You can specify the exact positions and sizes, you can even transform them (skew them, rotate them).
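
As a rough sketch of that idea for the in-memory case from the question: the code below draws the first page of a background PDF and then the first page of a logo PDF at a chosen rectangle, without touching the disk. It assumes a PDFsharp version that provides XPdfForm.FromStream. The class name, method name and parameters are hypothetical.

```csharp
using System.IO;
using PdfSharp.Drawing;
using PdfSharp.Pdf;

static class CoverBuilder
{
    // Draws the first page of logoPdf over the first page of backgroundPdf and
    // returns the combined PDF as bytes. x, y, width and height are in points
    // (1/72 inch), measured from the top-left corner of the page.
    public static byte[] Compose(byte[] backgroundPdf, byte[] logoPdf,
                                 double x, double y, double width, double height)
    {
        using (var backgroundStream = new MemoryStream(backgroundPdf))
        using (var logoStream = new MemoryStream(logoPdf))
        using (var output = new MemoryStream())
        using (var document = new PdfDocument())
        {
            // Assumption: XPdfForm.FromStream is available in the installed version.
            XPdfForm background = XPdfForm.FromStream(backgroundStream);
            XPdfForm logo = XPdfForm.FromStream(logoStream);

            // Size the new page to match the background page.
            PdfPage page = document.AddPage();
            page.Width = XUnit.FromPoint(background.PointWidth);
            page.Height = XUnit.FromPoint(background.PointHeight);

            using (XGraphics gfx = XGraphics.FromPdfPage(page))
            {
                // Later draw calls paint on top of earlier ones.
                gfx.DrawImage(background,
                    new XRect(0, 0, background.PointWidth, background.PointHeight));
                gfx.DrawImage(logo, new XRect(x, y, width, height));
            }

            // Keep the output stream open so its bytes can be read afterwards.
            document.Save(output, false);
            return output.ToArray();
        }
    }
}
```

The input streams stay open until after Save on purpose, since the imported pages may not be read in full until the document is written.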

itext create pdf based on existing one with changed content

I have a fairly complicated, finished PDF file. It contains a barcode and a fancy-looking table.
Based on it, I have to create an application that generates PDFs which look the same but contain different records in the table and a different barcode.
Is it possible to copy the existing PDF and just change the content of the barcode and table?
What would be the best approach to creating a PDF that looks the same but has different content?
Thank you very much for your help.

If the barcode and table are static, I would open the file in Photoshop or Illustrator, delete everything I don't want, and save it as a PDF again. Then I would follow the guide "iText - add content to existing PDF file" and use the result as a template to put my custom content in.
If the table and barcode are dynamically generated (each one is different) and you need to crop out content on the fly, I would pull some hacky crap and draw white squares over all the content I want gone, then proceed to use it as a template.
Just my 2 cents given the information provided.

Generate PDF based on a Word document for each data item

There is a data list with hundreds of data items (suppose each item is a customer) and a predefined Word document as a template. The requirement is: for each data item, fill the corresponding data into the template fields and generate a read-only PDF file as the result.
The preferred platform is ASP.NET with C#.
I found two solutions:
Change the Word document into a PDF form, and use iTextSharp to fill the form fields. But creating the PDF form with the correct format (font, layout, etc.) is difficult work, and it requires a particular tool and new skills whenever a system user wants to add a new template (unless the PDF form is always created by a developer).
Add text placeholders to the Word file, so the program can read the Word file, replace the text, and convert it into a PDF. But I'm not sure which components should be used.
I'd like to get some advice on this problem. Thanks.
Update 20130416:
After some searching & experiments, my conclusion is below:
Client solution: use Microsoft.Office.Interop.Word (Office2007+plugin or Office2012) to read data, convert to PDF, etc. But running this method on the server side may be unsafe.
Server solution:
Make a PDF form, and use iTextSharp to fill the form fields. The disadvantage has been mentioned above.
Make an HTML template, replace the field placeholders, and use iTextSharp+XMLWorker to convert the HTML to PDF. The difficulty is creating the HTML template manually and tuning the PDF output.
MS SharePoint Office Automation Service is a server solution based on MS Office. This method may be easier, but it brings license and SharePoint server costs.
Finally, I chose the HTML template solution for this request. QED.
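
The placeholder-replacement step of the HTML template approach can be sketched with the base class library alone. The {{Name}} placeholder syntax and the HtmlTemplate class are assumptions made for illustration, not something the original post specifies. The values are HTML-encoded so that data containing & or < does not break the markup before conversion.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

static class HtmlTemplate
{
    // Matches placeholders such as {{CustomerName}} (this syntax is an assumption).
    static readonly Regex Placeholder = new Regex(@"\{\{(\w+)\}\}");

    // Returns the template with every known placeholder replaced by its value.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        return Placeholder.Replace(template, match =>
        {
            string value;
            // Unknown placeholders are left untouched so they are easy to spot.
            if (!values.TryGetValue(match.Groups[1].Value, out value))
                return match.Value;
            // Encode the data so characters like & and < keep the markup well formed.
            return WebUtility.HtmlEncode(value);
        });
    }
}
```

Calling HtmlTemplate.Fill once per data item yields one HTML string per customer, which can then be handed to the HTML-to-PDF conversion step.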

Another option would be to use Tx Text Control for ASP.NET. They have a
mail merge feature that allows you to fill data into a Word template.
The merged document can easily be saved as a PDF.

For the second option you can use iTextSharp or Aspose, which support placeholder replacement and PDF generation. Files can be created from MS Word and OpenOffice templates, which could be useful for users who do not want to buy MS Word only to create a template.

As another option, you can use Nustache templates, fill them with the list data, and then use XMLWorker from iTextSharp to render them to PDF.
