On the server running SQL Server Reporting Services, I'm able to run a report and get back a document with symbols included within (☎ and ✉). I'm even able to save the resulting page as a PDF - special characters intact.
The problem turns up when I create the PDF in C# code using ServerReport.Render(): all the special characters get turned into little empty squares.
I tried adding <HumanReadablePDF>true</HumanReadablePDF> into <Extension Name="PDF" Type="Microsoft.ReportingServices.Rendering.ImageRenderer.PDFRenderer,Microsoft.ReportingServices.ImageRendering"> in the \Reporting Services\ReportServer\rsreportserver.config file but that hasn't helped either.
Is there something I'm missing in the configuration of the SSRS server? Is there another way of accomplishing this, perhaps with a special font or by parsing and replacing text with images?
I've been tasked with creating "Signature" fields within PDF files that we generate. Some of these files are actually various Word docs that get pieced together (if they apply) and are then converted to PDFs. Because of this, the position of the "Signature" field could be anywhere (vertically) on the page. We currently have iTextSharp version 4.1.6 and have no intention of upgrading. I have successfully used this library for many of our other forms when I know the exact position of the field prior to processing.
The route we are thinking of taking is placing some white text where we would want the Signature field located. I would parse the PDF searching for this text and then place a field there. I am by no means a PDF expert and have spent some time looking into the various tokens within the file hoping that there is something I can use but have come up with no answers.
Using this outdated version, is there any way I am able to come up with a solution to finding and placing a field or am I wasting my time?
I am working on an application where the text that a user enters needs to be formatted, be it bold, italic, bullet points etc. For this reason I have elected to use a WYSIWYG editor, Summernote (https://summernote.org/).
Everything is fine and gets saved to my database. The problem is that when pulling the data into a Crystal Reports report, I am unsure how to do that without pulling in the tags and basically the raw markup that is in the database.
I've seen various links such as this one: How can I represent data in a WYSIWYG format without using Crystal Reports?
That seems to suggest it can be done, however I am not seeing where or what I'm supposed to change to get it to work correctly.
I would be grateful if someone could point it out for me.
Right-click the field, Format Field...
Select Paragraph tab
Set Text Interpretation to 'HTML Text'.
This would handle basic HTML formatting. For more advanced HTML scenarios, there's a solution via a UFL (user function library).
I have been using the (free version) HiQPdf libraries to convert HTML pages to PDF documents.
I am also using in my pages several Chart objects from the .NET Framework (System.Web.UI.DataVisualization.Charting.Chart) to produce bar charts displaying values changing over time. It works wonders in my local environment when I debug with VS, but when I publish on my IIS or on other servers the charts do not appear at all in the PDF. Note: they do appear on the web pages, just not in the PDF.
In the PDFs all the HTML is displayed correctly, including CSS, showing it exactly as seen on the page, except the chart images. I kind of understand why they would not appear as in the HTML, since the image source from the charts results in something like:
<img id="MainContent_MyPageControl_ctl00" src="/MyTestWebSite/ChartImg.axd?i=charts_0/chart_0_2.png&g=396d61e14ceb41c08be06fd956cd4dca"
That is, the real generated PNG image is not directly referenced as usual. But the fact is they do appear in the generated PDF when running locally from VS, which produces HTML similar to the above anyway.
The only difference I see is that when I run locally the chart image handler key is defined in the web.config as:
add key="ChartImageHandler" value="storage=file;timeout=20;dir=c:\TempImageFiles\;"
but when publishing on a server I have changed it to:
add key="ChartImageHandler" value="storage=file;timeout=20;Url=~/MyfolderTempcharts/;deleteAfterServicing=false;"
So why don't the charts appear in my generated PDF on the server, and why do they appear when I'm running from VS?
Anyone ever had this problem?
So I have figured out a way to deal with my problem, and I am sharing it here in case anyone else is in need. The deal is that the Chart objects use HTTP handlers to render the generated chart images (you can read more about it in detail from these guys: https://web.archive.org/web/20201205231110/https://www.4guysfromrolla.com/articles/081909-1.aspx)
At first I thought I could manipulate the HTML (that I get before the page renders) and replace the image sources with accessible links to send to HiQPdf. For that I would need to temporarily copy the generated images to a public folder. That would be fine, except that in my case I deal with sensitive data that should never be publicly accessible, even if only momentarily.
So what I do is check that the image file exists (in my configured folder), extract the byte data from it, convert it to Base64 and put it in the image source in the HTML where the handler URL was, and voila.
Example of a Base64 HTML embedded image:
Notice that this manipulated HTML is never accessible from the outside and is only worked on on the server side, in order to provide "normal" HTML that HiQPdf can process.
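The replacement step described above could be sketched roughly as follows. This is a minimal sketch, not the original poster's code: the class name ChartImageInliner, the method Inline, and the injectable readFile parameter are all hypothetical, and it assumes the handler's i= query parameter is the path of the image relative to the configured chart folder and that the images are PNGs.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class ChartImageInliner
{
    // Matches src="...ChartImg.axd?i=<relative path>..." and captures the i value.
    private static readonly Regex ChartSrc = new Regex(
        "src=\"[^\"]*ChartImg\\.axd\\?i=([^&\"]+)[^\"]*\"",
        RegexOptions.IgnoreCase);

    // Replaces every chart handler URL in html with a Base64 data URI.
    // chartDir is the folder the chart handler writes to (assumption).
    // readFile defaults to File.ReadAllBytes; it is injectable for testing.
    public static string Inline(string html, string chartDir,
                                Func<string, byte[]> readFile = null)
    {
        if (readFile == null) readFile = File.ReadAllBytes;

        return ChartSrc.Replace(html, m =>
        {
            string relative = Uri.UnescapeDataString(m.Groups[1].Value);
            string path = Path.Combine(chartDir, relative);
            byte[] bytes;
            try
            {
                bytes = readFile(path);
            }
            catch (IOException)
            {
                // File missing or unreadable: leave the original src untouched.
                return m.Value;
            }
            return "src=\"data:image/png;base64,"
                 + Convert.ToBase64String(bytes) + "\"";
        });
    }
}
```

The resulting string would then be handed to the HTML-to-PDF converter instead of the original page markup. Because the HTML is generated server side, the sketch does not validate the i= value; if that value could ever come from a client, it should be checked against path traversal first.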
I need your expertise in fixing a problem I have been facing for a week. This has already turned into a 'royal pain in the lower back side' category and time is running out fast.
Problem
I have developed a C# script that I call from ColdFusion to assist me in converting Word documents to PDF. This script is doing the conversion properly, but the (justified) text in the paragraphs is not being spaced properly. I get a non-select-able space next to some character.
See the image -
What it should look like...
What it looks like...
The red marks are added to show the spaces created.
Now, if I open the file in Word manually and save it, I do not get this problem. What am I missing or doing wrong that results in this error?
Details of my application flow -
I create a DOC (based on my design needs) and save it as HTML.
This HTML will be used by my CF application to manipulate the content based on some placeholders and the final output is again saved as HTML.
The xx.html file is renamed to xx.doc and passed to my C# based converter, which does the DOC to PDF conversion via Word Automation.
I ponder in joy seeing my well formed PDF output, but get sad that the text is a bit messy.
I have tried this with multiple fonts and what I observe is that it only happens with certain fonts (in my case it's Palatino Linotype). I want to know, what is the difference between manual and automated conversion? Is there a setting (like a boolean) that has to be set for this, or some other hack?
My system configuration -
Windows 2008 R2 64b + .NET 4 + Office 2010
Note: I know that Office automation is bad, but at this point in time it is the only option I have to get my job done.
I found a work-around for this. It seems to be dependent on the selected printer!
First go to the print dialog (File / Print) and select "Microsoft XPS Document Writer" instead of your normal printer. You don't need to print anything.
Now export the PDF (File / Export / Create PDF).
Selecting other printer drivers may also work. I found this solution in this thread: http://www.howtofixcomputers.com/forums/microsoft-office/bad-kerning-pdf-using-save-pdf-xps-add-244886.html
Notes:
I also installed Adobe PDF Writer before finding this. It's possible that affected it.
My system is Windows 8.1 & Office 2013 running under Fusion 5.0.3 on a Mac mini.
I guess that the trouble could be in the font used. Please try to:
change the font
ensure that the language of the text (LanguageID property) is correct
Or it could be an inserted special character, for example a "no-width optional break" that is being interpreted the wrong way. Try selecting the text, cutting and pasting it in Word, and turning on non-printable characters; it should then be visible.
I have a PDF and want to extract the text contained in it. I've tried a few different PDF libraries and they all return basically the same results. When extracting the text from a two page document with literally hundreds of words, only a dozen or so words from the header are returned.
Is there any way to tell if the text I'm after is actually text or a raster image of the text? I'm thinking something along the lines of Firebug's "Inspect Element" but at this point I'll take any solution that tells what I'm really looking at.
This project really doesn't justify attempting to use OCR. And, although a simple solution, using fields in the PDF is not an option since the generator of the file is a third party.
If Acrobat/Reader can select the text, then it Is Text.
Reasons your library might not be able to find the text in question:
Complex/bad fonts or encodings. Adobe can be very forgiving of garbage in, somehow managing to get Good Info out.
The text could be in an annotation rather than the page contents. It won't matter what program parses the content stream if you need to look in the annot array instead.
You didn't name a particular library, so it's possible that the library you're using doesn't look inside XObject Forms. That's unlikely in an even remotely mature API, but stranger things have happened.
If you can get away with copy/pasta from Reader, then just go that route.
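As a rough way to see what a page actually contains, one could count operators in its content stream: Tj, TJ, ' and " show text, while Do paints an XObject (an image or a form). The sketch below is only a heuristic under stated assumptions: the class ContentStreamProbe and its Count method are hypothetical names, the input must be an already-decompressed content stream obtained with whatever library is in use, and comments and inline images (BI ... EI) are not handled.

```csharp
using System;

public static class ContentStreamProbe
{
    // Counts text-showing operators (Tj, TJ, ', ") and XObject
    // invocations (Do) in a decompressed PDF content stream.
    public static (int TextOps, int XObjectOps) Count(string content)
    {
        int textOps = 0, xobjectOps = 0, i = 0;
        while (i < content.Length)
        {
            char c = content[i];
            if (c == '(')
            {
                // Skip a literal string, honoring nesting and backslash
                // escapes, so operator names inside strings are not counted.
                int depth = 1;
                i++;
                while (i < content.Length && depth > 0)
                {
                    if (content[i] == '\\') i++;          // skip the escaped char
                    else if (content[i] == '(') depth++;
                    else if (content[i] == ')') depth--;
                    i++;
                }
            }
            else if (char.IsWhiteSpace(c) || "[]<>)".IndexOf(c) >= 0)
            {
                i++; // separators between tokens
            }
            else
            {
                // Read one token up to the next whitespace or delimiter.
                int start = i;
                while (i < content.Length && !char.IsWhiteSpace(content[i])
                       && "()[]<>".IndexOf(content[i]) < 0)
                {
                    i++;
                }
                string token = content.Substring(start, i - start);
                if (token == "Tj" || token == "TJ" || token == "'" || token == "\"")
                    textOps++;
                else if (token == "Do")
                    xobjectOps++;
            }
        }
        return (textOps, xobjectOps);
    }
}
```

A page with many Do operators and few or no text operators suggests the visible words live in images or form XObjects rather than directly in the page content; in the form case, the same count would have to be repeated on each form's own stream.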
Have you tried Amyuni PDF Creator .Net? It allows you to enumerate all components from a specified rectangular region of a page and inspect their type from a predefined types list. You could run a quick test using the trial version and the following code sample for text extraction:
// open a PDF file
axPDFCreactiveX1.Open(System.IO.Directory.GetCurrentDirectory() + "\\sampleBookmarks.pdf", "");
axPDFCreactiveX1.Refresh();
// extract the raw text of page 1 and display it
string text = axPDFCreactiveX1.GetRawPageText(1);
MessageBox.Show(text);
Additionally, it provides Tesseract OCR integration in case you needed it.
Disclaimer: I am part of the development team of this product.
Check this site out. It may contain some helpful code snippets. http://www.codeproject.com/KB/cs/PDFToText.aspx