I want to add a watermark to a PDF stream using iText 7 - C#

This is the code I am using now:
var ms = new MemoryStream();
var htmmml = @"<h1>some html string </h1>";
// pdfHTML specific code
ConverterProperties converterProperties = new ConverterProperties();
MemoryStream pdfStream = new MemoryStream(ms.ToArray());
HtmlConverter.ConvertToPdf(htmmml, ms, converterProperties);
PdfDocument pdfDocument = new PdfDocument(new PdfReader(pdfStream), new PdfWriter(pdfStream));
// Document to add layout elements: paragraphs, images etc
Document document = new Document(pdfDocument);
// Load image from disk
ImageData imageData = ImageDataFactory.Create(@"D:\TestWebApp\TestWebApp\imgs\WATERMARK.jpeg");
// Create layout image object and provide parameters. Page number = 1
Image image = new Image(imageData).ScaleAbsolute(100, 200).SetFixedPosition(1, 25, 25);
// This adds the image to the page
document.Add(image);
The problem is that after converting the HTML into ms, ms is disposed and I can't access it anymore. I want to convert HTML to a PDF with a watermark.

It would probably be more convenient for you to use HtmlConverter's ConvertToDocument or ConvertToElements methods.
The former one returns a Document instance which you can then process (for example, add a watermark).
The latter one returns the list of layout elements that make up the HTML content. You can then create a Document yourself and add these elements to it.
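The ConvertToDocument route could look roughly like the following. This is an untested sketch rather than a definitive implementation: the class name WatermarkSketch, the method name HtmlToWatermarkedPdf and its parameters are made up for illustration, while the image calls are taken from the question's own code.

```csharp
using System.IO;
using iText.Html2pdf;
using iText.IO.Image;
using iText.Kernel.Pdf;
using iText.Layout;
using iText.Layout.Element;

public static class WatermarkSketch
{
    // Hypothetical helper: converts HTML to a PDF and stamps an image on page 1.
    public static byte[] HtmlToWatermarkedPdf(string html, string watermarkPath)
    {
        var ms = new MemoryStream();
        var writer = new PdfWriter(ms);

        // ConvertToDocument leaves the Document open so it can still be modified.
        Document document = HtmlConverter.ConvertToDocument(html, writer);

        // Same image setup as in the question: fixed position on page 1.
        ImageData imageData = ImageDataFactory.Create(watermarkPath);
        Image image = new Image(imageData).ScaleAbsolute(100, 200).SetFixedPosition(1, 25, 25);
        document.Add(image);

        // Closing finalizes the PDF; only afterwards are the bytes complete.
        document.Close();

        // ToArray still works even if closing the document closed the stream.
        return ms.ToArray();
    }
}
```

Returning a byte array instead of the stream avoids any dependence on whether the stream is still open after the document has been closed.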

When a MemoryStream is closed, you can still retrieve its contents using the ToArray method, see the note in the documentation:
MemoryStream.ToArray Method
...
This method returns a copy of the contents of the MemoryStream as a byte array. If the current instance was constructed on a provided byte array, a copy of the section of the array to which this instance has access is returned. See the MemoryStream constructor for details.
Note
This method works when the MemoryStream is closed.
Thus, you should be allowed to switch the order of your lines
MemoryStream pdfStream = new MemoryStream(ms.ToArray());
HtmlConverter.ConvertToPdf(htmmml, ms, converterProperties);
to
HtmlConverter.ConvertToPdf(htmmml, ms, converterProperties);
MemoryStream pdfStream = new MemoryStream(ms.ToArray());
so that pdfStream contains the output of HtmlConverter.ConvertToPdf for further processing.
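The documented behaviour can be checked without iText at all. The following minimal sketch uses only System.IO; the class and method names are invented for the demonstration, and the four bytes simply stand in for real PDF output.

```csharp
using System;
using System.IO;

public static class ClosedStreamDemo
{
    // Writes a few bytes, closes the stream, then reads the contents back.
    public static byte[] WriteThenClose()
    {
        var ms = new MemoryStream();
        ms.Write(new byte[] { 0x25, 0x50, 0x44, 0x46 }, 0, 4); // the ASCII bytes of "%PDF"
        ms.Close(); // simulates a library closing the stream it was given
        return ms.ToArray(); // allowed on a closed MemoryStream
    }

    public static void Main()
    {
        byte[] bytes = WriteThenClose();

        // A fresh, readable stream over the same content for further processing.
        var pdfStream = new MemoryStream(bytes);

        Console.WriteLine(bytes.Length);      // prints 4
        Console.WriteLine(pdfStream.CanRead); // prints True
    }
}
```

Reading or writing the closed stream directly would throw an ObjectDisposedException, which is why the copy is wrapped in a new MemoryStream.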

Related

Merging Crystal Reports with itext7 in MVC produces unmerged pdf from stream

I'm trying to merge two (or more) Crystal Reports in an ASP.net MVC project and I downloaded the itext7 NuGet package to do so. I'm trying to put together a simple proof-of-concept in which I concatenate a pdf with itself in a single method:
var rpt1 = new CrystalDecisions.CrystalReports.Engine.ReportDocument();
var rpt2 = new CrystalDecisions.CrystalReports.Engine.ReportDocument();
rpt1.Load(Server.MapPath("~/Reports/MyReport.rpt"));
rpt2.Load(Server.MapPath("~/Reports/MyReport.rpt"));
DataTable table = GetDataMethod();
rpt1.SetDataSource(table);
rpt2.SetDataSource(table);
Stream stream = rpt.ExportToStream(ExportFormatType.PortableDocFormat);
var write = new PdfWriter(stream);
var doc = new PdfDocument(write);
var merger = new PdfMerger(doc);
var doc1 = new PdfDocument(new PdfReader(rpt1.ExportToStream(ExportFormatType.PortableDocFormat)));
var doc2 = new PdfDocument(new PdfReader(rpt2.ExportToStream(ExportFormatType.PortableDocFormat)));
merger.Merge(doc1, 1, doc1.GetNumberOfPages());
merger.Merge(doc2, 1, doc2.GetNumberOfPages());
doc.CopyPagesTo(1, doc2.GetNumberOfPages(), doc2);
stream.Flush();
stream.Position = 0;
return this.File(stream, "application/pdf", "DownloadName.pdf");
You can see I'm sort of throwing everything at the wall and seeing what sticks, insofar as I'm using both PdfMerger.Merge() and PdfDocument.CopyPagesTo(), and I think either of those should be sufficient to do the job by itself? (And, of course, I ran the code trying each of those by themselves as well as together.) But when I run the above code, the PDF which downloads is unmerged, i.e. the report only appears once. (If I run it with two different reports, then only the first report appears.)
Now, I'm returning the stream while I'm doing all the interesting stuff with the PdfMerger and PdfDocument objects, so it makes sense to me that the stream would be unchanged. But all the examples of using iText 7 I've found return either the stream or a byte array (e.g., this StackOverflow question), so that seems to be the way it is supposed to work.
Any changes I've made to the code either have no effect, throw an error, or result in the downloaded file being unreadable by the browser (i.e. not recognized as a PDF). For example, I tried converting the stream to a byte array and returning that:
using (var ms = new MemoryStream()) {
stream.CopyTo(ms);
byte[] bytes = ms.ToArray();
return new FileContentResult(bytes, "application/pdf");
}
but the browser couldn't open the download then. The same thing happened when I tried closing the PdfDocument before returning the stream (trying to force it to write the merge to the stream).
There is a lot of confusion about streams in your code. Normally a stream is used either for input or for output. MemoryStream can be used for both, but you need to make sure not to close it if you want to reuse it. It's often simpler and cleaner to create a new instance from the underlying bytes than to reuse an existing one, especially since this does not affect performance much: the underlying heavy array structures will be reused by new instances anyway.
Here is an example of how to distinguish between the streams. ExportToStream returns a stream from which you can obtain the byte array with the bytes of your PDF file. You then load those documents into iText and also create a third document into which you will merge the two source documents. Finally, you have to make sure to call PdfDocument#Close() to tell iText to finalize your documents; after that you can fetch the resultant bytes of the merged document and pass them along, wrapping them in a stream if necessary:
var rpt1 = new CrystalDecisions.CrystalReports.Engine.ReportDocument();
var rpt2 = new CrystalDecisions.CrystalReports.Engine.ReportDocument();
rpt1.Load(Server.MapPath("~/Reports/MyReport.rpt"));
rpt2.Load(Server.MapPath("~/Reports/MyReport.rpt"));
DataTable table = GetDataMethod();
rpt1.SetDataSource(table);
rpt2.SetDataSource(table);
var report1Stream = (MemoryStream)rpt1.ExportToStream(ExportFormatType.PortableDocFormat);
var report2Stream = (MemoryStream)rpt2.ExportToStream(ExportFormatType.PortableDocFormat);
var doc1 = new PdfDocument(new PdfReader(new MemoryStream(report1Stream.ToArray())));
var doc2 = new PdfDocument(new PdfReader(new MemoryStream(report2Stream.ToArray())));
var outStream = new MemoryStream();
var write = new PdfWriter(outStream);
var doc = new PdfDocument(write);
var merger = new PdfMerger(doc);
merger.Merge(doc1, 1, doc1.GetNumberOfPages());
merger.Merge(doc2, 1, doc2.GetNumberOfPages());
doc.Close();
doc1.Close();
doc2.Close();
return this.File(new MemoryStream(outStream.ToArray()), "application/pdf", "DownloadName.pdf");

Unable to generate readable PDF using iText 7's HtmlConverter.ConvertToDocument method

I am trying to use itext7 and itext7.pdfhtml to generate a PDF from some HTML on a server and I then return the written-to MemoryStream as a FileContentResult to the client. However, when the client receives the PDF all they get is an unopenable PDF file which, if the file extension is changed to a .txt, can be seen to contain nothing more than "%PDF-1.7%âãÏÓ".
Having experimented with HtmlConverter.ConvertToPdf I was able to get the simple content in the example below to work (at least the body of it anyway); however, I believe I need HtmlConverter.ConvertToDocument instead now since I need the ability to add a footer and set the page size and margins on the resultant PDF with settings not held within the HTML passed in (in other words I need the iText Document object to manipulate).
Here is the code I am using...
public static byte[] GeneratePdfFromHtml(Action<Document> pdfModifier)
{
//Gives the converter some very simple HTML for it to create something with!
var html = "<html><head><title>Extremely Basic Title</title></head><body>Extremely Basic Content</body></html>";
using (var workStream = new MemoryStream())
using (var pdfWriter = new PdfWriter(workStream))
using (var document = HtmlConverter.ConvertToDocument(html, pdfWriter))
{
//Passes the document to a delegated function to perform some content, margin or page size manipulation
pdfModifier(document);
//Returns the written-to MemoryStream containing the PDF.
return workStream.ToArray();
}
}
This was the version I had working but it lacks the object I need to pass to my delegate.
public static byte[] GeneratePdfFromHtml(Action<Document> pdfModifier)
{
//Gives the converter some very simple HTML for it to create something with!
var html = "<html><head><title>Extremely Basic Title</title></head><body>Extremely Basic Content</body></html>";
using (var workStream = new MemoryStream())
using (var pdfWriter = new PdfWriter(workStream))
{
HtmlConverter.ConvertToPdf(html, pdfWriter);
//No longer able to call this delegate as there is no Document object to use.
//pdfModifier(document);
//Returns the written-to MemoryStream containing the PDF.
return workStream.ToArray();
}
}
In the version you had working you used HtmlConverter.ConvertToPdf. This call internally also creates a Document object but closes it before returning.
Closing the Document object causes all data of the generated PDF still in memory to be flushed to the result stream which then gets finalized with a PDF trailer.
Thus, your working version returns a finished, complete PDF file.
In your new code, though, you use HtmlConverter.ConvertToDocument. This call returns the used Document object but does not close it: You after all still want to use it for some manipulations.
As you don't close the Document object before calling return workStream.ToArray(), you return an incomplete PDF, in your case only a PDF header section.
Thus, you have to close that Document object before retrieving the bytes from your MemoryStream, e.g. explicitly like this:
using (var workStream = new MemoryStream())
using (var pdfWriter = new PdfWriter(workStream))
using (var document = HtmlConverter.ConvertToDocument(html, pdfWriter))
{
//Passes the document to a delegated function to perform some content, margin or page size manipulation
pdfModifier(document);
document.Close();
//Returns the written-to MemoryStream containing the PDF.
return workStream.ToArray();
}
or implicitly like this:
using (var workStream = new MemoryStream())
using (var pdfWriter = new PdfWriter(workStream))
{
using (var document = HtmlConverter.ConvertToDocument(html, pdfWriter))
{
//Passes the document to a delegated function to perform some content, margin or page size manipulation
pdfModifier(document);
}
//Returns the written-to MemoryStream containing the PDF.
return workStream.ToArray();
}

Merging N pdf files, created from html using ITextSharp, to another blank pdf file

I need to merge N PDF files into one. I create a blank file first
byte[] pdfBytes = null;
var ms = new MemoryStream();
var doc = new iTextSharp.text.Document();
var cWriter = new PdfCopy(doc, ms);
Later I cycle through html strings array
foreach (NBElement htmlString in someElement.Children())
{
byte[] msTempDoc = getPdfDocFrom(htmlString.GetString(), cssString.GetString());
addPagesToPdf(cWriter, msTempDoc);
}
In getPdfDocFrom I create pdf file using XMLWorkerHelper and return it as byte array
private byte[] getPdfDocFrom(string htmlString, string cssString)
{
var tempMs = new MemoryStream();
byte[] tempMsBytes;
var tempDoc = new iTextSharp.text.Document();
var tempWriter = PdfWriter.GetInstance(tempDoc, tempMs);
tempDoc.Open();
using (var msCss = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(cssString)))
{
using (var msHtml = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(htmlString)))
{
//Parse the HTML
iTextSharp.tool.xml.XMLWorkerHelper.GetInstance().ParseXHtml(tempWriter, tempDoc, msHtml, msCss);
tempMsBytes = tempMs.ToArray();
}
}
tempDoc.Close();
return tempMsBytes;
}
Later on I try to add pages from this PDF file to the blank one.
private static void addPagesToPdf(PdfCopy mainDocWriter, byte[] sourceDocBytes)
{
using (var msOut = new MemoryStream())
{
PdfReader reader = new PdfReader(new MemoryStream(sourceDocBytes));
int n = reader.NumberOfPages;
PdfImportedPage page;
for (int i = 1; i <= n; i++)
{
page = mainDocWriter.GetImportedPage(reader, i);
mainDocWriter.AddPage(page);
}
}}
It breaks when it tries to create a PdfReader from the byte array I pass to the function. "Rebuild failed: trailer not found.; Original message: PDF startxref not found."
I used another library to work with PDFs before. I passed two PdfDocuments as objects and just added pages from one to the other in a loop. It didn't support CSS though, so I had to switch to ITextSharp.
I don't quite get the difference between PdfWriter and PdfCopy.
There is a logical error in your code. When you create a document from scratch, as is done in the getPdfDocFrom() method, the document isn't complete until you've triggered the Close() method. In this Close() method, a trailer is created as well as a cross-reference (xref) table. The error tells you that those are missing.
Indeed, you do call the Close() method:
tempDoc.Close();
But by the time you Close() the document, it's too late: you have already created the tempMsBytes array. You need to create that array after you close the document.
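Applied to the method from the question, the reordering might look like the sketch below. It only rearranges the calls already present in the question; the wrapper class HtmlToPdfSketch is invented so that the snippet is self-contained, and it has not been run against a specific iTextSharp version.

```csharp
using System.IO;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdfSketch
{
    public static byte[] GetPdfDocFrom(string htmlString, string cssString)
    {
        using (var tempMs = new MemoryStream())
        {
            var tempDoc = new Document();
            PdfWriter tempWriter = PdfWriter.GetInstance(tempDoc, tempMs);
            tempDoc.Open();

            using (var msCss = new MemoryStream(Encoding.UTF8.GetBytes(cssString)))
            using (var msHtml = new MemoryStream(Encoding.UTF8.GetBytes(htmlString)))
            {
                // Parse the HTML into the still-open document.
                XMLWorkerHelper.GetInstance().ParseXHtml(tempWriter, tempDoc, msHtml, msCss);
            }

            // Close first: this writes the xref table and the trailer.
            tempDoc.Close();

            // Only now is the PDF complete. ToArray works on a closed MemoryStream.
            return tempMs.ToArray();
        }
    }
}
```

Because MemoryStream.ToArray can still be called after the stream has been closed, the byte array can be fetched after Close() without changing the writer's CloseStream setting.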
Edit: I don't know anything about C#, but if MemoryStream clears its buffer after closing it, you could use mainDocWriter.CloseStream = false; so that the MemoryStream isn't closed when you close the document.
In Java, it would be a bad idea to set the "close stream" parameter to false. When I read the answers to the question Create PDF in memory instead of physical file I see that C# probably doesn't always require this extra line.
Remark: merging files by adding PdfImportedPage instances to a PdfWriter is an example of bad taste. If you are using iTextSharp 5 or earlier, you should use PdfCopy or PdfSmartCopy to do that. If you use PdfWriter, you throw away a lot of information (e.g. link annotations).

Create a VSTO shape from byte array

I have an image that is encoded in a byte array and I would like to add it as a shape in an Excel document, but unfortunately the only available function I see to do this requires me to save the image to the drive and then read it. As you can see, this is a really slow operation and I would like to simply read the image from the byte stream and decode it into a bitmap.
I have encoded it like this :
JpegBitmapEncoder encoder = new JpegBitmapEncoder();
encoder.Frames.Add(BitmapFrame.Create(rtb));
encoder.QualityLevel = 100;
byte[] bit = null;
using (var ms = new MemoryStream())
{
encoder.Frames.Add(BitmapFrame.Create(rtb));
encoder.Save(ms);
bit = ms.ToArray();
}
Now, how to add it to the worksheet ?
The method Shapes.AddPicture accepts only a filename and cannot read from a stream.
The Excel object model doesn't provide any method for reading a byte array and then add it as a shape. So, the only possible solution is to save the byte array as a file on the disk and then add it as a shape as you stated earlier:
to save the image to the drive and then read it.
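The save-then-load workaround can at least be kept contained by using a temporary file that is deleted again once the picture is embedded. The following is a sketch under assumptions: the helper name AddPictureFromBytes and its parameters are hypothetical, the bytes are assumed to be JPEG data as in the question, and it requires a reference to the Excel interop and Office core assemblies.

```csharp
using System;
using System.IO;
using Microsoft.Office.Core;
using Excel = Microsoft.Office.Interop.Excel;

public static class ShapeFromBytes
{
    // Hypothetical helper: writes the bytes to a temp file, embeds it, cleans up.
    public static Excel.Shape AddPictureFromBytes(
        Excel.Worksheet sheet, byte[] imageBytes,
        float left, float top, float width, float height)
    {
        // Unique temp file with a .jpg extension (the bytes are assumed to be JPEG).
        string tempPath = Path.Combine(
            Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".jpg");
        File.WriteAllBytes(tempPath, imageBytes);
        try
        {
            // LinkToFile = false and SaveWithDocument = true embed the picture,
            // so the workbook does not depend on the temp file afterwards.
            return sheet.Shapes.AddPicture(
                tempPath, MsoTriState.msoFalse, MsoTriState.msoTrue,
                left, top, width, height);
        }
        finally
        {
            File.Delete(tempPath);
        }
    }
}
```

This does not remove the disk round trip; it only keeps the temporary file out of sight of the rest of the code.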

Loading an iTextSharp Document into MemoryStream

I'm developing an ASP.NET application where I have to send a PDF, based on a table created dynamically on the page, as an attachment to an email. So I have a function that creates the PDF as an iTextSharp Document and returns it. If I just try to save this document, it works fine, but I'm having a bad time trying to turn it into a Stream. I tried several things already, but I always get stuck at some point.
I tried to serialize it, but it appears that Document is not serializable. Then I tried to work with PdfCopy, but I couldn't find out how to apply it to my specific problem.
The code right now is like this:
//Table,string,string,Stream
//This document returns fine
Document document = Utils.GeneratePDF(table, lastBook, lastDate, Response.OutputStream);
using (MemoryStream ms = new MemoryStream())
{
PdfCopy copy = new PdfCopy(document, ms);
//Need something here to copy from one to another! OR to make document as Stream
ms.Position = 0;
//Email, Subject, Stream
Utils.SendMail(email, lastBook + " - " + lastDate, ms);
}
Try to avoid passing the native iTextSharp objects around. Either pass streams, files or bytes. I don't have an IDE in front of me right now but you should be able to do something like this:
byte[] Bytes;
using(MemoryStream ms = new MemoryStream()){
Utils.GeneratePDF(table, lastBook, lastDate, ms);
Bytes = ms.ToArray();
}
Then you can either change your Utils.SendMail() to accept a byte array or just wrap it in another stream.
EDIT
You might also be able to just do something like this in your code:
using(MemoryStream ms = new MemoryStream()){
Utils.GeneratePDF(table, lastBook, lastDate, ms);
ms.Position = 0;
Utils.SendMail(email, lastBook + " - " + lastDate, ms);
}
I did this in the past by doing something like the following:
// Declared outside the using block so the data is still reachable afterwards
MemoryStream msPDFData = new MemoryStream();
using (Document doc = new Document())
{
PdfWriter writer = PdfWriter.GetInstance(doc, msPDFData);
doc.Open();
doc.Add(new Paragraph("I'm a pdf!"));
}
If you need access to the raw data you can also do
byte[] pdfData = msPDFData.ToArray();
