GOAL
To open an existing multi-page PDF file and add a background image to every page. (Optionally, the background image of the first page differs from the others.)
In my current implementation (I use .NET 6 and PDFsharp), I add the image to each page separately, so the file size grows with the number of pages.
QUESTION
Is there a way in PDFsharp/MigraDoc to embed a background image only once into the document and then reference it for each page?
CODE
Both PDF document and the image come from a database as byte arrays.
public byte[] AddBackgroundImgToDocument(byte[] doc, byte[] imgFirstPage, byte[]? imgOtherPages=null)
{
using var ms = new MemoryStream(doc);
PdfDocument pdfDoc = PdfReader.Open(ms, PdfDocumentOpenMode.Modify);
for (int i = 0; i < pdfDoc.PageCount; i++)
{
if(i > 0 && imgOtherPages != null && imgOtherPages.Length > 0)
AddBackgroundImageFromByteArray(pdfDoc.Pages[i], imgOtherPages);
else
AddBackgroundImageFromByteArray(pdfDoc.Pages[i], imgFirstPage);
GC.Collect();
GC.WaitForPendingFinalizers();
}
using var oms = new MemoryStream();
pdfDoc.Save(oms);
ms.Dispose();
pdfDoc.Dispose();
return oms.ToArray();
}
public void AddBackgroundImageFromByteArray(PdfPage page, byte[] imgfile)
{
XGraphics gfx = XGraphics.FromPdfPage(page, XGraphicsPdfPageOptions.Prepend);
MemoryStream ms = new System.IO.MemoryStream(imgfile);
ms.Position = 0;
XImage image = XImage.FromStream(() => ms);
gfx.DrawImage(image, 0, 0, page.Width, page.Height);
ms.Dispose();
}
SOLUTION
Rewriting the method above as suggested in the accepted answer solved my problem:
public void AddBackgroundImageFromByteArray(PdfPage page, byte[] imgfile)
{
if(!ximageLoaded)
{
MemoryStream ms = new System.IO.MemoryStream(imgfile);
ms.Position = 0;
backimg = XImage.FromStream(() => ms);
ms.Dispose();
ximageLoaded = true;
}
XGraphics gfx = XGraphics.FromPdfPage(page, XGraphicsPdfPageOptions.Prepend);
gfx.DrawImage(backimg, 0, 0, page.Width, page.Height);
}
With PDFsharp and MigraDoc this optimization is done automatically if you use them as intended.
Load the image once with PDFsharp and draw it on as many pages as you like; there will be only one copy of the image in the document.
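A minimal sketch of that advice, assuming the same API shape the question uses (XImage.FromStream taking a stream factory lambda; some PDFsharp builds take a Stream directly instead). The class and method names are hypothetical. It loads each distinct image once, outside the page loop, and keeps the first-page/other-pages distinction that a single cached image would lose:

```csharp
using System.IO;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public static class BackgroundStamper // hypothetical helper, not part of PDFsharp
{
    public static byte[] AddBackgrounds(byte[] doc, byte[] imgFirstPage, byte[]? imgOtherPages = null)
    {
        using var input = new MemoryStream(doc);
        using PdfDocument pdfDoc = PdfReader.Open(input, PdfDocumentOpenMode.Modify);

        // Load each distinct image exactly once. The factory-lambda overload
        // mirrors the question's code; adjust if your build expects a Stream.
        using XImage first = XImage.FromStream(() => new MemoryStream(imgFirstPage));
        using XImage? other = imgOtherPages is { Length: > 0 } otherBytes
            ? XImage.FromStream(() => new MemoryStream(otherBytes))
            : null;

        for (int i = 0; i < pdfDoc.PageCount; i++)
        {
            PdfPage page = pdfDoc.Pages[i];

            // Reuse the same XImage instance on every page that shares it.
            XImage background = (i > 0 && other != null) ? other : first;

            using XGraphics gfx = XGraphics.FromPdfPage(page, XGraphicsPdfPageOptions.Prepend);
            gfx.DrawImage(background, 0, 0, page.Width, page.Height);
        }

        using var output = new MemoryStream();
        pdfDoc.Save(output, false); // false: leave the stream open
        return output.ToArray();
    }
}
```

Because the same XImage object is drawn on every page, the library can reference one embedded image instead of storing a copy per page, which is the behaviour the answer describes.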
Related
This was my code for iTextSharp, which worked OK. It displayed "Quote Only" in the middle of each page in a PDF file.
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(Server.MapPath(@"~\Content\WaterMarkQuoteOnly.png"));
PdfReader readerOriginalDoc = new PdfReader(File(all, "application/pdf").FileContents);
int n = readerOriginalDoc.NumberOfPages;
img.SetAbsolutePosition(0, 300);
PdfGState _state = new PdfGState()
{
FillOpacity = 0.1F,
StrokeOpacity = 0.1F
};
using (MemoryStream ms = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(readerOriginalDoc, ms, '\0', true))
{
for (int i = 1; i <= n; i++)
{
PdfContentByte content = stamper.GetOverContent(i);
content.SaveState();
content.SetGState(_state);
content.AddImage(img);
content.RestoreState();
}
}
//return ms.ToArray();
all = ms.GetBuffer();
}
This is my new iText 7 code. It also displays the watermark, but the position is wrong. I was dismayed to see that you can't add an Image to the canvas; you have to add ImageData, even though the position is set on the Image. The image is also much smaller and back to front.
var imagePath = Server.MapPath(@"~\Content\WaterMarkQuoteOnly.png");
var tranState = new iText.Kernel.Pdf.Extgstate.PdfExtGState();
tranState.SetFillOpacity(0.1f);
tranState.SetStrokeOpacity(0.1f);
ImageData myImageData = ImageDataFactory.Create(imagePath, false);
Image img = new Image(myImageData);
img.SetFixedPosition(0, 300);
var reader = new PdfReader(new MemoryStream(all));
var doc = new PdfDocument(reader);
int pages = doc.GetNumberOfPages();
using (var ms = new MemoryStream())
{
var writer = new PdfWriter(ms);
var newdoc = new PdfDocument(writer);
for (int i = 1; i <= pages; i++)
{
//get existing page
PdfPage page = doc.GetPage(i);
//copy page to new document
newdoc.AddPage(page.CopyTo(newdoc));
//get our new page
PdfPage newpage = newdoc.GetPage(i);
Rectangle pageSize = newpage.GetPageSize();
//get canvas based on new page
var canvas = new PdfCanvas(newpage);
//write image data to new page
canvas.SaveState().SetExtGState(tranState);
canvas.AddImage(myImageData, pageSize, true);
canvas.RestoreState();
}
newdoc.Close();
all = ms.GetBuffer();
ms.Flush();
}
You are doing something strange with the PdfDocument objects, and you are also using the wrong AddImage() method.
I am not a C# developer, so I rewrote your example in Java. I took a sample PDF file and an image, then added the image to every page of the PDF using transparency (screenshots omitted).
The code to do this was really simple:
public void createPdf(String src, String dest) throws IOException {
PdfExtGState tranState = new PdfExtGState();
tranState.setFillOpacity(0.1f);
ImageData img = ImageDataFactory.create(IMG);
PdfReader reader = new PdfReader(src);
PdfWriter writer = new PdfWriter(dest);
PdfDocument pdf = new PdfDocument(reader, writer);
for (int i = 1; i <= pdf.getNumberOfPages(); i++) {
PdfPage page = pdf.getPage(i);
PdfCanvas canvas = new PdfCanvas(page);
canvas.saveState().setExtGState(tranState);
canvas.addImage(img, 36, 600, false);
canvas.restoreState();
}
pdf.close();
}
For some reason, you created two PdfDocument instances. This isn't necessary. You also used the AddImage() method passing a Rectangle which resizes the image. Also make sure that you don't add the image as an inline image, because that bloats the file size.
I don't know which programming language you are using. For instance: I am not used to variables that are created using var such as var tranState. It should be very easy for you to adapt my Java code though. It's mostly a matter of capitalizing the method names.
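For readers working from byte arrays as in the question, here is a rough C# port of the Java example above. It is a sketch, not a tested drop-in: the class name, method name, and imagePath parameter are hypothetical, and the AddImage(ImageData, float, float, bool) overload simply mirrors the Java call. Newer iText releases are believed to have renamed that overload (to AddImageAt), so check the version you reference.

```csharp
using System.IO;
using iText.IO.Image;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas;
using iText.Kernel.Pdf.Extgstate;

public static class Watermarker // hypothetical helper
{
    public static byte[] AddWatermark(byte[] source, string imagePath)
    {
        var tranState = new PdfExtGState();
        tranState.SetFillOpacity(0.1f);

        // One ImageData instance, reused on every page.
        ImageData img = ImageDataFactory.Create(imagePath);

        using var ms = new MemoryStream();

        // A single PdfDocument with both a reader and a writer ("stamping" mode),
        // instead of copying pages into a second document.
        var pdf = new PdfDocument(new PdfReader(new MemoryStream(source)), new PdfWriter(ms));

        for (int i = 1; i <= pdf.GetNumberOfPages(); i++)
        {
            PdfPage page = pdf.GetPage(i);
            var canvas = new PdfCanvas(page);
            canvas.SaveState().SetExtGState(tranState);
            // false = not inline, so the image is stored once and referenced per page.
            canvas.AddImage(img, 36, 600, false);
            canvas.RestoreState();
        }

        pdf.Close();          // flushes everything into ms
        return ms.ToArray();  // ToArray still works after the stream is closed
    }
}
```

The coordinates 36 and 600 are taken from the Java answer; the question's original position was (0, 300).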
I am new to using iTextSharp and working with Pdf files in general, but I think I'm on the right track.
I iterate through a list of PDF files, convert them to bytes, and push all of the resulting byte arrays into a list. From there I pass the list to concatAndAddContent() to merge all of the PDFs into a single large PDF. Currently I'm only getting the last PDF in the list (they seem to be overwriting each other).
public static byte[] concatAndAddContent(List<byte[]> pdfByteContent)
{
byte[] allBytes;
using (MemoryStream ms = new MemoryStream())
{
Document doc = new Document();
PdfWriter writer = PdfWriter.GetInstance(doc, ms);
doc.SetPageSize(PageSize.LETTER);
doc.Open();
PdfContentByte cb = writer.DirectContent;
PdfImportedPage page;
PdfReader reader;
foreach (byte[] p in pdfByteContent)
{
reader = new PdfReader(p);
int pages = reader.NumberOfPages;
// loop over document pages
for (int i = 1; i <= pages; i++)
{
doc.SetPageSize(PageSize.LETTER);
doc.NewPage();
page = writer.GetImportedPage(reader, i);
cb.AddTemplate(page, 0, 0);
}
}
doc.Close();
allBytes = ms.GetBuffer();
ms.Flush();
ms.Dispose();
}
return allBytes;
}
The code above runs and creates a single PDF, but the rest of the files are ignored. Any suggestions?
This is pretty much just a C# version of Bruno's code here.
This is pretty much the simplest, safest and recommended way to merge PDF files. The PdfSmartCopy object is able to detect redundancies across the multiple files, which can sometimes reduce the file size. One of its AddDocument overloads accepts a full PdfReader object, which can be instantiated however you want.
public static byte[] concatAndAddContent(List<byte[]> pdfByteContent) {
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var copy = new PdfSmartCopy(doc, ms)) {
doc.Open();
//Loop through each byte array
foreach (var p in pdfByteContent) {
//Create a PdfReader bound to that byte array
using (var reader = new PdfReader(p)) {
//Add the entire document instead of page-by-page
copy.AddDocument(reader);
}
}
doc.Close();
}
}
//Return just before disposing
return ms.ToArray();
}
}
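A side note on the return value: the answer uses ms.ToArray(), while the code in the questions uses ms.GetBuffer(). The two are not interchangeable. A small standard-library-only sketch of the difference:

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    public static void Main()
    {
        using var ms = new MemoryStream();
        ms.Write(new byte[] { 1, 2, 3 }, 0, 3);

        // ToArray copies exactly the bytes that were written.
        byte[] exact = ms.ToArray();

        // GetBuffer exposes the internal buffer, which is usually larger than
        // the written data and padded with unused zero bytes at the end.
        byte[] raw = ms.GetBuffer();

        Console.WriteLine(exact.Length);                // 3
        Console.WriteLine(raw.Length >= exact.Length);  // True
    }
}
```

Writing the result of GetBuffer() to a file or an HTTP response can therefore append junk bytes after the end of the PDF, which some viewers tolerate and others reject. ToArray() also keeps working after the stream has been closed or disposed.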
// 'bytes' is the List<byte[]> holding the source PDFs
byte[] merged = concatAndAddContent(bytes);
System.IO.File.WriteAllBytes("path", merged);
So I have been able to take a multi-page TIFF file and convert it to a single JPEG image, but it flattens the TIFF. By "flattens" I mean it only returns the first page. The goal is to retrieve the TIFF (via a memory stream), open each page of the TIFF and append it to a new JPEG (or any web image), thus creating one long image to view on the web without the aid of a plugin. I do have MODI.dll installed, but I am not sure how to use it in this instance; it is an option, though.
Source Code (using a FileHandler):
#region multi-page tiff to single page jpeg
var byteFiles = dfSelectedDocument.File.FileBytes; // <-- FileBytes is a byte[] or byte array source.
byte[] jpegBytes;
using( var inStream = new MemoryStream( byteFiles ) )
using( var outStream = new MemoryStream() ) {
System.Drawing.Image.FromStream( inStream ).Save( outStream, ImageFormat.Jpeg );
jpegBytes = outStream.ToArray();
}
context.Response.ContentType = "image/JPEG";
context.Response.AddHeader( "content-disposition",
string.Format( "attachment;filename=\"{0}\"",
dfSelectedDocument.File.FileName.Replace( ".tiff", ".jpg" ) )
);
context.Response.Buffer = true;
context.Response.BinaryWrite( jpegBytes );
#endregion
I'm guessing that you'll have to loop over each frame in the TIFF.
Here's an excerpt from Split multi page tiff file:
private void Split(string pstrInputFilePath, string pstrOutputPath)
{
//Get the frame dimension list from the image of the file and
Image tiffImage = Image.FromFile(pstrInputFilePath);
//get the globally unique identifier (GUID)
Guid objGuid = tiffImage.FrameDimensionsList[0];
//create the frame dimension
FrameDimension dimension = new FrameDimension(objGuid);
//Gets the total number of frames in the .tiff file
int noOfPages = tiffImage.GetFrameCount(dimension);
ImageCodecInfo encodeInfo = null;
ImageCodecInfo[] imageEncoders = ImageCodecInfo.GetImageEncoders();
for (int j = 0; j < imageEncoders.Length; j++)
{
if (imageEncoders[j].MimeType == "image/tiff")
{
encodeInfo = imageEncoders[j];
break;
}
}
// Save the tiff file in the output directory.
if (!Directory.Exists(pstrOutputPath))
Directory.CreateDirectory(pstrOutputPath);
foreach (Guid guid in tiffImage.FrameDimensionsList)
{
for (int index = 0; index < noOfPages; index++)
{
FrameDimension currentFrame = new FrameDimension(guid);
tiffImage.SelectActiveFrame(currentFrame, index);
tiffImage.Save(string.Concat(pstrOutputPath, @"\", index, ".TIF"), encodeInfo, null);
}
}
}
You should be able to adapt the logic above to append onto your JPG rather than create separate files.
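One way that adaptation might look, as a sketch using only System.Drawing (Windows-only on recent .NET versions). The class and method names are hypothetical. It assumes the first frame dimension is the page dimension, as in the excerpt above, measures all frames first, then draws them one below the other on a single tall bitmap:

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

public static class TiffStitcher // hypothetical helper
{
    public static byte[] StitchPagesToJpeg(byte[] tiffBytes)
    {
        using var inStream = new MemoryStream(tiffBytes);
        using Image tiff = Image.FromStream(inStream);

        var dimension = new FrameDimension(tiff.FrameDimensionsList[0]);
        int pageCount = tiff.GetFrameCount(dimension);

        // First pass: find the widest page and the combined height.
        int width = 0, totalHeight = 0;
        for (int i = 0; i < pageCount; i++)
        {
            tiff.SelectActiveFrame(dimension, i);
            width = Math.Max(width, tiff.Width);
            totalHeight += tiff.Height;
        }

        // Second pass: draw every page below the previous one.
        using var canvas = new Bitmap(width, totalHeight);
        using (Graphics g = Graphics.FromImage(canvas))
        {
            g.Clear(Color.White);
            int y = 0;
            for (int i = 0; i < pageCount; i++)
            {
                tiff.SelectActiveFrame(dimension, i);
                g.DrawImage(tiff, 0, y, tiff.Width, tiff.Height);
                y += tiff.Height;
            }
        }

        using var outStream = new MemoryStream();
        canvas.Save(outStream, ImageFormat.Jpeg);
        return outStream.ToArray();
    }
}
```

The returned bytes can be passed to context.Response.BinaryWrite as in the question. Very long documents may hit bitmap or JPEG size limits (JPEG allows at most 65,535 pixels per side), in which case scaling the pages down or emitting one image per page is safer.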
Have you compressed the JPEG?
https://msdn.microsoft.com/en-us/library/bb882583(v=vs.110).aspx
In case you get the dreadful "A generic error occurred in GDI+" error (which is arguably the Rickroll of all errors) when using the SelectActiveFrame method suggested in the other answers, I strongly suggest using the System.Windows.Media.Imaging.TiffBitmapDecoder class instead (you will need to add a reference to the PresentationCore.dll framework library).
Here's some example code that does just that (it puts all the TIFF frames into a list of standard Bitmaps):
List<System.Drawing.Bitmap> bmpLst = new List<System.Drawing.Bitmap>();
using (var msTemp = new MemoryStream(data))
{
TiffBitmapDecoder decoder = new TiffBitmapDecoder(msTemp, BitmapCreateOptions.PreservePixelFormat, BitmapCacheOption.Default);
int totFrames = decoder.Frames.Count;
for (int i = 0; i < totFrames; ++i)
{
// Create bitmap to hold the single frame
System.Drawing.Bitmap bmpSingleFrame = BitmapFromSource(decoder.Frames[i]);
// add the frame (as a bitmap) to the bitmap list
bmpLst.Add(bmpSingleFrame);
}
}
And here's the BitmapFromSource helper method:
public static Bitmap BitmapFromSource(BitmapSource bitmapsource)
{
Bitmap bitmap;
using (var outStream = new MemoryStream())
{
BitmapEncoder enc = new BmpBitmapEncoder();
enc.Frames.Add(BitmapFrame.Create(bitmapsource));
enc.Save(outStream);
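// GDI+ needs the source stream for the whole lifetime of a Bitmap created from it,
// so copy the pixels into an independent Bitmap before outStream is disposed.
using (var temp = new Bitmap(outStream))
{
bitmap = new Bitmap(temp);
}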
}
return bitmap;
}
For further info regarding this workaround, I also suggest reading this blog post I wrote.
In code, I am in the process of creating a PDF document using iTextSharp. I have already added content to the document and have closed the document, successfully returning it in a response to a web browser.
What I am trying to do is append another PDF document to the one I am creating but it has to come from binary or an object of type Byte[].
I realize that there is the available method document.Add(stuff) but I am trying to convert the binary to an object and then essentially add that to the document in progress. I have seen questions and posts similar to my scenario but they are mostly dealing with Images.
Here is what I have...
while (sqlExpDocDataReader.Read())
{
// Read data and fill temp. objects
string docName = sqlExpDocDataReader["docName"].ToString();
string docType = sqlExpDocDataReader["docType"].ToString();
Byte[] docData = (Byte[])sqlExpDocDataReader["docData"];
// Get current page size
var pageWidth = document.PageSize.Width;
var pageHeight = document.PageSize.Height;
// Is this an image or PDF?
if (docType.Contains("pdf"))
{
// Could I use a memory stream somehow?
MemoryStream ms = new MemoryStream(docData.ToArray());
}
else
{
// Here I see how to do it with images.
Image doc = Image.GetInstance(docData);
doc.ScaleToFit(pageWidth, pageHeight); // width, height
document.Add(doc);
}
}
Any ideas?
With a bit more digging, here is how I was able to resolve my issue...
Basically, I created a MemoryStream object from my binary data and then created a PdfReader to read that object, where normally we would read a file.
I then looped through each page of the reader object (or file, if you'd like) and appended the pages as they were found.
if (docType.Contains("pdf"))
{
MemoryStream ms = new MemoryStream(docData.ToArray());
PdfReader pdfReader = new PdfReader(ms);
for (int i = 1; i <= pdfReader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(pdfReader, i);
document.Add(iTextSharp.text.Image.GetInstance(page));
}
}
public static byte[] UnificarImagenesPDF(IEnumerable<DocumentoDTO> documentos)// "documents" is a list of objects that are located in the database, the images and pdf are stored in a binary attribute of "documents"
{
using (MemoryStream workStream = new MemoryStream())
{
iTextSharp.text.Document doc = new iTextSharp.text.Document();//to create a itextSharp Document
PdfWriter writer = PdfWriter.GetInstance(doc, workStream);
doc.Open();
foreach (DocumentoDTO d in documentos)// "documentos" has an attribute where the document extension type is saved (eg pdf, jpg, png, etc)
{
try
{
if (d.sExtension == ".pdf")
{
MemoryStream ms = new MemoryStream(d.bBinarios.ToArray());
PdfReader pdfReader = new PdfReader(ms); //
for (int i = 1; i <= pdfReader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(pdfReader, i);
doc.Add(resizeImagen(iTextSharp.text.Image.GetInstance(page)));//Each sheet of the PDF document is added to the document created in itextsharp, and the resizeImage function is used so that the images are centered in the ITEXTSHARP document
doc.NewPage();// add a new page on ITEXTSHARP document
}
}
if (d.sExtension != ".pdf")
{
doc.Add(resizeImagen(Image.GetInstance((byte[])d.bBinarios)));
doc.NewPage();
}
}
catch
{ }
}
doc.Close();
writer.Close();
return workStream.ToArray();
}
}
private static iTextSharp.text.Image resizeImagen(iTextSharp.text.Image image)
{
if (image.Height > image.Width)
{
//Portrait image: scale so that the height becomes 700 units.
float percentage = 0.0f;
percentage = 700 / image.Height;
image.ScalePercent(percentage * 100);
}
else
{
//Landscape or square image: scale so that the width becomes 540 units.
float percentage = 0.0f;
percentage = 540 / image.Width;
image.ScalePercent(percentage * 100);
}
return image;
}
I am trying to create a Bitmap from a stream that has a PDF document saved within it, but I keep getting an argument null exception. The MemoryStream is not null and it is positioned at 0, so I'm lost as to what else to do.
I'm testing the functionality by using a windows forms application sandbox but I cannot get the memory stream to save to a Bitmap.
Can someone point out to me where I'm going wrong?
private async void Form1_Load(object sender, EventArgs e)
{
//4355,4373
IElevation elev = await ElevationManager.GetElevationAsync(4355);
PdfSharp.Pdf.PdfDocument pdfDoc =
await (await AlumCloudPlans.Manager.GetLabelsAsync(elev)).GetPDF(new SheetInfo(/*settings for PDF, Img printing is different*/3, 10, 240, 95, 780, 1000));
System.IO.Stream ms = new MemoryStream();
pdfDoc.Save(ms, false);
ms.Position = 0;
Bitmap bm = new Bitmap(ms); // <-- error right here, says argument null
this.AutoScroll = true;
this.pictureBox1.Image = bm;
this.pictureBox1.BackColor = Color.White;
this.Size = new System.Drawing.Size(bm.Width, (bm.Height + 50) / 2);
this.pictureBox1.Size = new System.Drawing.Size(bm.Width, bm.Height + 5);
}
What am I missing here?
You're trying to open a stream containing a PDF document, but the Bitmap constructor expects image data. PDFsharp can only create and modify PDF documents; it can't render them to images.
To render PDF document to bitmap, you have to use other libraries, e.g. lib-pdf, mupdf-converter or .NET wrapper for Ghostscript.