Splite PDF by Size using IText7

Splite PDF by Size using IText7 - c#

I have found a inbuilt method to split pdf by size, however i am not sure about the parameter as it is asking for parameter to split the pdf (long size) and not sure using the parameter the split file in kb, mb. Can someone please suggest me the usage of this parameter, support i want to split the file of 95MB into 5MB files, so what value i need to pass as parameter i.e. 5000, 500, 50 or 5.
Code:
PdfReader reader = new PdfReader(fileName);
PdfDocument sourceDoc = new PdfDocument(reader);
IList<PdfDocument> pdfDocs = new CustomPdfSplitter(sourceDoc, outputFileName).SplitBySize(fileSize);
Code for CustomPdfSplitter:
private class CustomPdfSplitter : PdfSplitter
{
private String dest;
private int partNumber = 1;
public CustomPdfSplitter(PdfDocument pdfDocument, String dest) : base(pdfDocument)
{
this.dest = dest;
}
protected override PdfWriter GetNextPdfWriter(PageRange documentPageRange)
{
return new PdfWriter(dest.Replace(".pdf","_" + (partNumber++).ToString("0000")+".pdf"));
}
}
public string SplitPDFBySize(string fileName, string outputFileName, int fileSize)
{
StringBuilder outputMsg = new StringBuilder();
PdfReader reader = new PdfReader(fileName);
PdfDocument sourceDoc = new PdfDocument(reader);
long fileSizeinByte = fileSize * 1000000;
IList<PdfDocument> pdfDocs = new CustomPdfSplitter(sourceDoc, outputFileName).SplitBySize(fileSizeinByte);
try
{
PBChild.Value = 0;
PBChild.Maximum = pdfDocs.Count;
for (int i = 0; i < pdfDocs.Count; i++)
{
string splittedFile = outputFileName.Replace(".pdf", "_" + (i + 1).ToString("0000") + ".pdf");
pdfDocs[i].Close();
outputMsg.AppendLine(fileName + " splitted with output file: " + splittedFile);
PBChild.Value++;
}
}
catch (Exception ex)
{
outputMsg.AppendLine("Error: " + fileName + "||" + ex.Message);
}
finally
{
sourceDoc.Close();
}
return outputMsg.ToString();
}

SplitBySize take as parameter size specified in bytes. So, to split document into 5 MB documents you need to pass 5*2^20 as parameter.
Although, you're right that the documentation was slightly uninformative. We've already fixed that and you can find these fixes in the SNAPSHOT version.

Related

Convert PDF to TIFF using ImageMagick & C#

I have an existing program that does some processing a .pdf file and splitting it into multiple .pdf files based on looking for barcodes on the pages.
The program uses ImageMagick and C#.
I want to change it from outputting pdfs to outputting tifs. Look for the comment in the code below for where I would guess the change would be made.
I included the ImageMagick tag because someone might offer a commandline option that someone else can help me convert to C#.
private void BurstPdf(string bigPdfName, string targetfolder)
{
bool outputPdf = true; // change to false to output tif.
string outputExtension = "";
var settings = new MagickReadSettings { Density = new Density(200) };
string barcodePng = Path.Combine("C:\TEMP", "tmp.png");
using (MagickImageCollection pdfPageCollection = new MagickImageCollection())
{
pdfPageCollection.Read(bigPdfName, settings);
int inputPageCount = 0;
int outputPageCount = 0;
int outputFileCount = 0;
MagickImageCollection resultCollection = new MagickImageCollection();
string barcode = "";
string resultName = "";
IBarcodeReader reader = new BarcodeReader();
reader.Options.PossibleFormats = new List<BarcodeFormat>();
reader.Options.PossibleFormats.Add(BarcodeFormat.CODE_39);
reader.Options.TryHarder = false;
foreach (MagickImage pdfPage in pdfPageCollection)
{
MagickGeometry barcodeArea = getBarCodeArea(pdfPage);
IMagickImage barcodeImg = pdfPage.Clone();
barcodeImg.ColorType = ColorType.Bilevel;
barcodeImg.Depth = 1;
barcodeImg.Alpha(AlphaOption.Off);
barcodeImg.Crop(barcodeArea);
barcodeImg.Write(barcodePng);
inputPageCount++;
using (var barcodeBitmap = new Bitmap(barcodePng))
{
var result = reader.Decode(barcodeBitmap);
if (result != null)
{
// found a first page because it has bar code.
if (result.BarcodeFormat.ToString() == "CODE_39")
{
if (outputFileCount != 0)
{
// write out previous pages.
if (outputPdf) {
outputExtension = ".pdf";
} else {
// What do I put here to output a g4 compressed tif?
outputExtension = ".tif";
}
resultName = string.Format("{0:D4}", outputFileCount) + "-" + outputPageCount.ToString() + "-" + barcode + outputExtension;
resultCollection.Write(Path.Combine(targetfolder, resultName));
resultCollection = new MagickImageCollection();
}
barcode = standardizePhysicalBarCode(result.Text);
outputFileCount++;
resultCollection.Add(pdfPage);
outputPageCount = 1;
}
else
{
Console.WriteLine("WARNING barcode is not of type CODE_39 so something is wrong. check page " + inputPageCount + " of " + bigPdfName);
if (inputPageCount == 1)
{
throw new Exception("barcode not found on page 1. see " + barcodePng);
}
resultCollection.Add(pdfPage);
outputPageCount++;
}
}
else
{
if (inputPageCount == 1)
{
throw new Exception("barcode not found on page 1. see " + barcodePng);
}
resultCollection.Add(pdfPage);
outputPageCount++;
}
}
if (File.Exists(barcodePng))
{
File.Delete(barcodePng);
}
}
if (resultCollection.Count > 0)
{
if (outputPdf) {
outputExtension = ".pdf";
} else {
// What do I put here to output a g4 compressed tif?
outputExtension = ".tif";
}
resultName = string.Format("{0:D4}", outputFileCount) + "-" + outputPageCount.ToString() + "-" + barcode + outputExtension;
resultCollection.Write(Path.Combine(targetfolder, resultName));
outputFileCount++;
}
}
}
[EDIT] The above code is what I am using (which some untested modifications) to split a .pdf into other .pdfs. I want to know how to modify this code to output tiffs. I put a comment in the code where I think the change would go.
[EDIT] So encouraged by #fmw42 I just ran the code with the .tif extension enabled. Looks like it did convert to a .tif, but the tif is not compressed. I am surprised that IM just configures the output based on the extension name of the file. Handy I guess, but just seems a little loose.
[EDIT] I figured it out. Although counter-intuitive ones sets the compression on the read of the file. I am reading a .pdf but I set the compression to Group for like this:
var settings = new MagickReadSettings { Density = new Density(200), Compression = CompressionMethod.Group4 };
The thing I learned was that simply naming the output file .tif tells IM to output a tif. That is a handy way to do it, but it just seems sloppy.

ItextSharp modify takes time

I have a pdf template and I want to add an "option action" to that template, this should be done for all of the Recipients I have, which they're 4000.
This means I should end up with 4000 pdf files for each recipient.
But it takes a very long time, approx 500 files/12 mins.
Note: the template size is 340 KB
I built a windows service to do this job, and this is the code that does the job:
private static byte[] GeneratePdfFromPdfFile(byte[] file, string landingPage, string code)
{
try
{
using (var ms = new MemoryStream())
{
var reader = new PdfReader(file);
var stamper = new PdfStamper(reader, ms);
string _embeddedURL = "http://" + landingPage + "/Default.aspx?code=" + code + "&m=" + eventCode18;
PdfAction act = new PdfAction(_embeddedURL);
stamper.Writer.SetOpenAction(act);
stamper.Close();
reader.Close();
return ms.ToArray();
}
}
catch (Exception ex)
{
return null;
}
}
public static byte[] GenerateAttachment(AttachmentExtenstion type, string Contents, string FileName, string code, string landingPage, bool zipped, byte[] File = null)
{
byte[] finalVal = null;
try
{
switch (type)
{
case AttachmentExtenstion.PDF:
finalVal = GeneratePdfFromPdfFile(File, landingPage, code);
break;
case AttachmentExtenstion.WordX:
case AttachmentExtenstion.Word:
finalVal = GenerateWordFromDocFile(File, code, landingPage);
break;
case AttachmentExtenstion.HTML:
finalVal = GenerateHtmlFile(Contents, code, landingPage);
break;
}
return zipped ? _getZippedFile(finalVal, FileName) : finalVal;
}
catch (Exception ex)
{
return null;
}
finally
{
GC.Collect();
}
}
the GenerateAttachment Method is being called for each Recipient because the action is based on the recipient's id, and I'm using this approach to iterate through the List to speed up the process
Parallel.ForEach(listRecipients.AsEnumerable(),
new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount * 2 },
(item) =>
{
File.WriteAllBytes(AppDomain.CurrentDomain.BaseDirectory + #"Attachments\" + group.Code + #"\" + item.Scenario.CampaignCode + #"\" + item.CMPRCode + "." + extension,
GenerateAttachment(_type, value, item.AttachmentName, item.CMPRCode, item.Scenario.LandingDomain, item.Scenario.AttachmentZip.Value));
});

c# load txt file and split it to X files based on number of lines

this is the code that i've written so far...
it doesnt do the job except re-write every line on the same file over and over again...
*RecordCntPerFile = 10K
*FileNumberName = 1 (file number one)
*Full File name should be something like this: 1_asci_split
string FileFullPath = DestinationFolder + "\\" + FileNumberName + FileNamePart + FileExtension;
using (System.IO.StreamReader sr = new System.IO.StreamReader(SourceFolder + "\\" + SourceFileName))
{
for (int i = 0; i <= (RecordCntPerFile - 1); i++)
{
using (StreamWriter sw = new StreamWriter(FileFullPath))
{
{ sw.Write(sr.Read() + "\n"); }
}
}
FileNumberName++;
}
Dts.TaskResult = (int)ScriptResults.Success;
}

If I understood correctly, you want to split a big file in smaller files with maximum of 10k lines. I see 2 problems on your code:
You never change the FullFilePath variable. So you will always rewrite on the same file
You always read and write the whole source file to the target file.
I rewrote your code to fit the behavior I said earlier. You just have to modify the strings.
int maxRecordsPerFile = 10000;
int currentFile = 1;
using (StreamReader sr = new StreamReader("source.txt"))
{
int currentLineCount = 0;
List<string> content = new List<string>();
while (!sr.EndOfStream)
{
content.Add(sr.ReadLine());
if (++currentLineCount == maxRecordsPerFile || sr.EndOfStream)
{
using (StreamWriter sw = new StreamWriter(string.Format("file{0}.txt", currentFile)))
{
foreach (var line in content)
sw.WriteLine(line);
}
content = new List<string>();
currentFile++;
currentLineCount = 0;
}
}
}
Of course you can do better than that, as you don't need to create that string and do that foreach loop. I just made this quick example to give you the idea. To improve the performance is up to you

How to merge multiple pdf files (generated in run time)?

How to merge multiple pdf files (generated on run time) through ItextSharp then printing them.
I found the following link but that method requires the pdf names considering that the pdf files stored and this is not my case .
I have multiple reports i'll convert them to pdf files through this method :
private void AddReportToResponse(LocalReport followsReport)
{
string mimeType;
string encoding;
string extension;
string[] streams = new string[100];
Warning[] warnings = new Warning[100];
byte[] pdfStream = followsReport.Render("PDF", "", out mimeType, out encoding, out extension, out streams, out warnings);
//Response.Clear();
//Response.ContentType = mimeType;
//Response.AddHeader("content-disposition", "attachment; filename=Application." + extension);
//Response.BinaryWrite(pdfStream);
//Response.End();
}
Now i want to merge all those generated files (Bytes) in one pdf file to print it

If you want to merge source documents using iText(Sharp), there are two basic situations:
You really want to merge the documents, acquiring the pages in their original format, transfering as much of their content and their interactive annotations as possible. In this case you should use a solution based on a member of the Pdf*Copy* family of classes.
You actually want to integrate pages from the source documents into a new document but want the new document to govern the general format and don't care for the interactive features (annotations...) in the original documents (or even want to get rid of them). In this case you should use a solution based on the PdfWriter class.
You can find details in chapter 6 (especially section 6.4) of iText in Action — 2nd Edition. The Java sample code can be accessed here and the C#'ified versions here.
A simple sample using PdfCopy is Concatenate.java / Concatenate.cs. The central piece of code is:
byte[] mergedPdf = null;
using (MemoryStream ms = new MemoryStream())
{
using (Document document = new Document())
{
using (PdfCopy copy = new PdfCopy(document, ms))
{
document.Open();
for (int i = 0; i < pdf.Count; ++i)
{
PdfReader reader = new PdfReader(pdf[i]);
// loop over the pages in that document
int n = reader.NumberOfPages;
for (int page = 0; page < n; )
{
copy.AddPage(copy.GetImportedPage(reader, ++page));
}
}
}
}
mergedPdf = ms.ToArray();
}
Here pdf can either be defined as a List<byte[]> immediately containing the source documents (appropriate for your use case of merging intermediate in-memory documents) or as a List<String> containing the names of source document files (appropriate if you merge documents from disk).
An overview at the end of the referenced chapter summarizes the usage of the classes mentioned:
PdfCopy: Copies pages from one or more existing PDF documents. Major downsides: PdfCopy doesn’t detect redundant content, and it fails when concatenating forms.
PdfCopyFields: Puts the fields of the different forms into one form. Can be used to avoid the problems encountered with form fields when concatenating forms using PdfCopy. Memory use can be an issue.
PdfSmartCopy: Copies pages from one or more existing PDF documents. PdfSmartCopy is able to detect redundant content, but it needs more memory and CPU than PdfCopy.
PdfWriter: Generates PDF documents from scratch. Can import pages from other PDF documents. The major downside is that all interactive features of the imported page (annotations, bookmarks, fields, and so forth) are lost in the process.

I used iTextsharp with c# to combine pdf files. This is the code I used.
string[] lstFiles=new string[3];
lstFiles[0]=#"C:/pdf/1.pdf";
lstFiles[1]=#"C:/pdf/2.pdf";
lstFiles[2]=#"C:/pdf/3.pdf";
PdfReader reader = null;
Document sourceDocument = null;
PdfCopy pdfCopyProvider = null;
PdfImportedPage importedPage;
string outputPdfPath=#"C:/pdf/new.pdf";
sourceDocument = new Document();
pdfCopyProvider = new PdfCopy(sourceDocument, new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create));
//Open the output file
sourceDocument.Open();
try
{
//Loop through the files list
for (int f = 0; f < lstFiles.Length-1; f++)
{
int pages =get_pageCcount(lstFiles[f]);
reader = new PdfReader(lstFiles[f]);
//Add pages of current file
for (int i = 1; i <= pages; i++)
{
importedPage = pdfCopyProvider.GetImportedPage(reader, i);
pdfCopyProvider.AddPage(importedPage);
}
reader.Close();
}
//At the end save the output file
sourceDocument.Close();
}
catch (Exception ex)
{
throw ex;
}
private int get_pageCcount(string file)
{
using (StreamReader sr = new StreamReader(File.OpenRead(file)))
{
Regex regex = new Regex(#"/Type\s*/Page[^s]");
MatchCollection matches = regex.Matches(sr.ReadToEnd());
return matches.Count;
}
}

Here is some code I pulled out of an old project I had. It was a web application but I was using iTextSharp to merge pdf files then print them.
public static class PdfMerger
{
/// <summary>
/// Merge pdf files.
/// </summary>
/// <param name="sourceFiles">PDF files being merged.</param>
/// <returns></returns>
public static byte[] MergeFiles(List<Stream> sourceFiles)
{
Document document = new Document();
MemoryStream output = new MemoryStream();
try
{
// Initialize pdf writer
PdfWriter writer = PdfWriter.GetInstance(document, output);
writer.PageEvent = new PdfPageEvents();
// Open document to write
document.Open();
PdfContentByte content = writer.DirectContent;
// Iterate through all pdf documents
for (int fileCounter = 0; fileCounter < sourceFiles.Count; fileCounter++)
{
// Create pdf reader
PdfReader reader = new PdfReader(sourceFiles[fileCounter]);
int numberOfPages = reader.NumberOfPages;
// Iterate through all pages
for (int currentPageIndex = 1; currentPageIndex <=
numberOfPages; currentPageIndex++)
{
// Determine page size for the current page
document.SetPageSize(
reader.GetPageSizeWithRotation(currentPageIndex));
// Create page
document.NewPage();
PdfImportedPage importedPage =
writer.GetImportedPage(reader, currentPageIndex);
// Determine page orientation
int pageOrientation = reader.GetPageRotation(currentPageIndex);
if ((pageOrientation == 90) || (pageOrientation == 270))
{
content.AddTemplate(importedPage, 0, -1f, 1f, 0, 0,
reader.GetPageSizeWithRotation(currentPageIndex).Height);
}
else
{
content.AddTemplate(importedPage, 1f, 0, 0, 1f, 0, 0);
}
}
}
}
catch (Exception exception)
{
throw new Exception("There has an unexpected exception" +
" occured during the pdf merging process.", exception);
}
finally
{
document.Close();
}
return output.GetBuffer();
}
}
/// <summary>
/// Implements custom page events.
/// </summary>
internal class PdfPageEvents : IPdfPageEvent
{
#region members
private BaseFont _baseFont = null;
private PdfContentByte _content;
#endregion
#region IPdfPageEvent Members
public void OnOpenDocument(PdfWriter writer, Document document)
{
_baseFont = BaseFont.CreateFont(BaseFont.HELVETICA,
BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
_content = writer.DirectContent;
}
public void OnStartPage(PdfWriter writer, Document document)
{ }
public void OnEndPage(PdfWriter writer, Document document)
{ }
public void OnCloseDocument(PdfWriter writer, Document document)
{ }
public void OnParagraph(PdfWriter writer,
Document document, float paragraphPosition)
{ }
public void OnParagraphEnd(PdfWriter writer,
Document document, float paragraphPosition)
{ }
public void OnChapter(PdfWriter writer, Document document,
float paragraphPosition, Paragraph title)
{ }
public void OnChapterEnd(PdfWriter writer,
Document document, float paragraphPosition)
{ }
public void OnSection(PdfWriter writer, Document document,
float paragraphPosition, int depth, Paragraph title)
{ }
public void OnSectionEnd(PdfWriter writer,
Document document, float paragraphPosition)
{ }
public void OnGenericTag(PdfWriter writer, Document document,
Rectangle rect, string text)
{ }
#endregion
private float GetCenterTextPosition(string text, PdfWriter writer)
{
return writer.PageSize.Width / 2 - _baseFont.GetWidthPoint(text, 8) / 2;
}
}
I didn't write this, but made some modifications. I can't remember where I found it. After I merged the PDFs I would call this method to insert javascript to open the print dialog when the PDF is opened. If you change bSilent to true then it should print silently to their default printer.
public Stream addPrintJStoPDF(Stream thePDF)
{
MemoryStream outPutStream = null;
PRStream finalStream = null;
PdfDictionary page = null;
string content = null;
//Open the stream with iTextSharp
var reader = new PdfReader(thePDF);
outPutStream = new MemoryStream(finalStream.GetBytes());
var stamper = new PdfStamper(reader, (MemoryStream)outPutStream);
var jsText = "var res = app.setTimeOut('this.print({bUI: true, bSilent: false, bShrinkToFit: false});', 200);";
//Add the javascript to the PDF
stamper.JavaScript = jsText;
stamper.FormFlattening = true;
stamper.Writer.CloseStream = false;
stamper.Close();
//Set the stream to the beginning
outPutStream.Position = 0;
return outPutStream;
}
Not sure how well the above code is written since I pulled it from somewhere else and I haven't worked in depth at all with iTextSharp but I do know that it did work at merging PDFs that I was generating at runtime.

Tested with iTextSharp-LGPL 4.1.6:
public static byte[] ConcatenatePdfs(IEnumerable<byte[]> documents)
{
using (var ms = new MemoryStream())
{
var outputDocument = new Document();
var writer = new PdfCopy(outputDocument, ms);
outputDocument.Open();
foreach (var doc in documents)
{
var reader = new PdfReader(doc);
for (var i = 1; i <= reader.NumberOfPages; i++)
{
writer.AddPage(writer.GetImportedPage(reader, i));
}
writer.FreeReader(reader);
reader.Close();
}
writer.Close();
outputDocument.Close();
var allPagesContent = ms.GetBuffer();
ms.Flush();
return allPagesContent;
}
}

To avoid the memory issues mentioned, I used file stream instead of memory stream(mentioned in ITextSharp Out of memory exception merging multiple pdf) to merge pdf files:
var parentDirectory = Directory.GetParent(SelectedDocuments[0].FilePath);
var savePath = parentDirectory + "\\MergedDocument.pdf";
using (var fs = new FileStream(savePath, FileMode.Create))
{
using (var document = new Document())
{
using (var pdfCopy = new PdfCopy(document, fs))
{
document.Open();
for (var i = 0; i < SelectedDocuments.Count; i++)
{
using (var pdfReader = new PdfReader(SelectedDocuments[i].FilePath))
{
for (var page = 0; page < pdfReader.NumberOfPages;)
{
pdfCopy.AddPage(pdfCopy.GetImportedPage(pdfReader, ++page));
}
}
}
}
}
}

****/*For Multiple PDF Print..!!*/****
<button type="button" id="btnPrintMultiplePdf" runat="server" class="btn btn-primary btn-border btn-sm"
onserverclick="btnPrintMultiplePdf_click">
<i class="fa fa-file-pdf-o"></i>Print Multiple pdf</button>
protected void btnPrintMultiplePdf_click(object sender, EventArgs e)
{
if (ValidateForMultiplePDF() == true)
{
#region Declare Temp Variables..!!
CheckBox chkList = new CheckBox();
HiddenField HidNo = new HiddenField();
string Multi_fofile, Multi_listfile;
Multi_fofile = Multi_listfile = "";
Multi_fofile = Server.MapPath("PDFRNew");
#endregion
for (int i = 0; i < grdRnew.Rows.Count; i++)
{
#region Find Grd Controls..!!
CheckBox Chk_One = (CheckBox)grdRnew.Rows[i].FindControl("chkOne");
Label lbl_Year = (Label)grdRnew.Rows[i].FindControl("lblYear");
Label lbl_No = (Label)grdRnew.Rows[i].FindControl("lblCode");
#endregion
if (Chk_One.Checked == true)
{
HidNo .Value = llbl_No .Text.Trim()+ lbl_Year .Text;
if (File.Exists(Multi_fofile + "\\" + HidNo.Value.ToString() + ".pdf"))
{
#region Get Multiple Files Name And Paths..!!
if (Multi_listfile != "")
{
Multi_listfile = Multi_listfile + ",";
}
Multi_listfile = Multi_listfile + Multi_fofile + "\\" + HidNo.Value.ToString() + ".pdf";
#endregion
}
}
}
#region For Generate Multiple Pdf..!!
if (Multi_listfile != "")
{
String[] Multifiles = Multi_listfile.Split(',');
string DestinationFile = Server.MapPath("PDFRNew") + "\\Multiple.Pdf";
MergeFiles(DestinationFile, Multifiles);
Response.ContentType = "pdf";
Response.AddHeader("Content-Disposition", "attachment;filename=\"" + DestinationFile + "\"");
Response.TransmitFile(DestinationFile);
Response.End();
}
else
{
}
#endregion
}
}
private void MergeFiles(string DestinationFile, string[] SourceFiles)
{
try
{
int f = 0;
/**we create a reader for a certain Document**/
PdfReader reader = new PdfReader(SourceFiles[f]);
/**we retrieve the total number of pages**/
int n = reader.NumberOfPages;
/**Console.WriteLine("There are " + n + " pages in the original file.")**/
/**Step 1: creation of a document-object**/
Document document = new Document(reader.GetPageSizeWithRotation(1));
/**Step 2: we create a writer that listens to the Document**/
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(DestinationFile, FileMode.Create));
/**Step 3: we open the Document**/
document.Open();
PdfContentByte cb = writer.DirectContent;
PdfImportedPage page;
int rotation;
/**Step 4: We Add Content**/
while (f < SourceFiles.Length)
{
int i = 0;
while (i < n)
{
i++;
document.SetPageSize(reader.GetPageSizeWithRotation(i));
document.NewPage();
page = writer.GetImportedPage(reader, i);
rotation = reader.GetPageRotation(i);
if (rotation == 90 || rotation == 270)
{
cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
}
else
{
cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
}
/**Console.WriteLine("Processed page " + i)**/
}
f++;
if (f < SourceFiles.Length)
{
reader = new PdfReader(SourceFiles[f]);
/**we retrieve the total number of pages**/
n = reader.NumberOfPages;
/**Console.WriteLine("There are"+n+"pages in the original file.")**/
}
}
/**Step 5: we Close the Document**/
document.Close();
}
catch (Exception e)
{
string strOb = e.Message;
}
}
private bool ValidateForMultiplePDF()
{
bool chkList = false;
foreach (GridViewRow gvr in grdRnew.Rows)
{
CheckBox Chk_One = (CheckBox)gvr.FindControl("ChkSelectOne");
if (Chk_One.Checked == true)
{
chkList = true;
}
}
if (chkList == false)
{
divStatusMsg.Style.Add("display", "");
divStatusMsg.Attributes.Add("class", "alert alert-danger alert-dismissable");
divStatusMsg.InnerText = "ERROR !!...Please Check At Least On CheckBox.";
grdRnew.Focus();
set_timeout();
return false;
}
return true;
}

JPG to PDF Convertor in C#

I would like to convert from an image (like jpg or png) to PDF.
I've checked out ImageMagickNET, but it is far too complex for my needs.
What other .NET solutions or code are there for converting an image to a PDF?

Easy with iTextSharp:
class Program
{
static void Main(string[] args)
{
Document document = new Document();
using (var stream = new FileStream("test.pdf", FileMode.Create, FileAccess.Write, FileShare.None))
{
PdfWriter.GetInstance(document, stream);
document.Open();
using (var imageStream = new FileStream("test.jpg", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
var image = Image.GetInstance(imageStream);
document.Add(image);
}
document.Close();
}
}
}

iTextSharp does it pretty cleanly and is open source. Also, it has a very good accompanying book by the author which I recommend if you end up doing more interesting things like managing forms. For normal usage, there are plenty resources on mailing lists and newsgroups for samples of how to do common things.
EDIT: as alluded to in #Chirag's comment, #Darin's answer has code that definitely compiles with current versions.
Example usage:
public static void ImagesToPdf(string[] imagepaths, string pdfpath)
{
using(var doc = new iTextSharp.text.Document())
{
iTextSharp.text.pdf.PdfWriter.GetInstance(doc, new FileStream(pdfpath, FileMode.Create));
doc.Open();
foreach (var item in imagepaths)
{
iTextSharp.text.Image image = iTextSharp.text.Image.GetInstance(item);
doc.Add(image);
}
}
}

Another working code, try it
public void ImagesToPdf(string[] imagepaths, string pdfpath)
{
iTextSharp.text.Rectangle pageSize = null;
using (var srcImage = new Bitmap(imagepaths[0].ToString()))
{
pageSize = new iTextSharp.text.Rectangle(0, 0, srcImage.Width, srcImage.Height);
}
using (var ms = new MemoryStream())
{
var document = new iTextSharp.text.Document(pageSize, 0, 0, 0, 0);
iTextSharp.text.pdf.PdfWriter.GetInstance(document, ms).SetFullCompression();
document.Open();
var image = iTextSharp.text.Image.GetInstance(imagepaths[0].ToString());
document.Add(image);
document.Close();
File.WriteAllBytes(pdfpath+"cheque.pdf", ms.ToArray());
}
}

One we've had great luck with is PDFSharp (we use it for TIFF and Text to PDF conversion for hundreds of medical claims every day).
http://pdfsharp.com/PDFsharp/

Such task can be easily done with help of Docotic.Pdf library.
Here is a sample that creates PDF from given images (not only JPGs, actually):
public static void imagesToPdf(string[] images, string pdfName)
{
using (PdfDocument pdf = new PdfDocument())
{
for (int i = 0; i < images.Length; i++)
{
if (i > 0)
pdf.AddPage();
PdfPage page = pdf.Pages[i];
string imagePath = images[i];
PdfImage pdfImage = pdf.AddImage(imagePath);
page.Width = pdfImage.Width;
page.Height = pdfImage.Height;
page.Canvas.DrawImage(pdfImage, 0, 0);
}
pdf.Save(pdfName);
}
}
Disclaimer: I work for the vendor of the library.

If you want to do it in a cross-platform way, without any thirty part library,
or paying any license, you can use this code.
It takes an array of pictures (I think it only works only with jpg) with its sizes and return a pdf file, with one picture per page.
You have to create two files:
File Picture:
using System;
using System.Collections.Generic;
using System.Text;
namespace PDF
{
public class Picture
{
private byte[] data;
private int width;
private int height;
public byte[] Data { get => data; set => data = value; }
public int Width { get => width; set => width = value; }
public int Height { get => height; set => height = value; }
}
}
File PDFExport:
using System;
using System.Collections.Generic;
namespace PDF
{
public class PDFExport
{
private string company = "Your Company Here";
public sbyte[] createFile(List<Picture> pictures)
{
int N = (pictures.Count + 1) * 3;
string dateTimeStr = DateTime.Now.ToString("yyyyMMddhhmmss");
string file1 =
"%PDF-1.4\n";
string file2 =
"2 0 obj\n" +
"<<\n" +
"/Type /Pages\n" +
getKids(pictures) +
"/Count " + pictures.Count + "\n" +
">>\n" +
"endobj\n" +
"1 0 obj\n" +
"<<\n" +
"/Type /Catalog\n" +
"/Pages 2 0 R\n" +
"/PageMode /UseNone\n" +
"/PageLayout /SinglePage\n" +
"/Metadata 7 0 R\n" +
">>\n" +
"endobj\n" +
N + " 0 obj\n" +
"<<\n" +
"/Creator(" + company + ")\n" +
"/Producer(" + company + ")\n" +
"/CreationDate (D:" + dateTimeStr + ")\n" +
"/ModDate (D:" + dateTimeStr + ")\n" +
">>\n" +
"endobj\n" +
"xref\n" +
"0 " + (N + 1) + "\n" +
"0000000000 65535 f\n" +
"0000224088 00000 n\n" +
"0000224031 00000 n\n" +
"0000000015 00000 n\n" +
"0000222920 00000 n\n" +
"0000222815 00000 n\n" +
"0000224153 00000 n\n" +
"0000223050 00000 n\n" +
"trailer\n" +
"<<\n" +
"/Size " + (N + 1) + "\n" +
"/Root 1 0 R\n" +
"/Info 6 0 R\n" +
">>\n" +
"startxref\n" +
"0\n" +
"%% EOF";
sbyte[] part1 = file1.GetBytes();
sbyte[] part2 = file2.GetBytes();
List<sbyte[]> fileContents = new List<sbyte[]>();
fileContents.Add(part1);
for (int i = 0; i < pictures.Count; i++)
{
fileContents.Add(getPageFromImage(pictures[i], i));
}
fileContents.Add(part2);
return getFileContent(fileContents);
}
private string getKids(List<Picture> pictures)
{
string kids = "/Kids[";
for (int i = 0; i < pictures.Count; i++)
{
kids += (3 * (i + 1) + 1) + " 0 R ";
}
kids += "]\n";
return kids;
}
private sbyte[] getPageFromImage(Picture picture, int P)
{
int N = (P + 1) * 3;
string imageStart =
N + " 0 obj\n" +
"<<\n" +
"/Type /XObject\n" +
"/Subtype /Image\n" +
"/Width " + picture.Width + "\n" +
"/Height " + picture.Height + "\n" +
"/BitsPerComponent 8\n" +
"/ColorSpace /DeviceRGB\n" +
"/Filter /DCTDecode\n" +
"/Length " + picture.Data.Length + "\n" +
">>\n" +
"stream\n";
string dimentions = "q\n" +
picture.Width + " 0 0 " + picture.Height + " 0 0 cm\n" +
"/X0 Do\n" +
"Q\n";
string imageEnd =
"\nendstream\n" +
"endobj\n" +
(N + 2) + " 0 obj\n" +
"<<\n" +
"/Filter []\n" +
"/Length " + dimentions.Length + "\n" +
">>\n" +
"stream\n";
string page =
"\nendstream\n" +
"endobj\n" +
(N + 1) + " 0 obj\n" +
"<<\n" +
"/Type /Page\n" +
"/MediaBox[0 0 " + picture.Width + " " + picture.Height + "]\n" +
"/Resources <<\n" +
"/XObject <<\n" +
"/X0 " + N + " 0 R\n" +
">>\n" +
">>\n" +
"/Contents 5 0 R\n" +
"/Parent 2 0 R\n" +
">>\n" +
"endobj\n";
List<sbyte[]> fileContents = new List<sbyte[]>();
fileContents.Add(imageStart.GetBytes());
fileContents.Add(byteArrayToSbyteArray(picture.Data));
fileContents.Add(imageEnd.GetBytes());
fileContents.Add(dimentions.GetBytes());
fileContents.Add(page.GetBytes());
return getFileContent(fileContents);
}
private sbyte[] byteArrayToSbyteArray(byte[] data)
{
sbyte[] data2 = new sbyte[data.Length];
for (int i = 0; i < data2.Length; i++)
{
data2[i] = (sbyte)data[i];
}
return data2;
}
private sbyte[] getFileContent(List<sbyte[]> fileContents)
{
int fileSize = 0;
foreach (sbyte[] content in fileContents)
{
fileSize += content.Length;
}
sbyte[] finaleFile = new sbyte[fileSize];
int index = 0;
foreach (sbyte[] content in fileContents)
{
for (int i = 0; i < content.Length; i++)
{
finaleFile[index + i] = content[i];
}
index += content.Length;
}
return finaleFile;
}
}
}
You can use the code in this easy way
///////////////////////////////////////Export PDF//////////////////////////////////////
private sbyte[] exportPDF(List<Picture> images)
{
if (imageBytesList.Count > 0)
{
PDFExport pdfExport = new PDFExport();
sbyte[] fileData = pdfExport.createFile(images);
return fileData;
}
return null;
}

You need Acrobat to be installed. Tested on Acrobat DC. This is a VB.net code. Due to that these objects are COM objects, you shall do a 'release object', not just a '=Nothing". You can convert this code here: https://converter.telerik.com/
Private Function ImageToPDF(ByVal FilePath As String, ByVal DestinationFolder As String) As String
Const PDSaveCollectGarbage As Integer = 32
Const PDSaveLinearized As Integer = 4
Const PDSaveFull As Integer = 1
Dim PDFAVDoc As Object = Nothing
Dim PDFDoc As Object = Nothing
Try
'Check destination requirements
If Not DestinationFolder.EndsWith("\") Then DestinationFolder += "\"
If Not System.IO.Directory.Exists(DestinationFolder) Then Throw New Exception("Destination directory does not exist: " & DestinationFolder)
Dim CreatedFile As String = DestinationFolder & System.IO.Path.GetFileNameWithoutExtension(FilePath) & ".pdf"
'Avoid conflicts, therefore previous file there will be deleted
If File.Exists(CreatedFile) Then File.Delete(CreatedFile)
'Get PDF document
PDFAVDoc = GetPDFAVDoc(FilePath)
PDFDoc = PDFAVDoc.GetPDDoc
If Not PDFDoc.Save(PDSaveCollectGarbage Or PDSaveLinearized Or PDSaveFull, CreatedFile) Then Throw New Exception("PDF file cannot be saved: " & PDFDoc.GetFileName())
If Not PDFDoc.Close() Then Throw New Exception("PDF file could not be closed: " & PDFDoc.GetFileName())
PDFAVDoc.Close(1)
Return CreatedFile
Catch Ex As Exception
Throw Ex
Finally
System.Runtime.InteropServices.Marshal.ReleaseComObject(PDFDoc)
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(PDFDoc)
PDFDoc = Nothing
System.Runtime.InteropServices.Marshal.ReleaseComObject(PDFAVDoc)
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(PDFAVDoc)
PDFAVDoc = Nothing
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()
End Try
End Function

not sure if you're looking for just free / open source solutions or considering commercial ones as well. But if you're including commercial solutions, there's a toolkit called EasyPDF SDK that offers an API for converting images (plus a number of other file types) to PDF. It supports C# and can be found here:
http://www.pdfonline.com/
The C# code would look as follows:
Printer oPrinter = new Printer();
ImagePrintJob oPrintJob = oPrinter.ImagePrintJob;
oPrintJob.PrintOut(imageFile, pdfFile);
To be fully transparent, I should disclaim that I do work for the makers of EasyPDF SDK (hence my handle), so this suggestion is not without some personal bias :) But feel free to check out the eval version if you're interested. Cheers!

I use Sautinsoft, its very simple:
SautinSoft.PdfMetamorphosis p = new SautinSoft.PdfMetamorphosis();
p.Serial="xxx";
p.HtmlToPdfConvertStringToFile("<html><body><img src=\""+filename+"\"></img></body></html>","output.pdf");

You may try to convert any Images to PDF using this code sample:
PdfVision v = new PdfVision();
ImageToPdfOptions options = new ImageToPdfOptions();
options.JpegQuality = 95;
try
{
v.ConvertImageToPdf(new string[] {inpFile}, outFile, options);
System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(outFile) { UseShellExecute = true });
}
catch (Exception ex)
{
Console.WriteLine($"Error: {ex.Message}");
Console.ReadLine();
}
Or if you need to convert Image Class to PDF:
System.Drawing.Image image = Image.FromFile(#"..\..\image-jpeg.jpg");
string outFile = new FileInfo(#"Result.pdf").FullName;
PdfVision v = new PdfVision();
ImageToPdfOptions options = new ImageToPdfOptions();
options.PageSetup.PaperType = PaperType.Auto;
byte[] imgBytes = null;
using (MemoryStream ms = new System.IO.MemoryStream())
{
image.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
imgBytes = ms.ToArray();
}
try
{
v.ConvertImageToPdf(imgBytes, outFile, options);
System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(outFile) { UseShellExecute = true });
}
catch (Exception ex)
{
Console.WriteLine($"Error: {ex.Message}");
Console.ReadLine();
}

Many diff tools out there. One I use is PrimoPDF (FREE) http://www.primopdf.com/ you go to print the file and you print it to pdf format onto your drive. works on Windows

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Splite PDF by Size using IText7 - c#

SplitBySize take as parameter size specified in bytes. So, to split document into 5 MB documents you need to pass 5*2^20 as parameter. Although, you're right that the documentation was slightly uninformative. We've already fixed that and you can find these fixes in the SNAPSHOT version.

Related

Convert PDF to TIFF using ImageMagick & C#

ItextSharp modify takes time

c# load txt file and split it to X files based on number of lines

How to merge multiple pdf files (generated in run time)?

JPG to PDF Convertor in C#

Categories

Resources