Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Can I merge two or more PDFs in ASP.NET? I know I can merge Word and Excel files using interop, but can I merge PDFs?
Please share any suggestions or links.
Try iTextSharp:
iTextSharp is a C# port of iText, an open source Java library for
PDF generation and manipulation. It can be used to create PDF
documents from scratch, to convert XML to PDF (using the extra XFA
Worker DLL), to fill out interactive PDF forms, to stamp new content
on existing PDF documents, to split and merge existing PDF documents,
and much more.
Here's an article on how to do it.
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Call this method from Main with the output path and the input file paths.
public static void MergePages(string outputPdfPath, string[] lstFiles)
{
    using (var outputStream = new FileStream(outputPdfPath, FileMode.Create))
    {
        var sourceDocument = new Document();
        var pdfCopyProvider = new PdfCopy(sourceDocument, outputStream);
        sourceDocument.Open();

        // Visit every input file; a bound of "Length - 1" would skip the last one.
        for (int f = 0; f < lstFiles.Length; f++)
        {
            var reader = new PdfReader(lstFiles[f]);
            // Copy all pages of the current file (page numbers are 1-based).
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                PdfImportedPage importedPage = pdfCopyProvider.GetImportedPage(reader, i);
                pdfCopyProvider.AddPage(importedPage);
            }
            reader.Close();
        }
        sourceDocument.Close();
    }
}
Related
I am splitting a PDF document into multiple documents of one page each. After splitting, I perform some operations and then merge the documents back into a single PDF. I am using PDFsharp in C# to do this. The problem is that when I split the document and merge the pages back, the file size increases from 1.96 MB to 12.2 MB. After thorough testing, I have found that the problem lies not in the operations I perform after splitting but in the splitting and merging itself. These are the functions I have written:
using System;
using System.Collections.Generic;
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Splits the input PDF into one single-page document per page.
public static List<Stream> SplitPdf(Stream PdfDoc)
{
    System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
    List<Stream> outputStreamList = new List<Stream>();
    PdfSharp.Pdf.PdfDocument inputDocument = PdfReader.Open(PdfDoc, PdfDocumentOpenMode.Import);
    for (int idx = 0; idx < inputDocument.PageCount; idx++)
    {
        PdfSharp.Pdf.PdfDocument outputDocument = new PdfSharp.Pdf.PdfDocument();
        outputDocument.Version = inputDocument.Version;
        outputDocument.Info.Title =
            String.Format("Page {0} of {1}", idx + 1, inputDocument.Info.Title);
        outputDocument.Info.Creator = inputDocument.Info.Creator;
        outputDocument.AddPage(inputDocument.Pages[idx]);
        MemoryStream stream = new MemoryStream();
        outputDocument.Save(stream);
        outputStreamList.Add(stream);
    }
    return outputStreamList;
}

// Merges the given PDF streams, in order, into a single document.
public static Stream MergePdfs(List<Stream> PdfFiles)
{
    System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
    PdfSharp.Pdf.PdfDocument outputPDFDocument = new PdfSharp.Pdf.PdfDocument();
    foreach (Stream pdfFile in PdfFiles)
    {
        PdfSharp.Pdf.PdfDocument inputPDFDocument = PdfReader.Open(pdfFile, PdfDocumentOpenMode.Import);
        outputPDFDocument.Version = inputPDFDocument.Version;
        foreach (PdfSharp.Pdf.PdfPage page in inputPDFDocument.Pages)
        {
            outputPDFDocument.AddPage(page);
        }
    }
    Stream compiledPdfStream = new MemoryStream();
    outputPDFDocument.Save(compiledPdfStream);
    return compiledPdfStream;
}
My questions are:
1. Why am I getting this behaviour?
2. Is there a solution where I can split and merge and still get a file of the same size? (Any open-source C# library will do.)
Replying to question 1:
When splitting the files, every file will contain all resources required by the pages it contains.
When merging with PDFsharp again, resources will not be merged and the final document may contain duplicated resources (fonts, images), thus leading to larger files.
This is by design.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 months ago.
I've got a request from a stakeholder who wants us to automate the following procedure.
Go to the CMS website (In-house application)
Take a picture of the report.
Send an email to stakeholders with the reports attached.
Note: This procedure must be repeated on a daily basis.
I'm not sure which project type to choose for this; at the moment, all I can think of is a Console Application, but I don't know much about it.
Any assistance would be much appreciated.
Code For Screenshot - Selenium C#
public class ScreenShotRepository
{
    // Takes a full-page screenshot and writes it, with optional caption lines, to a PDF.
    public static void TakeScreenShot(IWebDriver Driver, string filename, List<string> text = null)
    {
        var bytesArr = Driver.TakeScreenshot(new VerticalCombineDecorator(new ScreenshotMaker()));
        var screenshotImage = (System.Drawing.Image)((new ImageConverter()).ConvertFrom(bytesArr));
        WriteToPDF(new List<System.Drawing.Image>() { screenshotImage }, filename, text);
    }

    // Writes one PDF page per screenshot, sized to the first screenshot.
    public static void WriteToPDF(List<System.Drawing.Image> screenshots, string filename, List<string> text)
    {
        // The using block releases the file handle even if PDF generation fails.
        using (var fileStream = new FileStream(filename, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            var document = new Document(new iTextSharp.text.Rectangle(0, 0, screenshots[0].Width, screenshots[0].Height), 0, 0, 0, 0);
            var writer = PdfWriter.GetInstance(document, fileStream);
            document.Open();
            var content = writer.DirectContent;
            var font = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
            for (int i = 0; i < screenshots.Count; i++)
            {
                var image = iTextSharp.text.Image.GetInstance(screenshots[i], screenshots[i].RawFormat);
                document.Add(image);
                WriteText(content, font, text);
                if (i + 1 != screenshots.Count)
                    document.NewPage();
            }
            document.Close();
            writer.Close();
        }
    }

    public static void WriteText(PdfContentByte content, BaseFont font, List<string> text)
    {
        // text is optional (TakeScreenShot defaults it to null), so there may be nothing to draw.
        if (text == null || text.Count == 0)
            return;
        content.BeginText();
        content.SetColorFill(BaseColor.GREEN);
        content.SetFontAndSize(font, 40);
        for (int j = 0; j < text.Count; j++)
            content.ShowTextAligned(Element.ALIGN_LEFT, text[j], 50, 50 + 50 * j, 0);
        content.EndText();
    }
}
You could make this a Windows Service, because of the daily call requirement.
However, the simplest way is indeed a console application that you schedule to run using your operating system's task scheduler.
As for the requirements, why can't the reporting system output a PDF itself? Taking a screenshot of third-party software is already a makeshift solution; having to screenshot your own reporting software, and to automate that outside the CMS team's domain, suggests that whoever maintains the in-house CMS is not up to the task.
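To make the scheduled console application suggested in this answer more concrete, here is a minimal sketch of the e-mail step using System.Net.Mail. It assumes the report file has already been produced (for example by the screenshot code in the question). The class name DailyReportJob, the example.com addresses, the subject format, and the smtpHost parameter are all placeholders, not details from the question.

```csharp
using System;
using System.Globalization;
using System.Net.Mail;

public static class DailyReportJob
{
    // Builds the e-mail with the report attached. The sender, recipients and
    // subject format are placeholders, not values from the question.
    public static MailMessage BuildReportMail(
        string from, string[] recipients, string reportPath, DateTime day)
    {
        var message = new MailMessage
        {
            From = new MailAddress(from),
            Subject = "Daily report " + day.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture),
            Body = "The daily report is attached."
        };
        foreach (string recipient in recipients)
            message.To.Add(recipient);
        // Attachment opens the file, so reportPath must already exist.
        message.Attachments.Add(new Attachment(reportPath));
        return message;
    }

    // Call this from the console application's Main once the report file
    // has been produced; smtpHost is a placeholder for the real mail server.
    public static void Run(string reportPath, string smtpHost)
    {
        using (var message = BuildReportMail(
            "reports@example.com",
            new[] { "stakeholder@example.com" },
            reportPath,
            DateTime.Today))
        using (var smtp = new SmtpClient(smtpHost))
        {
            smtp.Send(message);
        }
    }
}
```

Keeping the message construction separate from the sending makes it possible to check recipients and attachments without a mail server. The compiled executable can then be registered as a daily task in the operating system's scheduler.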
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I know people have asked similar questions already, but the solutions are not what I am looking for. In my case, our GridView holds at least a million records. In addition, our customer doesn't like the warning message from Excel 2007, so we cannot use the most common approach, GridView.RenderControl(). We decided to try OpenXML instead. But from all the sample code I have found, creating an Excel file with OpenXML seems to require looping over each row and column of the GridView and writing to each cell of the Excel file, which will take a good amount of time. Does anyone know of a better or faster solution? Also, we cannot use third-party DLLs for security reasons. Thanks.
Here is a method that I use to export a DataTable to Excel with ClosedXML (XLWorkbook). I created a public static class Extensions to house these methods.
using System;
using System.Data;
using System.IO;
using System.Web.UI;
using ClosedXML.Excel;

internal static void ExportToXcel_MyDataTable(DataTable dt, string fileName, Page page)
{
    var recCount = dt.Rows.Count;
    RemoveHtmlSpecialChars(dt);
    // fileName is expected to contain a {0} placeholder for the timestamp.
    fileName = string.Format(fileName, DateTime.Now.ToString("MMddyyyy_hhmmss"));
    var xlsx = new XLWorkbook();
    var ws = xlsx.Worksheets.Add("Some Report Name");

    // Report header: title, timestamp and record count.
    ws.Style.Font.Bold = true;
    ws.Cell("C5").Value = "MY TEST EXCEL REPORT";
    ws.Cell("C5").Style.Font.FontColor = XLColor.Black;
    ws.Cell("C5").Style.Font.SetFontSize(16.0);
    ws.Cell("E5").Value = DateTime.Now.ToString("MM/dd/yyyy HH:mm");
    ws.Range("C5:E5").Style.Font.SetFontSize(16.0);
    ws.Cell("A7").Value = string.Format("{0} Records", recCount);
    ws.Style.Font.Bold = false;

    // Insert the whole DataTable in one call, starting at row 9, column 1.
    ws.Cell(9, 1).InsertTable(dt.AsEnumerable());
    ws.Row(9).InsertRowsBelow(1);
    // ws.Style.Font.FontColor = XLColor.Gray;
    ws.Columns("1-8").AdjustToContents();
    ws.Tables.Table(0).ShowAutoFilter = true;
    ws.Style.Alignment.Horizontal = XLAlignmentHorizontalValues.Center;
    DynaGenExcelFile(fileName, page, xlsx);
}

private static void DynaGenExcelFile(string fileName, Page page, XLWorkbook xlsx)
{
    page.Response.ClearContent();
    page.Response.ClearHeaders();
    // Use the .xlsx MIME type so the declared type matches the file extension.
    page.Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    page.Response.AppendHeader("Content-Disposition", string.Format("attachment;filename={0}.xlsx", fileName));
    using (MemoryStream memoryStream = new MemoryStream())
    {
        xlsx.SaveAs(memoryStream);
        memoryStream.WriteTo(page.Response.OutputStream);
    }
    page.Response.Flush();
    page.Response.End();
}
If the DataTable contains HTML special characters, this method removes them by replacing each occurrence with string.Empty.
/// <summary>
/// Remove all HTML special characters from datatable field if they are present
/// </summary>
/// <param name="dt"></param>
private static void RemoveHtmlSpecialChars(DataTable dt)
{
    for (int rows = 0; rows < dt.Rows.Count; rows++)
    {
        for (int column = 0; column < dt.Columns.Count; column++)
        {
            // Strip the non-breaking-space entity left over from HTML rendering.
            dt.Rows[rows][column] = dt.Rows[rows][column].ToString().Replace("&nbsp;", string.Empty);
        }
    }
}
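As a possible variation on the helper above: if the aim is to turn HTML entities back into plain characters rather than delete them, System.Net.WebUtility.HtmlDecode can be applied to each string cell. This is only a sketch. DataTableCleaner, CleanCell and DecodeHtmlEntities are hypothetical names, and mapping non-breaking spaces to ordinary spaces is an assumption about what the report should contain.

```csharp
using System;
using System.Data;
using System.Net;

public static class DataTableCleaner
{
    // Decodes HTML entities (e.g. &amp; becomes &) and replaces
    // non-breaking spaces with ordinary spaces, then trims the result.
    public static string CleanCell(object value)
    {
        string text = Convert.ToString(value) ?? string.Empty;
        string decoded = WebUtility.HtmlDecode(text);
        return decoded.Replace('\u00A0', ' ').Trim();
    }

    // Cleans string columns only, so numeric and date columns keep their values.
    public static void DecodeHtmlEntities(DataTable dt)
    {
        foreach (DataRow row in dt.Rows)
        {
            foreach (DataColumn column in dt.Columns)
            {
                if (column.DataType == typeof(string) && !row.IsNull(column))
                    row[column] = CleanCell(row[column]);
            }
        }
    }
}
```

Restricting the pass to string columns also avoids the round trip through ToString() for numeric and date values.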
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 3 years ago.
I am trying to find a way to write purchase order data to a PDF file. Can anyone suggest an approach?
I can't afford to buy a third-party DLL, so I'm looking for a free DLL or some other way.
I tried this one: (http://www.codeproject.com/Articles/7627/PDF-Library-for-creating-PDF-with-tables-and-text) but it didn't help.
Use iTextSharp. It is a C# port of iText, an open source Java library for PDF generation and manipulation. It can be used to create PDF documents from scratch, to convert XML to PDF (using the extra XFA Worker DLL), to fill out interactive PDF forms, to stamp new content on existing PDF documents, to split and merge existing PDF documents, and much more.
Features
PDF generation
PDF manipulation (stamping watermarks, merging/splitting PDFs,...)
PDF form filling
XML functionality
Digital signatures
Create the PDF along these lines and it will work:
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

// Calling code, e.g. inside a page event handler:
string templatePath = Server.MapPath("/AOF.pdf");
var formContent = ListFieldNames(templatePath);
formContent["Name_txt"] = "T.Test";
FillForm(formContent);
// var pdfContents = FillForm(templatePath, formContent);

// Returns every AcroForm field name in the PDF, mapped to an empty value.
public Dictionary<string, string> ListFieldNames(string filePath)
{
    var fields = new Dictionary<string, string>();
    var reader = new PdfReader(filePath);
    foreach (var entry in reader.AcroFields.Fields)
        fields.Add(entry.Key, string.Empty);
    reader.Close();
    return fields;
}

// Fills the form and returns the flattened PDF as a byte array.
public byte[] FillForm(string pdfPath, Dictionary<string, string> formFieldMap)
{
    var output = new MemoryStream();
    var reader = new PdfReader(pdfPath);
    var stamper = new PdfStamper(reader, output);
    var formFields = stamper.AcroFields;
    foreach (var fieldName in formFieldMap.Keys)
        formFields.SetField(fieldName, formFieldMap[fieldName]);
    stamper.FormFlattening = true;
    stamper.Close();
    reader.Close();
    return output.ToArray();
}

// Fills the form and writes the flattened PDF to disk.
public void FillForm(Dictionary<string, string> formFieldMap)
{
    string pdfTemplate = Server.MapPath("/AOF.pdf");
    string newFile = @"C:\Users\USer\Desktop\completed_fw4.pdf";
    var pdfReader = new PdfReader(pdfTemplate);
    using (var output = new FileStream(newFile, FileMode.Create))
    {
        var pdfStamper = new PdfStamper(pdfReader, output);
        AcroFields pdfFormFields = pdfStamper.AcroFields;
        foreach (var fieldName in formFieldMap.Keys)
            pdfFormFields.SetField(fieldName, formFieldMap[fieldName]);
        pdfStamper.FormFlattening = true;
        pdfStamper.Close();
    }
    pdfReader.Close();
}
You can use the iText library for .NET.
You can find some useful information here:
http://www.ujihara.jp/iTextdotNET/en/examples.html
Download the iText library from:
http://itextpdf.com/
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
How do I extract an image from a PDF file using C#? Thanks!
You could use iTextSharp. Here's an example.
Docotic.Pdf library can be used to extract images from PDFs.
Here is a sample that shows how to iterate through pages and extract all images from each PDF page:
static void ExtractImagesFromPdfPages()
{
    string path = "";
    using (PdfDocument pdf = new PdfDocument(path))
    {
        for (int i = 0; i < pdf.Pages.Count; i++)
        {
            for (int j = 0; j < pdf.Pages[i].Images.Count; j++)
            {
                string imageName = string.Format("page{0}-image{1}", i, j);
                string imagePath = pdf.Pages[i].Images[j].Save(imageName);
            }
        }
    }
}
The library won't resample images. It will save them exactly the same as in PDF.
Disclaimer: I work for Bit Miracle, vendor of the library.