I have two PDF files and I want to merge them into a single PDF file using IronPDF (https://ironpdf.com/). Here is the code I am using:
var PDFs = new List<PdfDocument>();
foreach (var file in files)
{
PDFs.Add(PdfDocument.FromFile(file));
}
PdfDocument PDF = PdfDocument.Merge(PDFs);
newFileName = Path.Combine(TEMP_PDF_FILESTORE_LOCATION, newFileName);
PDF.SaveAs(newFileName);
While merging the two PDF files, I get the error "Could not safely read page objects from AnotherPdfFile". One of the PDFs may contain an image. Some PDFs with images merge fine, while others throw this error.
How can we get rid of this error?
I got the same error (Could not safely read page objects from AnotherPdfFile) when I tried to merge PDF documents that were constructed using streams that came from another service.
To solve this, I had to first copy each stream into a MemoryStream and then pass the memory stream into the PdfDocument constructor. With memory streams, IronPdf was able to merge the documents.
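The buffering step described in this answer can be sketched as follows. This is a minimal sketch using only the .NET base class library; the helper name `BufferToMemory` is hypothetical, and the commented usage line assumes, as the answer states, that `PdfDocument` can be constructed from a stream.

```csharp
using System.IO;

public static class StreamBuffering
{
    // Hypothetical helper: copies any readable stream into a seekable MemoryStream.
    public static MemoryStream BufferToMemory(Stream source)
    {
        var buffer = new MemoryStream();
        source.CopyTo(buffer);   // reads from the source's current position to its end
        buffer.Position = 0;     // rewind so the consumer reads from the start
        return buffer;
    }
}

// Usage (assumption: a stream-accepting PdfDocument constructor, as described in the answer):
// var pdf = new PdfDocument(StreamBuffering.BufferToMemory(serviceStream));
```

Streams returned by a remote service are often forward-only and not seekable, which may be why buffering them first makes a difference here.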
I want to find out whether a given text is present in an uploaded PDF file in ASP.NET C#.
using (MemoryStream str = new MemoryStream(this.docUploadField.FileBytes))
{
using (StreamReader sr = new StreamReader(str, Encoding.UTF8))
{
string line = sr.ReadToEnd();
}
}
I am getting the below as the file content when I read the contents of the file.
Please help me with this
You will need a PDF reading library.
The best known are:
iText (iTextSharp, for those who remember it): https://github.com/itext/itext7-dotnet
PdfSharp: https://github.com/empira/PDFsharp
There are many other free options as well.
With these you open the PDF file, read it, and extract the text you need.
They usually give you a collection of the PDF elements (paragraphs, images, and so on), which you loop through or query with a search function to find what you need.
I create a PDF file using PdfStamper and I want to save it to two different files (by changing the path passed to the PdfStamper). Do I need to create a new PdfStamper, or is there a way to save the same file in multiple places?
// that's my code
PdfStamper stamper = new PdfStamper(rdr, new System.IO.FileStream(path, System.IO.FileMode.Create));
If I understand you correctly, you need to put the same file in different places, right? It seems to me the most logical approach is to perform all the necessary operations on one PDF file and then make a copy of it using System.IO.File.Copy(path, new_path);
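The copy step suggested in this answer might look like the sketch below. It uses only `System.IO`; the helper name `CopyToAll` and its parameters are hypothetical. It assumes the stamper has already been closed, so the source file is completely written before it is copied.

```csharp
using System.IO;

public static class PdfDuplication
{
    // Hypothetical helper: copies a finished file to each additional destination.
    public static void CopyToAll(string sourcePath, params string[] destinationPaths)
    {
        foreach (string destination in destinationPaths)
        {
            // overwrite: true replaces any file that already exists at the destination
            File.Copy(sourcePath, destination, overwrite: true);
        }
    }
}
```

Passing `overwrite: true` is a design choice for repeated runs; leave it out if an existing file at the destination should raise an exception instead.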
I've been working on an application that reads images from multiple Word files and stores them in a single Word file, using Microsoft.Office.Interop.Word in C#.
EDIT: I also need to save a copy of the images on the file system, so I need the image in a Bitmap or similar object.
This is my implementation so far, which works fine:
foreach (InlineShape shape in doc.InlineShapes)
{
shape.Range.Select();
if (shape.Type == WdInlineShapeType.wdInlineShapePicture)
{
doc.ActiveWindow.Selection.Range.CopyAsPicture();
ImageData = Clipboard.GetDataObject();
object _ob1 = ImageData.GetData(DataFormats.Bitmap);
bmp = (Bitmap)_ob1;
images[i++] = bmp;
/*
bmp.Save("C:\\Users\\Akshay\\Pictures\\bitmaps\\test" + i.ToString() + ".bmp");
*/
}
}
I have:
Selected the images as InlineShapes
Copied the shape into Clipboard
Stored the shape in the Clipboard in a DataObject
Extracted the shape from the DataObject in Bitmap format and stored in a Bitmap object.
I've been told to refrain from using Clipboard in Word automation and use the Word APIs instead.
I've read up on it and found an SO answer stating the same.
I looked up many implementations of reading images from Word files on MSDN, SO etc. but could not find any without using clipboard.
How do I read images from Word files using only the Word APIs in the Microsoft.Office.Interop.Word namespace, without using the Clipboard?
Word documents in the Office Open XML file format store images in Base64. So it should be possible to extract that information and convert/stream it to a file. You can access the information when the document is open in the Word application using the Range.WordOpenXML property.
string shapeBase64 = shape.Range.WordOpenXML;
This will return the entire Word Open XML in the flat file OPC format. In other words, it won't contain only the picture in Base64, but the entire zip package definition as XML that surrounds it. In my quick test, the tag that contains the actual Base64 is
<pkg:binaryData>
That's a child element of
<pkg:part pkg:name="/word/media/image1.jpg" pkg:contentType="image/jpeg" pkg:compression="store">
Note that it would also be possible for you to get the entire document's WordOpenXML in one step:
document.Content.WordOpenXML
but you might then need to understand how the InlineShapes in the document body are linked to the actual information in the "media" part.
And it would be possible, of course, to work directly with the Zip Package (using the Open XML SDK, perhaps) instead of opening the document in the Word.Application.
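Decoding the Base64 from the flat OPC XML could be sketched as below. This is a minimal sketch, not a tested Word automation routine: the class and method names are hypothetical, and it assumes the `pkg:` prefix maps to the namespace `http://schemas.microsoft.com/office/2006/xmlPackage` and that images live in parts whose name starts with `/word/media/`, as in the example tag above.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class FlatOpcImages
{
    // Assumed namespace behind the pkg: prefix in Word's flat OPC XML.
    private static readonly XNamespace Pkg =
        "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Returns part name -> decoded bytes for every part under /word/media/.
    public static Dictionary<string, byte[]> ExtractMedia(string wordOpenXml)
    {
        XDocument package = XDocument.Parse(wordOpenXml);
        var result = new Dictionary<string, byte[]>();

        foreach (XElement part in package.Descendants(Pkg + "part"))
        {
            string name = (string)part.Attribute(Pkg + "name");
            XElement data = part.Element(Pkg + "binaryData");

            // Skip parts that are not media or carry no Base64 payload.
            if (name == null || data == null ||
                !name.StartsWith("/word/media/", StringComparison.Ordinal))
            {
                continue;
            }

            // FromBase64String ignores the line breaks inside the element text.
            result[name] = Convert.FromBase64String(data.Value);
        }
        return result;
    }
}
```

Each byte array can then be written to disk with `File.WriteAllBytes`, or wrapped in a `MemoryStream` and loaded as a Bitmap, for example by passing `shape.Range.WordOpenXML` as the input string.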
I am trying to use the LibTiff.Net library and rewrite the TiffCP merge tool API to use memory streams.
This library has a Tiff class, and when you pass a stream to this class it can merge TIFF images into that stream.
For testing, I passed in a FileStream and got what I wanted: it merged, and I was able to see a multipage TIFF.
But when I pass a MemoryStream, I can verify that the page data is being added to the stream as I loop through, yet when I write it to a file at the end I see only the first page.
var mso = new MemoryStream();
var fso = new FileStream(@"C:\test\ttest.tif",FileMode.OpenOrCreate); //This works
using (Tiff outImage = Tiff.ClientOpen("custom", "w", mso, tso))
{
//...
//..
System.Drawing.Image tiffImg = System.Drawing.Image.FromStream(mso, true);
tiffImg.Save(@"C:\test\test2.tiff", System.Drawing.Imaging.ImageFormat.Tiff);
tiffImg.Dispose();
//..
//..
}
P.S.: I need it in a MemoryStream because of some folder permissions on the servers and vendor API reasons.
You are probably using the memory stream before the data is actually written into it.
Please call the Tiff.Flush() method before accessing data in the memory stream, and make sure you call the Tiff.WriteDirectory() method for each page you create.
EDIT:
Please also take a look at Bob Powell's article on Generating Multi-Page TIFF files. The article shows how to use EncoderParameters to actually generate a multipage TIFF.
Using
tiffImg.Save(@"C:\test\test2.tiff", System.Drawing.Imaging.ImageFormat.Tiff);
you are probably saving only the first frame.
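If the MemoryStream already holds the complete encoded TIFF once the library has finished writing, one way to sidestep the single-frame save is to write the stream's bytes to disk directly instead of re-encoding through System.Drawing.Image. A minimal sketch, assuming the Tiff object has been flushed or closed beforehand; the helper name `SaveAll` is hypothetical.

```csharp
using System.IO;

public static class MemoryStreamDump
{
    // Hypothetical helper: writes everything the stream holds,
    // regardless of the stream's current Position.
    public static void SaveAll(MemoryStream stream, string path)
    {
        // ToArray copies the whole buffer and still works after the stream is closed.
        File.WriteAllBytes(path, stream.ToArray());
    }
}
```

Because `ToArray` does not depend on `Position` or on the stream being open, this can be called after the `using` block that owns the Tiff object has ended.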
I have a requirement to generate a PDF from multiple different PDFs of unknown page size:
1) Create a cover sheet from a template and write the text onto it.
2) Pull a PDF (unknown page size) and append it to the above.
3) Repeat until all required PDFs are attached.
Step 1 is not a problem and is already working, so I have a cover sheet PDF generated. I now need a way to append the additional PDFs as above. How can we achieve this using iTextSharp?
If you are trying to concatenate multiple PDF files into one you may take a look at the following post.
I found a simple way to do this using the PdfCopyFields class in iTextSharp:
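using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

// Concatenates the source PDF streams, in order, into Dest.
void MergePdfStreams(List<Stream> Source, Stream Dest)
{
    var copy = new PdfCopyFields(Dest);
    foreach (Stream source in Source)
    {
        var reader = new PdfReader(source);
        copy.AddDocument(reader); // appends this document to the output
    }
    copy.Close(); // finishes writing the merged PDF to Dest
}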
Source: Is there a straight forward way to append one PDF doc to another using iTextSharp?