How to extract rotated images from PDF with iText

I need to extract images from PDF.
I know that some images are rotated 90 degrees (I checked with online tools).
I'm using this code:
PdfRenderListener:
public class PdfRenderListener : IExtRenderListener
{
    // other methods ...
    public void RenderImage(ImageRenderInfo renderInfo)
    {
        try
        {
            var mtx = renderInfo.GetImageCTM();
            var image = renderInfo.GetImage();
            var fillColor = renderInfo.GetCurrentFillColor();
            var color = Color.FromArgb(fillColor?.RGB ?? Color.Empty.ToArgb());
            var fileType = image.GetFileType();
            var extension = "." + fileType;
            var bytes = image.GetImageAsBytes();
            var height = mtx[Matrix.I22];
            var width = mtx[Matrix.I11];
            // rotated image
            if (height == 0 && width == 0)
            {
                var h = Math.Abs(mtx[Matrix.I12]);
                var w = Math.Abs(mtx[Matrix.I21]);
            }
            // save image
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
    }
}
When I save images with this code the rotated images are saved with distortion.
I have read the post "iText 7 ImageRenderInfo Matrix contains negative height on Even number Pages" and mkl's answer to it.
The current transformation matrix (mtx) contains these values:
0        841.9    0
-595.1   0        0
595.1    0        1
I know the image is rotated 90 degrees. How can I transform it to get a normal image?

As @mkl mentioned, the real cause was not the rotation of the image but the filter applied to it.
I analyzed the PDF file with iText RUPS and found that the image was encoded with a CCITTFaxDecode filter:
[Screenshot: RUPS screen]
Next, I looked for ways to decode this filter and found these questions:
- "Extracting image from PDF with /CCITTFaxDecode filter"
- "How to use Bit Miracle LibTiff.Net to write the image to a MemoryStream"
I used the BitMiracle.LibTiff.NET library and wrote this method:
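using System;
using System.IO;
using BitMiracle.LibTiff.Classic;

// Wraps the raw CCITT-encoded bytes from the PDF stream in a TIFF container,
// so that any TIFF-capable reader can decode the image.
private byte[] DecodeInternal(byte[] rawBytes, int width, int height, int k, int bitsPerComponent)
{
    var compression = GetCompression(k);
    using var ms = new MemoryStream();
    var tms = new TiffStream();
    using var tiff = Tiff.ClientOpen("in-memory", "w", ms, tms);
    tiff.SetField(TiffTag.IMAGEWIDTH, width);
    tiff.SetField(TiffTag.IMAGELENGTH, height);
    tiff.SetField(TiffTag.COMPRESSION, compression);
    tiff.SetField(TiffTag.BITSPERSAMPLE, bitsPerComponent);
    tiff.SetField(TiffTag.SAMPLESPERPIXEL, 1);
    // Write the data as-is; it is already compressed.
    var writeResult = tiff.WriteRawStrip(0, rawBytes, rawBytes.Length);
    if (writeResult == -1)
    {
        Console.WriteLine("Decoding error");
    }
    // Flush the TIFF directory to the stream before copying the bytes out.
    tiff.CheckpointDirectory();
    var decodedBytes = ms.ToArray();
    tiff.Close();
    return decodedBytes;
}

// Maps the /K entry of the CCITTFaxDecode parameters to a TIFF compression scheme.
private Compression GetCompression(int k)
{
    return k switch
    {
        < 0 => Compression.CCITTFAX4, // K < 0: Group 4
        0 => Compression.CCITTFAX3,   // K = 0: Group 3, one-dimensional
        _ => throw new NotImplementedException("K > 0"),
    };
}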
After decoding and rotating the image, I was able to save a normal image. Thanks everyone for the help.
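The rotation step itself is not shown above. As a minimal sketch, assuming the usual PDF layout of the transformation matrix ([a b 0; c d 0; e f 1], so that a = I11, b = I12, c = I21, d = I22), the rotation angle and the size of the image on the page can be read from the values given in the question. The class and method names are hypothetical and not part of iText:

```csharp
using System;

// Hypothetical helper, not part of iText: interprets the four linear
// entries of an image CTM laid out as [a b 0; c d 0; e f 1].
public static class CtmInfo
{
    // Counter-clockwise angle, in degrees, of the image's x-axis on the page.
    public static double RotationDegrees(double a, double b)
    {
        return Math.Atan2(b, a) * 180.0 / Math.PI;
    }

    // Length of the image's x-axis on the page, in user space units.
    public static double WidthOnPage(double a, double b)
    {
        return Math.Sqrt(a * a + b * b);
    }

    // Length of the image's y-axis on the page, in user space units.
    public static double HeightOnPage(double c, double d)
    {
        return Math.Sqrt(c * c + d * d);
    }

    public static void Main()
    {
        // Values from the question: I11 = 0, I12 = 841.9, I21 = -595.1, I22 = 0.
        double a = 0, b = 841.9, c = -595.1, d = 0;

        Console.WriteLine("rotation: {0:F1} degrees", RotationDegrees(a, b));
        Console.WriteLine("size on page: {0:F1} x {1:F1}", WidthOnPage(a, b), HeightOnPage(c, d));
    }
}
```

For this matrix the angle comes out at about 90 degrees, which matches what the online tools reported. Unlike comparing I11 and I22 with 0, this calculation also covers rotations that are not exact quarter turns.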

You can try this. I'm using iText 7 for Java. You still need to write your own listener:
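import com.itextpdf.kernel.pdf.canvas.parser.EventType;
import com.itextpdf.kernel.pdf.canvas.parser.data.IEventData;
import com.itextpdf.kernel.pdf.canvas.parser.data.ImageRenderInfo;
import com.itextpdf.kernel.pdf.canvas.parser.listener.IEventListener;
import com.itextpdf.kernel.pdf.xobject.PdfImageXObject;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Set;

public class MyImageRenderListener implements IEventListener {
    // Used as a format string: the object number and the file extension are substituted into it.
    protected String path;
    protected String extension;

    public MyImageRenderListener (String path) {
        this.path = path;
    }

    public void eventOccurred(IEventData data, EventType type) {
        switch (type) {
            case RENDER_IMAGE:
                try {
                    String filename;
                    FileOutputStream os;
                    ImageRenderInfo renderInfo = (ImageRenderInfo) data;
                    PdfImageXObject image = renderInfo.getImage();
                    if (image == null) {
                        return;
                    }
                    // true: return the decoded image bytes
                    byte[] imageByte = image.getImageBytes(true);
                    extension = image.identifyImageFileExtension();
                    filename = String.format(path, image.getPdfObject().getIndirectReference().getObjNumber(), extension);
                    os = new FileOutputStream(filename);
                    os.write(imageByte);
                    os.flush();
                    os.close();
                } catch (com.itextpdf.io.exceptions.IOException | IOException e) {
                    System.out.println(e.getMessage());
                }
                break;
            default:
                break;
        }
    }

    public Set<EventType> getSupportedEvents() {
        // null means that all event types are supported
        return null;
    }
}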
I checked this with a PDF containing an image at an arbitrary rotation angle and one rotated by 90 degrees; in both cases the resulting picture was saved without distortion:
public void manipulatePdf() throws IOException {
    PdfDocument pdfDoc = new PdfDocument(new PdfReader("path to pdf"), new PdfWriter(new ByteArrayOutputStream()));
    // The path is used as a format string by the listener (object number, extension).
    MyImageRenderListener listener = new MyImageRenderListener("path to resulting image");
    PdfCanvasProcessor parser = new PdfCanvasProcessor(listener);
    for (int i = 1; i <= pdfDoc.getNumberOfPages(); i++) {
        parser.processPageContent(pdfDoc.getPage(i));
    }
    pdfDoc.close();
}

Related

Reduce PDF size using iTextSharp in .Net

I am converting a number of JPEG or PNG images to PDF using the iTextSharp DLL. The conversion works, but the size of the resulting PDF worries me. If I convert 9 JPEG images (total size 4.5 MB) into a single PDF, the PDF is 12.3 MB. Below is the conversion part.
private bool CreatePdf(string stFilePath_in, List<ImageData> lstImageData_in, string doctype, string stproCompid)
{
    bool flag = false;
    StringBuilder builder = new StringBuilder();
    try
    {
        this.UtilityProgress(lstImageData_in.Count);
        builder.Append(stFilePath_in);
        builder.Append(@"\");
        builder.Append(lstImageData_in[0].Barcode);
        builder.Append(".pdf");
        Document document = new Document(PageSize.LETTER, 10f, 10f, 42f, 35f);
        PdfWriter.GetInstance(document, new FileStream(builder.ToString(), FileMode.OpenOrCreate));
        document.Open();
        IOrderedEnumerable<ImageData> enumerable = from files in lstImageData_in
                                                   orderby files.PageNo
                                                   select files;
        if (enumerable != null)
        {
            DbFileData data2;
            foreach (ImageData data in enumerable)
            {
                Bitmap bitmap = new Bitmap(data.FilePath);
                iTextSharp.text.Image instance = iTextSharp.text.Image.GetInstance(bitmap, ImageFormat.Png);
                if (instance.Height > instance.Width)
                {
                    float num = 0f;
                    num = 700f / instance.Height;
                    instance.ScalePercent(num * 100f);
                }
                else
                {
                    float num2 = 0f;
                    num2 = 540f / instance.Width;
                    instance.ScalePercent(num2 * 100f);
                }
                instance.Border = 15;
                instance.BorderColor = BaseColor.BLACK;
                instance.BorderWidth = 3f;
                document.Add(instance);
                document.NewPage();
                bitmap.Dispose();
            }
            document.Close();
            if (doctype == "AR")
            {
                //data2.m_stInvoiceNo = lstImageData_in[0].Barcode.Substring(2);
                data2.m_stInvoiceNo = lstImageData_in[0].Barcode.ToString();
                data2.m_doctype = "AR";
            }
            else
            {
                data2.m_stInvoiceNo = lstImageData_in[0].Barcode.ToString();
                data2.m_doctype = "PO";
            }
            data2.m_stImgLocation = builder.ToString();
            string str = DateTime.Now.ToString("MM/dd/yy,hh:mm:ss");
            data2.m_dtDate = DateTime.Now.Date;
            data2.m_stTime = str.Substring(str.IndexOf(",") + 1);
            data2.m_stcompid = stproCompid;
            this.OnPdfFileCreationCompleted(data2);
            return true;
        }
        flag = false;
    }
    catch (Exception exception)
    {
        flag = false;
        StringBuilder builder2 = new StringBuilder();
        builder2.Append(builder.ToString());
        builder2.Append(": \t");
        builder2.Append(exception.Message);
        this.m_excepLogger.LogException(builder2.ToString());
    }
    return flag;
}

The OP creates the iTextSharp Image object like this:
Bitmap bitmap = new Bitmap(data.FilePath);
iTextSharp.text.Image instance = iTextSharp.text.Image.GetInstance(bitmap, ImageFormat.Png);
What this actually means is that the original image file is decoded into a bitmap, and then iTextSharp is asked to use the bitmap as if it was a PNG image.
In case of JPG images this usually means that the amount of data required to store the image explodes.
To prevent such size explosions one should allow iTextSharp to directly work with the original image file data, in the context at hand:
iTextSharp.text.Image instance = iTextSharp.text.Image.GetInstance(data.FilePath);

iTextSharp: Extracted CMYK Image is inverted

I have been using iTextSharp to extract text from a PDF, and I also want to extract all the images, which may be grayscale JPGs, CMYK JPGs, and PNGs (according to the iTextSharp filter). When I extract the images, the gray JPGs and the PNGs show correctly. However, the CMYK JPGs look wrong, as if their colors were inverted after extraction. Here is my code:
internal class ImageRenderListener : IRenderListener
{
    #region Fields
    List<System.Drawing.Image> images = new List<System.Drawing.Image>();
    #endregion Fields

    #region Properties
    public List<System.Drawing.Image> Images
    {
        get { return images; }
    }
    #endregion Properties

    #region Methods
    #region Public Methods
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderImage(ImageRenderInfo renderInfo)
    {
        PdfImageObject image = renderInfo.GetImage();
        PdfName filter = (PdfName)image.Get(PdfName.FILTER);
        //int width = Convert.ToInt32(image.Get(PdfName.WIDTH).ToString());
        //int bitsPerComponent = Convert.ToInt32(image.Get(PdfName.BITSPERCOMPONENT).ToString());
        //string subtype = image.Get(PdfName.SUBTYPE).ToString();
        //int height = Convert.ToInt32(image.Get(PdfName.HEIGHT).ToString());
        //int length = Convert.ToInt32(image.Get(PdfName.LENGTH).ToString());
        //string colorSpace = image.Get(PdfName.COLORSPACE).ToString();
        /* It appears to be safe to assume that when filter == null, PdfImageObject
         * does not know how to decode the image to a System.Drawing.Image.
         *
         * Uncomment the code above to verify, but when I've seen this happen,
         * width, height and bits per component all equal zero as well. */
        if (filter != null)
        {
            System.Drawing.Image drawingImage = image.GetDrawingImage();
            var DimParams = renderInfo.GetImageCTM();
            string extension = string.Empty;
            if (filter == PdfName.DCTDECODE)
            {
                extension += PdfImageObject.ImageBytesType.JPG.FileExtension;
            }
            else if (filter == PdfName.JPXDECODE)
            {
                extension += PdfImageObject.ImageBytesType.JP2.FileExtension;
            }
            else if (filter == PdfName.FLATEDECODE)
            {
                extension += PdfImageObject.ImageBytesType.PNG.FileExtension;
            }
            else if (filter == PdfName.LZWDECODE)
            {
                extension += PdfImageObject.ImageBytesType.CCITT.FileExtension;
            }
            /* Rather than struggle with the image stream and try to figure out how to handle
             * BitMapData scan lines in various formats (like virtually every sample I've found
             * online), use the PdfImageObject.GetDrawingImage() method, which does the work for us. */
            Images.Add(drawingImage);
        }
    }
    public void RenderText(TextRenderInfo renderInfo) { }
    #endregion Public Methods
    #endregion Methods
}
I think the problem is somehow in the way the stream is decoded, but I do not understand how to fix it.
Thanks to anyone who has an idea.
The extracted image was attached to the original post (updated); here is a link to the PDF:
http://docdro.id/ZoHmiAd

How to reduce image file, resize without losing quality using MultipleFileUpload in ASP.NET C#

I am uploading multiple photos using MultipleFileUpload. If I upload large images, the slider does not show them at a fixed size and they do not look right. Is there a way to restrict the size of the gallery images at upload time?
Below is my C# code:
protected void lnkbtn_Submit_Click(object sender, EventArgs e)
{
    try
    {
        if (MultipleFileUpload.HasFiles)
        {
            int MaxGalleryId, ReturnValue;
            ReturnValue = obj.fnCreateNewPhotoGallery(txtGalleryName.Text, txtGalleryDescrption.Text, DateTime.Now, out MaxGalleryId);
            if (ReturnValue != 0)
            {
                string GalleryPath = System.Configuration.ConfigurationManager.AppSettings["GalleryPath"] + MaxGalleryId;
                Directory.CreateDirectory(Server.MapPath(GalleryPath));
                string ThumbnailPath = System.Configuration.ConfigurationManager.AppSettings["ThumbnailPath"] + MaxGalleryId;
                Directory.CreateDirectory(Server.MapPath(ThumbnailPath));
                StringBuilder UploadedFileNames = new StringBuilder();
                foreach (HttpPostedFile uploadedFile in MultipleFileUpload.PostedFiles)
                {
                    //Upload file
                    string FileName = HttpUtility.HtmlEncode(Path.GetFileName(uploadedFile.FileName));
                    string SaveAsImage = System.IO.Path.Combine(Server.MapPath(GalleryPath + "/"), FileName);
                    uploadedFile.SaveAs(SaveAsImage);
                    //Create thumbnail for uploaded file and save thumbnail on disk
                    Bitmap Thumbnail = CreateThumbnail(SaveAsImage, 200, 200);
                    string SaveAsThumbnail = System.IO.Path.Combine(Server.MapPath(ThumbnailPath + "/"), FileName);
                    Thumbnail.Save(SaveAsThumbnail);
                }
                HTMLHelper.jsAlertAndRedirect(this, "Gallery created successfully. ", "Album.aspx?GalleryId=" + MaxGalleryId);
            }
        }
    }
    catch
    {
        HTMLHelper.jsAlertAndRedirect(this, "Gallery is not created. Some exception occured ", "CreateAlbum.aspx");
    }
}
Below is my CreateThumbnail method:
public Bitmap CreateThumbnail(string ImagePath, int ThumbnailWidth, int ThumbnailHeight)
{
    System.Drawing.Bitmap Thumbnail = null;
    try
    {
        Bitmap ImageBMP = new Bitmap(ImagePath);
        ImageFormat loFormat = ImageBMP.RawFormat;
        decimal lengthRatio;
        int ThumbnailNewWidth = 0;
        int ThumbnailNewHeight = 0;
        decimal ThumbnailRatioWidth;
        decimal ThumbnailRatioHeight;
        // If the uploaded image is smaller than the thumbnail size then just return it
        if (ImageBMP.Width <= ThumbnailWidth && ImageBMP.Height <= ThumbnailHeight)
            return ImageBMP;
        // Compute best ratio to scale entire image based on larger dimension.
        if (ImageBMP.Width > ImageBMP.Height)
        {
            ThumbnailRatioWidth = (decimal)ThumbnailWidth / ImageBMP.Width;
            ThumbnailRatioHeight = (decimal)ThumbnailHeight / ImageBMP.Height;
            lengthRatio = Math.Min(ThumbnailRatioWidth, ThumbnailRatioHeight);
            ThumbnailNewWidth = ThumbnailWidth;
            decimal lengthTemp = ImageBMP.Height * lengthRatio;
            ThumbnailNewHeight = (int)lengthTemp;
        }
        else
        {
            ThumbnailRatioWidth = (decimal)ThumbnailWidth / ImageBMP.Width;
            ThumbnailRatioHeight = (decimal)ThumbnailHeight / ImageBMP.Height;
            lengthRatio = Math.Min(ThumbnailRatioWidth, ThumbnailRatioHeight);
            ThumbnailNewHeight = ThumbnailHeight;
            decimal lengthTemp = ImageBMP.Width * lengthRatio;
            ThumbnailNewWidth = (int)lengthTemp;
        }
        Thumbnail = new Bitmap(ThumbnailNewWidth, ThumbnailNewHeight);
        Graphics g = Graphics.FromImage(Thumbnail);
        g.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic;
        g.FillRectangle(Brushes.White, 0, 0, ThumbnailNewWidth, ThumbnailNewHeight);
        g.DrawImage(ImageBMP, 0, 0, ThumbnailNewWidth, ThumbnailNewHeight);
        ImageBMP.Dispose();
    }
    catch
    {
        return null;
    }
    return Thumbnail;
}
In the code above, the images are uploaded starting at the line commented //Upload file. I based the gallery on this example:
http://www.bugdebugzone.com/2013/10/create-dynamic-image-gallery-slideshow.html

You can check the ContentLength property of uploadedFile, like this:
if (uploadedFile.ContentLength > 1000000)
{
    // skip files larger than about 1 MB
    continue;
}
ContentLength is the size in bytes of the uploaded file.
https://msdn.microsoft.com/en-us/library/system.web.httppostedfile.contentlength(v=vs.110).aspx
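ContentLength limits the upload in bytes, not the pixel dimensions shown in the slider. For the latter, here is a minimal sketch of an aspect-preserving size calculation that could feed a method such as the question's CreateThumbnail. The names ThumbnailMath and FitWithin are hypothetical:

```csharp
using System;

public static class ThumbnailMath
{
    // Returns the largest width/height that fits inside maxWidth x maxHeight
    // while keeping the source aspect ratio. Never scales up.
    public static (int Width, int Height) FitWithin(int srcWidth, int srcHeight, int maxWidth, int maxHeight)
    {
        if (srcWidth <= 0 || srcHeight <= 0 || maxWidth <= 0 || maxHeight <= 0)
        {
            throw new ArgumentException("All dimensions must be positive.");
        }

        // The smaller of the two ratios is the one that keeps both sides inside the box.
        double ratio = Math.Min((double)maxWidth / srcWidth, (double)maxHeight / srcHeight);
        if (ratio >= 1.0)
        {
            return (srcWidth, srcHeight);
        }

        int width = Math.Max(1, (int)Math.Round(srcWidth * ratio));
        int height = Math.Max(1, (int)Math.Round(srcHeight * ratio));
        return (width, height);
    }
}
```

For example, ThumbnailMath.FitWithin(400, 200, 200, 200) yields a 200 by 100 target, which can then be passed to Graphics.DrawImage as in the question.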

Printing an array of images

I inherited an app that needs to print an array of images which are stored in a database. I have written the code to retrieve the images via a web service, and have put the images into an array of BitmapImage.
However when I add the BitmapImage to the source of the Image, the image is then empty. No exceptions are thrown, but it appears that no images are added to the list, and the code appears to simply halt as soon as I add the image to the array.
Any hints would be appreciated.
In header of class:
// BitmapImage is System.Windows.Media.Imaging.BitmapImage
public List<BitmapImage> prtImages;
Event handler for web service on completion of getting images asynchronously:
void client_GetImagesCompleted(object sender, OrderImageWebService.GetOrderImagesCompletedEventArgs e)
{
    if (e.Error != null)
    {
        System.Windows.Browser.HtmlPage.Window.Eval("alert('Error occurred loading images.');");
        throw new Exception(string.Format("Error loading image edits: {0}", e.Error));
    }
    else if ((!e.Cancelled) && (e.Result != null))
    {
        try
        {
            for (int i = 0; i <= imgs; i++)
            {
                byte[] bImage = loi[i].Content;
                using (System.IO.MemoryStream sioms =
                    new System.IO.MemoryStream(bImage, 0, bImage.Length))
                {
                    bi.SetSource(sioms);
                    int ph = bi.PixelHeight; //Alert shows 3296
                    int pw = bi.PixelWidth; //Alert shows 2560
                    System.Windows.Browser.HtmlPage.Window.Eval(
                        string.Format("alert('BI image is {0} by {1}.');",
                        ph.ToString(), pw.ToString()));
                    prtImages.Add(bi);
                    //This alert never shows
                    System.Windows.Browser.HtmlPage.Window.Eval(
                        string.Format("alert('prtImages has size {1} after add.');",
                        prtImages.Count.ToString()));
                }
            }
        }
        catch(Exception ex)
        {
            //This exception is never caught
            System.Windows.Browser.HtmlPage.Window.Eval(string.Format("alert('{0}');", ex.Message));
        }
    }
    else
    {
        System.Windows.Browser.HtmlPage.Window.Eval("alert('Oops. Error.');");
    }
    System.Windows.Browser.HtmlPage.Window.Eval("alert('Clearing wait.');");
    bPrtImageLoaded = true;
    ewh_gotImages.Set();
}
In GetPrintImage:
public Image GetPrintImage(int iPageID)
{
    if (bPrtImageLoaded)
    {
        Image img = new Image();
        img.Source = prtImages[iPageID];
        double ah = img.ActualHeight;
        double aw = img.ActualWidth;
        System.Windows.Browser.HtmlPage.Window.Eval(
            string.Format("alert('Image is {0} by {1}.');",
            ah.ToString(), aw.ToString()));
        if (null != img)
        {
            return (img);
        }
    }
    return(null);
}
Called from the button click:
private void p_printMultiplePages(object sender, System.Windows.Printing.PrintPageEventArgs e)
{
    Image img = GetPrintImage(iCurPrintPage);
    if (null != img)
    {
        if (e.PrintableArea.Height < img.ActualHeight)
        {
            scale = e.PrintableArea.Height / img.ActualHeight;
        }
        if ((e.PrintableArea.Width < img.ActualWidth) && ((e.PrintableArea.Width / img.ActualWidth) < scale))
        {
            scale = e.PrintableArea.Width / img.ActualWidth;
        }
        if (scale < 1)
        {
            scaleTransform.ScaleX = scale;
            scaleTransform.ScaleY = scale;
            img.RenderTransform = scaleTransform;
        }
        e.PageVisual = img;
        iCurPrintPage++;
        if (iCurPrintPage < SessionState.PageCount)
        {
            e.HasMorePages = true;
        }
        else
        {
            e.HasMorePages = false;
        }
    }
    else
    {
        System.Windows.Browser.HtmlPage.Window.Eval(
            string.Format("alert('Image {0} of order {1} is null.');",
            iCurPrintPage.ToString(), SessionState.OrderID.ToString()));
        e.HasMorePages = false;
    }
}
Thanks for any hints you might have,
Bruce.

I would suggest getting the images from the web service as binary data (byte[]), encoding each image as base64, and publishing an array of base64 strings. On the client, decode each element of the array from its base64 string back to byte[], read it with a MemoryStream, and pass that stream to your image reader (see "reading image from bytes").
But before you do any of that, make sure that your p_printMultiplePages method can display local images.
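The encode and decode steps suggested above can be sketched as follows. This is only an illustration under assumptions: the class name ImageTransport and its methods are hypothetical, and the web service plumbing is left out.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper illustrating the base64 round trip.
public static class ImageTransport
{
    // Server side: encode each image's raw bytes as a base64 string.
    public static string[] Encode(byte[][] images)
    {
        return images.Select(bytes => Convert.ToBase64String(bytes)).ToArray();
    }

    // Client side: decode one string back to bytes and wrap them in a stream
    // that an image reader (for example BitmapImage.SetSource) can consume.
    public static MemoryStream Decode(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return new MemoryStream(bytes);
    }
}
```

On the client, each decoded stream would be handed to its own image object, one stream per image.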

Is it possible to use a Bitmap generated by code in an asp image tag?

I have written some code to create dynamic banners. It returns a bitmap variable. Is there some way that I can use this variable as the ImageUrl for an <asp:Image /> ?
Here is the code that creates the image:
public class SideImage
{
    protected const int ImgCt = 4;
    protected const int ImgW = 130;
    protected const int ImgH = 150;

    public Bitmap GenerateImage()
    {
        string serializedImage = CreateImage("side");
        if(!string.IsNullOrEmpty(serializedImage))
        {
            using(MemoryStream ms = new MemoryStream(Convert.FromBase64String(serializedImage)))
            {
                Bitmap bitmap = new Bitmap(ms);
                return bitmap;
            }
        }
        return null;
    }

    protected string CreateImage(string path)
    {
        try
        {
            using (Bitmap canvas = new Bitmap(ImgW, (ImgCt * ImgH)))
            {
                using (Graphics canvasGraphic = Graphics.FromImage(canvas))
                {
                    List<FileInfo> fileList = new List<FileInfo>();
                    DirectoryInfo directoryInfo = new DirectoryInfo(HttpContext.Current.Server.MapPath(path + "/"));
                    fileList.AddRange(directoryInfo.GetFiles("*.jpg"));
                    Randomizer<FileInfo> randomizer = new Randomizer<FileInfo>();
                    fileList.Sort(randomizer);
                    int yOffset = 0;
                    for (int i = 0; i <= fileList.Count - 1; i++)
                    {
                        using (Image image = Image.FromFile(fileList[i].FullName))
                        {
                            Rectangle rectangle = new Rectangle(0, yOffset, ImgW, ImgH);
                            canvasGraphic.DrawImage(image, rectangle);
                        }
                        yOffset += ImgH;
                    }
                    ImageCodecInfo[] imageCodecInfo = ImageCodecInfo.GetImageEncoders();
                    using (EncoderParameters encoderParameters = new EncoderParameters(2))
                    {
                        using (MemoryStream memoryStream = new MemoryStream())
                        {
                            encoderParameters.Param[0] = new EncoderParameter(Encoder.Quality, 100L);
                            encoderParameters.Param[1] = new EncoderParameter(Encoder.ColorDepth, 16L);
                            canvas.Save(memoryStream, imageCodecInfo[1], encoderParameters);
                            // ToArray returns only the bytes written; GetBuffer would include unused capacity.
                            return Convert.ToBase64String(memoryStream.ToArray());
                        }
                    }
                }
            }
        }
        catch (Exception ex)
        {
            return null;
        }
    }
}

public class Randomizer<T> : IComparer<T>
{
    protected Random Salter;
    protected int Salt;
    protected SHA1 Sha1;

    public Randomizer()
    {
        Salter = new Random();
        Salt = Salter.Next();
        Sha1 = new SHA1Managed();
    }

    public Randomizer(int seed)
    {
        Salter = new Random(seed);
        Salt = Salter.Next();
        Sha1 = new SHA1Managed();
    }

    private int HashAndSalt(int value)
    {
        byte[] b = Sha1.ComputeHash(BitConverter.GetBytes(value));
        int r = 0;
        for (int i = 0; i < (b.Length - 1); i += 4)
        {
            r = r ^ BitConverter.ToInt32(b, i);
        }
        return r ^ Salt;
    }

    public int Compare(T x, T y)
    {
        // Compare the salted hashes of both items.
        return HashAndSalt(x.GetHashCode()).CompareTo(HashAndSalt(y.GetHashCode()));
    }
}

See the answers to this question for a couple of different approaches to this sort of thing.
(Obviously I prefer mine ;) ).

This sounds similar to how chart generators work. There are two approaches to this that I've seen. One is to create a file on the server and then point to that. The other is to store the bitmap in memory and then call an aspx page in place of the image. The ASP page would read the bitmap from memory and return to the browser.

I'd create an HTTP module to do this.
You could set up the HTTP module to intercept all requests to a particular folder (for example '/images/generated/').
Then, when you get a request for an image in this folder, the code in your HTTP module will be called. Create your image in memory and write it out to the Response object (having set any appropriate MIME type headers and things first).
In your HTML you can then write an image tag like <img src="/images/generated/image-doesnt-physically-exist.jpg" /> and still get an image back.
Hope that helps point you in the right direction!
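A minimal sketch of such a module, under stated assumptions: the class name GeneratedImageModule and the placeholder bitmap are hypothetical, the folder is the one from the example above, and the module still has to be registered in web.config. Depending on the IIS configuration, requests for .jpg paths may also need to be routed to managed modules.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Web;

// Hypothetical module; it only takes effect once registered in web.config.
public class GeneratedImageModule : IHttpModule
{
    // Folder taken from the example path in the answer above.
    private const string Prefix = "/images/generated/";

    public static bool IsGeneratedImagePath(string path)
    {
        return path != null && path.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase);
    }

    public void Init(HttpApplication application)
    {
        application.BeginRequest += OnBeginRequest;
    }

    public void Dispose() { }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        var application = (HttpApplication)sender;
        HttpContext context = application.Context;
        if (!IsGeneratedImagePath(context.Request.Path))
        {
            return; // not ours; let the normal pipeline handle the request
        }

        using (Bitmap bitmap = CreatePlaceholder())
        using (var buffer = new MemoryStream())
        {
            // Encode to memory first, then copy the bytes to the response.
            bitmap.Save(buffer, ImageFormat.Jpeg);
            context.Response.ContentType = "image/jpeg";
            buffer.WriteTo(context.Response.OutputStream);
        }

        // Skip the rest of the pipeline; no physical file is looked up.
        application.CompleteRequest();
    }

    // Stand-in for the real banner generation (for example SideImage.GenerateImage).
    private static Bitmap CreatePlaceholder()
    {
        var bitmap = new Bitmap(130, 600);
        using (Graphics g = Graphics.FromImage(bitmap))
        {
            g.Clear(Color.SteelBlue);
        }
        return bitmap;
    }
}
```

With the module in place, an ordinary img tag pointing into that folder would receive the generated JPEG even though no file exists on disk.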
