Convert Base64 from PDF to Bitmap [duplicate] - c#

Is there any way I can convert an HTML document (a file, not a URL) to an image, or a PDF to an image?
I am able to do this using the Ghostscript DLL. Is there any other way to do it without using the Ghostscript DLL?
I am developing a C# Windows application.

The best free NuGet package I have found for saving every page of a PDF to PNG at a custom resolution is Docnet.Core. It can be used in .NET Core projects.
They have a GitHub repo with nice examples, but here I want to add my code for reading a PDF with more than one page:
string webRootPath = _hostingEnvironment.WebRootPath;
string fullPath = webRootPath + "/uploads/user-manual/file.pdf";
string fullPaths = webRootPath + "/uploads/user-manual";
using (var library = DocLib.Instance)
{
    using (var docReader = library.GetDocReader(fullPath, 1080, 1920))
    {
        // Page indexes are zero-based, so start at 0 to include the first page.
        for (int i = 0; i < docReader.GetPageCount(); i++)
        {
            using (var pageReader = docReader.GetPageReader(i))
            {
                // EmailTemplates.GetModifiedImage is a helper from my project (not shown here)
                // that returns the page as image bytes.
                var bytes = EmailTemplates.GetModifiedImage(pageReader);
                System.IO.File.WriteAllBytes(fullPaths + "/page_image_" + i + ".png", bytes);
            }
        }
    }
}
You can find other functions in their GitHub repo.

Use LibPdf for PDF to image conversion.
The LibPdf library converts a PDF file to an image. Supported image formats are PNG and BMP, but you can easily add more.
Usage example:
using (FileStream file = File.OpenRead(@"..\path\to\pdf\file.pdf")) // in file
{
    var bytes = new byte[file.Length];
    file.Read(bytes, 0, bytes.Length);
    using (var pdf = new LibPdf(bytes))
    {
        byte[] pngBytes = pdf.GetImage(0, ImageType.PNG); // image type
        using (var outFile = File.Create(@"..\path\to\pdf\file.png")) // out file
        {
            outFile.Write(pngBytes, 0, pngBytes.Length);
        }
    }
}
You should also look at ImageMagick, a freely available and powerful tool. It is capable of doing what you want and also provides .NET bindings (as well as bindings for several other languages).
In its simplest form, it is just a matter of running a command:
convert file.pdf imagefile.png

Try Freeware.Pdf2Png, a PDF to PNG converter (URL below):
byte[] png = Freeware.Pdf2Png.Convert(pdf, 1);
https://www.nuget.org/packages/Freeware.Pdf2Png/1.0.1?_src=template
The package's about info says it is under the MIT license (I checked on March 22, 2022).
But as Mitya said, please double-check.

You can use any one of the libraries below for PDF to image conversion.
Use Aspose.PDF; link below:
http://www.aspose.com/docs/display/pdfnet/Convert+all+PDF+pages+to+JPEG+Images
Code sample:
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(MyPdfPath);
using (FileStream imageStream = new FileStream("MyOutputImage.png", FileMode.Create))
{
    Resolution resolution = new Resolution(300);
    PngDevice pngDevice = new PngDevice(resolution);
    // Render the chosen page into the output stream.
    pngDevice.Process(pdfDocument.Pages[PageNo], imageStream);
    imageStream.Close();
}
Use Bytescout PDF Renderer; link below:
http://bytescout.com/products/developer/pdfrenderersdk/convert-pdf-to-png-basic-examples
Code sample:
RasterRenderer renderer = new RasterRenderer();
renderer.RegistrationName = "demo";
renderer.RegistrationKey = "demo";
// Load PDF document.
renderer.LoadDocumentFromFile(FilePath);
for (int i = 0; i < renderer.GetPageCount(); i++)
{
    // Render each page to its own stream so the pages do not pile up in one buffer.
    using (MemoryStream imageStream = new MemoryStream())
    {
        renderer.RenderPageToStream(i, RasterOutputFormat.PNG, imageStream);
        imageStream.Position = 0; // rewind before decoding
        using (Image im = Image.FromStream(imageStream))
        {
            im.Save("MyOutputImage_" + i + ".png");
        }
    }
}

Using Docnet, and based on the example on GitHub, I did the following, which is simple and functional.
The PDF used in this example is lorem-ipsum.pdf.
//...
using Docnet.Core;
using System.IO;
using Docnet.Core.Models;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
//paths
string pathPdf = @"C:\pathToPdfFile\lorem-ipsum.pdf";
string finalPathWithFileName = @"C:\pathToFinalImageFile\finalFile.png";
//using docnet
using (var docReader = DocLib.Instance.GetDocReader(pathPdf, new PageDimensions(1080, 1920)))
{
    //open pdf file
    using (var pageReader = docReader.GetPageReader(0))
    {
        var rawBytes = pageReader.GetImage();
        var width = pageReader.GetPageWidth();
        var height = pageReader.GetPageHeight();
        var characters = pageReader.GetCharacters();
        //using bitmap to create a png image
        using (var bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb))
        {
            AddBytes(bmp, rawBytes);
            using (var stream = new MemoryStream())
            {
                //saving and exporting
                bmp.Save(stream, ImageFormat.Png);
                File.WriteAllBytes(finalPathWithFileName, stream.ToArray());
            }
        }
    }
}
//extra methods
private static void AddBytes(Bitmap bmp, byte[] rawBytes)
{
    var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
    var bmpData = bmp.LockBits(rect, ImageLockMode.WriteOnly, bmp.PixelFormat);
    var pNative = bmpData.Scan0;
    Marshal.Copy(rawBytes, 0, pNative, rawBytes.Length);
    bmp.UnlockBits(bmpData);
}

The Spire.PDF library can be used for PDF to image conversion, such as PDF to PNG, JPG, EMF, and TIFF.
The following code example shows how to convert a PDF to PNG:
//Load a PDF
PdfDocument doc = new PdfDocument();
doc.LoadFromFile("PdfFilePath");
//Save to PNG images
for (int i = 0; i < doc.Pages.Count; i++)
{
    String fileName = String.Format("ToImage-img-{0}.png", i);
    using (Image image = doc.SaveAsImage(i, 300, 300))
    {
        image.Save(fileName, System.Drawing.Imaging.ImageFormat.Png);
    }
}
doc.Close();
More conversion examples can be found in the library's documentation. It also provides a free community edition but with some limitations.

While using Ghostscript with ImageMagick is a potential option, it is very slow: in my experience every page takes around 5 seconds or more. Docnet is a much better option for converting PDFs to images. The following code converts all pages in a PDF file into images, and does so quickly.
public void SavePDFtoJPGDocnet(string fileName)
{
    string FilePath = @"C:\SampleFileFolder\doc.pdf";
    string DestinationFolder = @"C:\SampleFileFolder\";
    IDocLib DocNet = DocLib.Instance;
    //you are specifying the max resolution of image on any side, actual resolution will be limited by longer side,
    //preserving the aspect ratio
    using (var docReader = DocNet.GetDocReader(
        FilePath,
        new PageDimensions(1440, 2560)))
    {
        for (int i = 0; i < docReader.GetPageCount(); i++)
        {
            using (var pageReader = docReader.GetPageReader(i))
            {
                var rawBytes = pageReader.GetImage();
                var width = pageReader.GetPageWidth();
                var height = pageReader.GetPageHeight();
                var characters = pageReader.GetCharacters();
                // Dispose the bitmap and stream for each page so memory does not grow with page count.
                using (var bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb))
                using (var stream = new MemoryStream())
                {
                    DocnetClass.AddBytes(bmp, rawBytes);
                    //DocnetClass.DrawRectangles(bmp, characters);
                    bmp.Save(stream, ImageFormat.Png);
                    File.WriteAllBytes(Path.Combine(DestinationFolder, "page_image_" + i + ".png"), stream.ToArray());
                }
            }
        }
    }
}

Freeware.Pdf2Png worked great for my needs.
It does not only convert to PNG; you can save to the image format of your choice.
In MS Visual Studio, run this in your Package Manager Console:
PM> NuGet\Install-Package Freeware.Pdf2Png -Version 1.0.1
Or just add it via the NuGet Package Manager GUI: search for Freeware.Pdf2Png and it should come up.
Once the reference is added to your project, code similar to this should do what you need to convert a PDF to an image.
using (FileStream fs = new FileStream(FullFilePath, FileMode.Open))
{
    byte[] buff = Freeware.Pdf2Png.Convert(fs, 1);
    // Keep the stream open while the image is in use, then dispose both.
    using (MemoryStream ms = new MemoryStream(buff))
    using (Image img = Image.FromStream(ms))
    {
        img.Save(TiffFilePath, System.Drawing.Imaging.ImageFormat.Tiff);
    }
}
FullFilePath - a string that is the Full File Path to the PDF to be converted.
TiffFilePath - a string that is the Full File Path of the newly created Image that you would like to save.
Unfortunately, I was not able to find any C# code or a proper algorithm to do this conversion without a third-party DLL. If any of you have good information on that, please do share it!

In case someone wants to use Ghostscript.NET:
Ghostscript.NET (written in C#) is the most complete managed wrapper library around the Ghostscript library (32-bit and 64-bit), an interpreter for the PostScript language and PDF.
It depends on an executable that you have to install on your machine. Here is a link where you can see and download the latest version of the exe.
https://www.ghostscript.com/download/gsdnld.html
P.S. I had some trouble with version 9.50, the latest at the time, which was not able to count the pages.
I prefer using version 9.26.
https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs926/gs926aw32.exe
https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs926/gs926aw64.exe
The next step is to find and install Ghostscript.NET from NuGet.
I download the PDF from a CDN URL and use a MemoryStream to open and process the PDF file. Here is some sample code:
using (WebClient myWebClient = new WebClient())
{
    using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer())
    {
        /* custom switches can be added before the file is opened
        rasterizer.CustomSwitches.Add("-dPrinted");
        */
        byte[] buffer = myWebClient.DownloadData(pdfUrl);
        using (var ms = new MemoryStream(buffer))
        {
            rasterizer.Open(ms);
            var image = rasterizer.GetPage(0, 0, 1);
            var imageURL = "MyCDNpath/Images/" + filename + ".png";
            _ = UploadFileToS3(image, imageURL);
        }
    }
}
You can also use it with a temporary FileStream. Here is another example. Note that the file is temporary and is created with the DeleteOnClose option.
using (WebClient myWebClient = new WebClient())
{
    using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer())
    {
        /* custom switches can be added before the file is opened
        rasterizer.CustomSwitches.Add("-dPrinted");
        */
        byte[] buffer = myWebClient.DownloadData(pdfUrl);
        int bufferSize = 4096;
        using (var fileStream = System.IO.File.Create("TempPDFolder/" + pdfName, bufferSize, System.IO.FileOptions.DeleteOnClose))
        {
            // now use that fileStream to save the pdf stream
            fileStream.Write(buffer, 0, buffer.Length);
            rasterizer.Open(fileStream);
            var image = rasterizer.GetPage(0, 0, 1);
            var imageURL = "MyCDNpath/Images/" + filename + ".png";
            _ = UploadFileToS3(image, imageURL);
        }
    }
}
I hope this helps someone struggling to get high-quality images from a PDF for free.

Related

Convert image byte array to PDF

I'm looking to convert an Image (PNG, JPEG, GIF), in the form of a byte[] to a PDF.
I'm currently using this function, which works, but it cuts off the bottom of images that are over a certain height or have specific proportions, for example 500x2000.
Where am I going wrong here?
public byte[] ConvertImageToPDF(byte[] bytes)
{
    byte[] pdfArray;
    using (var memoryStream = new MemoryStream())
    {
        using (var pdfWriter = new PdfWriter(memoryStream))
        {
            var pdf = new PdfDocument(pdfWriter);
            var document = new Document(pdf);
            ImageData imageData = ImageDataFactory.Create(bytes);
            document.Add(new Image(imageData));
            document.Close();
        }
        pdfArray = memoryStream.ToArray();
    }
    return pdfArray;
}
I suppose what you want is for the image to be auto-scaled inside the Document.
Optionally, you can also position the image in the center of the page.
You can change your code by setting [Image].SetAutoScale(true) and [Image].SetHorizontalAlignment(HorizontalAlignment.CENTER):
Note: I've defined aliases for iText.Layout.Properties (alias: PdfProperties) and iText.Layout.Element.Image (alias: PdfImage) to avoid conflicts with other .NET assemblies that have classes and enumerators with exactly the same names. Just remove them if you don't need them.
using System.IO;
using iText.IO.Image;
using iText.Kernel.Pdf;
using iText.Layout;
using PdfProperties = iText.Layout.Properties;
using PdfImage = iText.Layout.Element.Image;
public byte[] ConvertImageToPDF(byte[] imageBytes)
{
    using (var ms = new MemoryStream()) {
        using (var pdfWriter = new PdfWriter(ms)) {
            var pdf = new PdfDocument(pdfWriter);
            var document = new Document(pdf);
            // Auto-scale the image to the available area and center it horizontally.
            var img = new PdfImage(ImageDataFactory.Create(imageBytes))
                .SetAutoScale(true)
                .SetHorizontalAlignment(PdfProperties.HorizontalAlignment.CENTER);
            document.Add(img);
            document.Close();
            pdf.Close();
            return ms.ToArray();
        }
    }
}
You can also specify the size of the image, as floating-point values, and use the [Image].ScaleToFit() method to scale the image within those bounds.
Here, the bounds come from a page size of PageSize.A4. You can of course use different measures.
using iText.Kernel.Geom;
// [...]
var document = new Document(pdf);
var page = document.GetPageEffectiveArea(PageSize.A4);
var img = new PdfImage(ImageDataFactory.Create(imageBytes))
    .ScaleToFit(page.GetWidth(), page.GetHeight())
    .SetHorizontalAlignment(PdfProperties.HorizontalAlignment.CENTER);
// [...]
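To make the scaling idea concrete, here is a minimal sketch of the arithmetic I assume a "scale to fit" step performs: pick the smaller of the two width/height ratios and apply it to both sides. The ImageFit class and its ScaleToFit method are hypothetical names for illustration only; this is not iText's implementation.

```csharp
using System;

// Hypothetical helper, for illustration only (not part of iText).
public static class ImageFit
{
    // Returns the size of an image scaled uniformly so that it fits inside
    // maxWidth x maxHeight while keeping its aspect ratio.
    public static (float Width, float Height) ScaleToFit(
        float width, float height, float maxWidth, float maxHeight)
    {
        if (width <= 0 || height <= 0)
            throw new ArgumentOutOfRangeException(nameof(width), "Image dimensions must be positive.");

        // The smaller ratio is the one that keeps both sides inside the box.
        float scale = Math.Min(maxWidth / width, maxHeight / height);
        return (width * scale, height * scale);
    }
}
```

For the 500x2000 image from the question and a box of roughly A4 size in points (595x842), the height is the limiting side, so the whole image shrinks until its height fits instead of being cut off at the bottom.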

How to use Magick.net to convert HttpPostedFile .heic to .jpg

I have a .NET C# web application that allows users to upload images. I installed the Magick.NET NuGet package. I'm attempting to convert uploaded .heic files to .jpg, but I'm not sure where to start.
Here's the upload without Magick.net.
string fullPath = path + @"\Images\" + fileName;
Image image = Image.FromStream(postedFile.InputStream);
image.Save(path);
Here's what I'm trying, but I get the following error: Parameter is not valid
string fullPath = path + @"\Images\" + fileName;
Bitmap bitmap = new Bitmap(postedFile.InputStream); // <-- ERROR
using (MagickImage i = new MagickImage(bitmap))
{
    i.Format = MagickFormat.Jpg;
    using (MemoryStream memStream = new MemoryStream(i.ToByteArray()))
    {
        image = Image.FromStream(memStream);
    }
}
image.Save(path);
Here's another approach that's similar to the Magick.net documentation but I get the following error: Attempt by security transparent method to access security critical method failed. Assembly marked with the AllowPartiallyTrustedCallersAttribute and uses the level 2 security transparency model. Level 2 transparency causes all methods in AllowPartiallyTrustedCallers assemblies to become security transparent by default, which may be the cause of this exception.
string fullPath = path + @"\Images\" + fileName;
Stream fs = postedFile.InputStream;
BinaryReader br = new System.IO.BinaryReader(fs);
byte[] bytes = br.ReadBytes((Int32)fs.Length);
using (MemoryStream memStream = new MemoryStream())
{
    using (MagickImage i = new MagickImage(bytes)) // <-- ERROR
    {
        i.Format = MagickFormat.Jpg;
        image = Image.FromStream(memStream);
    }
}
image.Save(path);
Anyone have any suggestions?
You don't need to instantiate a bitmap for that, and the error is probably caused by your stream position (you need to set it to 0).
Here's what I did to make this work:
// Rewind first, to avoid the "Parameter is not valid" error in case the stream was read earlier.
postedFile.InputStream.Position = 0;
using (MagickImage i = new MagickImage(postedFile.InputStream))
{
    i.Format = MagickFormat.Jpg;
    using (MemoryStream memStream = new MemoryStream(i.ToByteArray()))
    {
        image = Image.FromStream(memStream);
    }
}
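As a small sketch of the stream-position point, the helper below rewinds a seekable stream before copying it into a byte array, so an earlier read cannot leave you with empty input. StreamHelpers and ReadAllBytes are hypothetical names, and the code uses only System.IO; the resulting array could then be handed to a constructor that accepts bytes, such as the MagickImage(bytes) call shown in the question.

```csharp
using System.IO;

// Hypothetical helper, for illustration only.
public static class StreamHelpers
{
    // Copies the whole stream into a byte array, rewinding first when possible.
    public static byte[] ReadAllBytes(Stream input)
    {
        if (input.CanSeek)
        {
            // Earlier reads may have left the cursor at the end of the stream.
            input.Position = 0;
        }

        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

Non-seekable streams are read from wherever they currently are, since they cannot be rewound.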

How to produce a System.Drawing.Image from SVG on .net core?

I have found many libraries that read SVG and transform it to System.Drawing.Image or PNG in .NET Framework, but I cannot find any way to do it in .NET Core.
And if I use Image.FromFile, I get an OutOfMemoryException (presumably because SVG is not a raster format).
Any tips on how to use Image to read SVG, or any open source library that works in .NET Core?
SkiaSharp, by the Xamarin team, seems to be a good choice. API documentation is already available on learn.microsoft.com. For more detailed information, see Mono/SkiaSharp and Mono/mono/SkiaSharp.Extended.
You can install the official SVG extension from NuGet with dotnet add package SkiaSharp.Svg:
<PackageReference Include="SkiaSharp.Svg" Version="1.60.0" />
Demo:
var svgSrc = Path.Combine(Directory.GetCurrentDirectory(), "img.svg");
string svgSaveAs = "xyz.png";
var quality = 100;
var svg = new SkiaSharp.Extended.Svg.SKSvg();
var pict = svg.Load(svgSrc);
var dimen = new SkiaSharp.SKSizeI(
    (int) Math.Ceiling(pict.CullRect.Width),
    (int) Math.Ceiling(pict.CullRect.Height)
);
var matrix = SKMatrix.MakeScale(1, 1);
var img = SKImage.FromPicture(pict, dimen, matrix);
// convert to PNG
var skdata = img.Encode(SkiaSharp.SKEncodedImageFormat.Png, quality);
using (var stream = File.OpenWrite(svgSaveAs)) {
    skdata.SaveTo(stream);
}
You can use ImageMagick to convert SVG to other formats.
<PackageReference Include="Magick.NET-Q16-AnyCPU" Version="7.14.0" />
The method below converts an SVG base64 string to another format (PNG here).
public static string Base64ToImageStream(string base64String)
{
    byte[] imageBytes = Convert.FromBase64String(base64String);
    MagickReadSettings readSettings = new MagickReadSettings()
    {
        Format = MagickFormat.Svg,
        Width = 60,
        Height = 40,
        BackgroundColor = MagickColors.Transparent
    };
    using (MagickImage image = new MagickImage(imageBytes, readSettings))
    {
        image.Format = MagickFormat.Png; // Specify the format you need
        byte[] data = image.ToByteArray();
        return Convert.ToBase64String(data);
        // In case you want the output as a stream instead:
        // var pngStream = new MemoryStream(data, 0, data.Length);
        // return pngStream;
    }
}

Determine whether an image is PNG despite .BMP extension

I want to check whether images in a directory are actually PNG files that carry a .bmp extension. The following gets the file's extension:
string x = Path.GetExtension(file);
From this we can establish that the extension is .bmp. The problem is checking whether the content is in PNG format. I am stuck on this part.
The reason I am doing this is that I want my images to be transparent, and .bmp images don't work so well with that.
Thank you!
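The answer that calls ToString() on the byte array is incorrect; the code should be:
var header = new byte[4];
using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
{
    fs.Read(header, 0, 4);
}
// The first byte (0x89) is not ASCII, so only the trailing "PNG" is compared.
var strHeader = Encoding.ASCII.GetString(header);
return strHeader.ToLower().EndsWith("png");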
We can check the file content (rather than the extension) with this:
Byte[] imageBase64 = ....
var encodedFile = Encoding.ASCII.GetString(imageBase64);
// The non-ASCII first byte (0x89) decodes to '?', hence the "?png" prefix.
return encodedFile.StartsWith("?png", StringComparison.InvariantCultureIgnoreCase);
Here is another take that I personally like, since you don't have to check against strings (it uses the System.Drawing library).
using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
{
    var fsImage = System.Drawing.Image.FromStream(fs);
    // Use Equals: ImageFormat is a class, so == would only compare references.
    if (fsImage.RawFormat.Equals(System.Drawing.Imaging.ImageFormat.Jpeg))
    {
        // Do something with Jpegs
    }
    else if (fsImage.RawFormat.Equals(System.Drawing.Imaging.ImageFormat.Png))
    {
        // Do something with Pngs
    }
}
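Read the first 4 bytes of the file:
byte[] b = new byte[4];
using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
{
    fs.Read(b, 0, 4);
}
// b.ToString() would only yield "System.Byte[]", so decode the bytes instead.
if (Encoding.ASCII.GetString(b).Contains("PNG"))
{
    // this is a png file
}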
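The header checks above can also be done without any string decoding by comparing against the full 8-byte PNG signature. This is a minimal sketch using only the base class library; PngSniffer and its method names are hypothetical, chosen for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class PngSniffer
{
    // The fixed 8-byte signature that starts every PNG file.
    private static readonly byte[] PngSignature =
        { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // True when the buffer begins with the PNG signature.
    public static bool HasPngSignature(byte[] data)
    {
        if (data == null || data.Length < PngSignature.Length) return false;
        for (int i = 0; i < PngSignature.Length; i++)
        {
            if (data[i] != PngSignature[i]) return false;
        }
        return true;
    }

    // Reads just the first 8 bytes of the file and checks them.
    public static bool IsPngFile(string path)
    {
        var header = new byte[PngSignature.Length];
        using (var fs = File.OpenRead(path))
        {
            int read = 0;
            while (read < header.Length)
            {
                int n = fs.Read(header, read, header.Length - read);
                if (n == 0) return false; // file is shorter than the signature
                read += n;
            }
        }
        return HasPngSignature(header);
    }

    // True for files named *.bmp whose content is really PNG.
    public static bool IsPngNamedBmp(string path)
    {
        return string.Equals(Path.GetExtension(path), ".bmp", StringComparison.OrdinalIgnoreCase)
            && IsPngFile(path);
    }
}
```

Calling PngSniffer.IsPngNamedBmp(file) for each file in the directory would answer the original question directly, and it never loads the whole image into memory.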

Getting Image Details For Adding To DOCX

I am adding images to DOCX files using the WordprocessingDocument method (Open XML) found here: http://msdn.microsoft.com/en-us/library/bb497430.aspx.
I can add images, but the sizing is not correct.
MainDocumentPart mainPart = doc.MainDocumentPart;
ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Jpeg);
using (System.IO.FileStream stream = new System.IO.FileStream(fileName, System.IO.FileMode.Open, System.IO.FileAccess.Read))
{
    imagePart.FeedData(stream);
}
return AddImageToBody(doc, mainPart.GetIdOfPart(imagePart), fileName);
private static Drawing AddImageToBody(WordprocessingDocument wordDoc, string relationshipId, string filename)
{
    long imageWidthEMU = 1900000;
    long imageHeightEMU = 350000;
    double imageWidthInInches = imageWidthEMU / 914400.0;
    double imageHeightInInches = imageHeightEMU / 914400.0;
    new DW.Extent();
    //Define the reference of the image.
    var element =
        new Drawing(
            new DW.Inline(
                new DW.Extent() { Cx = imageWidthEMU, Cy = imageHeightEMU },
As you can see, the sizes (width and height) are specified manually. I am unable to get them dynamically. How can I get the correct image size to pass to this code?
Thanks.
Your problem is solved here:
Inserting Image into DocX using OpenXML and setting the size
There is no way to solve it via an instruction in the XML file. OOXML offers only the <a:fill> and <a:tile> options.
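Since the size has to be computed in code, here is a minimal sketch of the conversion, assuming you can read the image's pixel dimensions and its resolution in DPI (for example from System.Drawing.Image's Width, Height, HorizontalResolution and VerticalResolution properties). It relies on the same 914400 EMU per inch factor used in the question; EmuConverter and PixelsToEmu are hypothetical names.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class EmuConverter
{
    // English Metric Units per inch, the same factor used in the question.
    public const long EmusPerInch = 914400;

    // Converts a pixel count to EMU for an image with the given resolution.
    public static long PixelsToEmu(int pixels, double dpi)
    {
        if (dpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(dpi), "Resolution must be positive.");

        // inches = pixels / dpi, and EMU = inches * 914400
        return (long)Math.Round(pixels * EmusPerInch / dpi);
    }
}
```

The two results would replace the hard-coded imageWidthEMU and imageHeightEMU values: call PixelsToEmu once with the width and horizontal DPI, and once with the height and vertical DPI.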
