Blurry image when converting DOC to PNG - c#

OK, I have a problem that just baffles me yet again. I have some code that turns a DOC file into a PNG file. When I run it on localhost, the image is fine. When I take the same code and put it on the live server, the image is extremely small (the same size as the DOT file that the DOC file came from; basically the DOT gets filled out and turned into a DOC). Now... here's the crazy part. If I log into the hosting server as an admin and THEN go to the live website, the image is large and crisp, even if I go to the site from an iPhone. As soon as I log out of the hosting server and refresh the live page, the image is tiny again. Here's the code I am using to convert DOC to PNG. On a side note, if I use method 2, I can make the image bigger and higher resolution, but the fonts are out of place.
private void ConvertDocToPNG(string startupPath, string filename1)
{
var docPath = Path.Combine(startupPath, filename1);
Application app = new Application();
Microsoft.Office.Interop.Word.Document doc = new Microsoft.Office.Interop.Word.Document();
app.Visible = false;
doc = app.Documents.Open(docPath);
app.WindowState = Microsoft.Office.Interop.Word.WdWindowState.wdWindowStateMaximize;
app.ActiveWindow.ActivePane.View.Zoom.Percentage = 100;
doc.ShowGrammaticalErrors = false;
doc.ShowRevisions = false;
doc.ShowSpellingErrors = false;
//Opens the word document and fetch each page and converts to image
foreach (Microsoft.Office.Interop.Word.Window window in doc.Windows)
{
foreach (Microsoft.Office.Interop.Word.Pane pane in window.Panes)
{
for (var i = 1; i <= pane.Pages.Count; i++)
{
Microsoft.Office.Interop.Word.Page page = null;
bool populated = false;
while (!populated)
{
try
{
// This !##$ variable won't always be ready to spill its pages. If you step through
// the code, it will always work. If you just execute it, it will crash. So what
// I am doing is letting the code catch up a little by letting the thread sleep
// for a millisecond. The second time around, this variable should populate ok.
page = pane.Pages[i];
populated = true;
}
catch (COMException ex)
{
Thread.Sleep(1);
}
}
var bits = page.EnhMetaFileBits;
var target = Path.Combine(startupPath + "\\", string.Format("{1}_page_{0}", i, filename1.Split('.')[0]));
try
{
using (var ms = new MemoryStream((byte[])(bits)))
{
var image = System.Drawing.Image.FromStream(ms);
var pngTarget = Path.ChangeExtension(target, "png");
// Method 2
image.Save(pngTarget, System.Drawing.Imaging.ImageFormat.Png);
// Another way to save it using custom size
//float width = Convert.ToInt32(hfIdCardMaxWidth.Value);
//float height = Convert.ToInt32(hfIdCardMaxHeight.Value);
//float scale = Math.Min(width / image.Width, height / image.Height);
//int resizedWidth = (int)Math.Round(image.Width * scale);
//int resizedHeight = (int)Math.Round(image.Height * scale);
//Bitmap myBitmap = new Bitmap(image, new Size(resizedWidth, resizedHeight));
//myBitmap.Save(pngTarget, System.Drawing.Imaging.ImageFormat.Png);
}
}
catch (System.Exception ex)
{
doc.Close(true, Type.Missing, Type.Missing);
Marshal.ReleaseComObject(doc);
doc = null;
app.Quit(true, Type.Missing, Type.Missing);
Marshal.ReleaseComObject(app);
app = null;
throw; // rethrow without resetting the stack trace
}
}
}
}
doc.Close(true, Type.Missing, Type.Missing);
Marshal.ReleaseComObject(doc);
doc = null;
app.Quit(true, Type.Missing, Type.Missing);
Marshal.ReleaseComObject(app);
app = null;
}

Given that you are using the interop in an unattended fashion, all sorts of weird/unexpected things can happen. I will admit, I don't know exactly why you are seeing these symptoms across your different environments, but I have a very strong feeling that running unattended is the culprit. The interop runs in the user's login context, and if there is no user... well... yeah. So, how do you get around this and still run unattended? The first thing that comes to mind is the OpenXML SDK. This is the safe way of manipulating Office documents in an unattended fashion. I use it for unattended report generation myself.
Assumptions:
Standard DOCX format
The doc contains words/pictures/styles/whatever. It's not just a sack of images (if it is, there are much easier ways to accomplish what you need)
The API:
http://www.microsoft.com/en-us/download/details.aspx?id=30425
But, you can't convert a doc to an image with OpenXML! I thought of a workaround, but this is NOT tested. The idea is to convert the doc to html, then render out the html and stuff it into an image.
Here is a way to convert your word doc to HTML using OpenXML:
The big set of power tools that can do all sorts of handy things:
http://powertools.codeplex.com/
The specific module that you will need: http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2014/01/30/transform-docx-to-html-css-with-high-fidelity-using-powertools-for-open-xml.aspx
Here is a handy library to render out the HTML and dump it into an image:
http://htmlrenderer.codeplex.com/

using (var ms = new MemoryStream((byte[])(bits)))
using (var emf = new Metafile(ms))
{
// Render the metafile at 400 DPI instead of the resolution it reports
var scale = 400 / emf.HorizontalResolution;
var width = emf.Width * scale;
var height = emf.Height * scale;
using (var b = new System.Drawing.Bitmap((Int32)width, (Int32)height))
using (var g = System.Drawing.Graphics.FromImage(b))
{
g.Clear(System.Drawing.Color.White);
g.DrawImage(emf, 0, 0, (float)width, (float)height);
b.Save(pngTarget, System.Drawing.Imaging.ImageFormat.Png);
}
}
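The snippet above sizes the bitmap by scaling the metafile's pixel dimensions by 400 / HorizontalResolution. A minimal sketch of just that arithmetic, pulled into a helper so it can be checked on its own. The class and method names are made up for illustration, and the rounding (instead of the truncating cast used above) is my own choice, not something from the answer:

```csharp
using System;

public static class EmfScaling
{
    // Returns the bitmap size in pixels needed to draw an image whose
    // dimensions were reported at sourceDpi so that it comes out at targetDpi.
    public static (int Width, int Height) ScaleToDpi(
        int width, int height, double sourceDpi, double targetDpi)
    {
        if (sourceDpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(sourceDpi));
        if (targetDpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(targetDpi));

        // Multiply before dividing and round the result, so an exact ratio
        // is not lost to floating point truncation.
        int scaledWidth = (int)Math.Round(width * targetDpi / sourceDpi);
        int scaledHeight = (int)Math.Round(height * targetDpi / sourceDpi);
        return (scaledWidth, scaledHeight);
    }
}
```

For example, EmfScaling.ScaleToDpi(816, 1056, 96, 400) asks for a 3400 x 4400 bitmap. Whether 400 DPI is the right target depends on how large the PNG needs to be; it is simply the value used in the snippet above.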

Related

C# interop word, cut and past works on Office 2016 but not on Office 2019

I found similar questions but not exactly the same.
I have a word template which I fill by texts entered by users.
On the user interface, there is a text field and two signature fields (a 3rd party component which produces an image file).
If the text is not very long, it works on both versions of Word. But if the text is long and contains line breaks, it doesn't work on Office 2019 and Office 365. On Office 2016, it always works.
To better explain,
I open the document :
Microsoft.Office.Interop.Word.Application app = null;
Microsoft.Office.Interop.Word.Document doc = null;
...
app = new Microsoft.Office.Interop.Word.Application();
doc = app.Documents.Open(tempPath);
app.Visible = false;
doc.Bookmarks["comment"].Select();
app.Selection.TypeText(orderComment); //Order comment is typed by the user
...
//this code saves the signature as a png image and it works in any case. The image exists in the folder before calling the rest of the code.
string clientSignaturePath = System.Configuration.ConfigurationManager.AppSettings["TempPath"] + Guid.NewGuid().ToString().Substring(0, 6) + ".png";
using (FileStream fs = new FileStream(clientSignaturePath, FileMode.Create))
{
using (BinaryWriter bw = new BinaryWriter(fs))
{
byte[] data = Convert.FromBase64String(model.ClientSignature);
bw.Write(data);
bw.Close();
}
fs.Close();
}
//If the orderComment is too long, it gives this error in this method when I call the line rng.Paste(); on Office 2019 and 365 but not on 2016.
error : this method or property is not available because the clipboard is empty or invalid
UserMethods.InsertImage(doc, clientSignaturePath, "client", 79, 175);
In the class UserMethods:
public static void InsertImage(Microsoft.Office.Interop.Word.Document doc, string imagePath, string type, float? imageHeight = null, float? imageWidth = null)
{
Range rng = null;
if (type == "tech")
rng = doc.Tables[7].Cell(1, 1).Range;
else if (type == "client")
rng = doc.Tables[7].Cell(1, 2).Range;
else
rng = doc.Tables[7].Cell(1, 3).Range;
Microsoft.Office.Interop.Word.InlineShape autoScaledInlineShape = rng.InlineShapes.AddPicture(imagePath);
float scaledWidth = imageWidth ?? autoScaledInlineShape.Width;
float scaledHeight = imageHeight ?? autoScaledInlineShape.Height;
autoScaledInlineShape.Delete();
// Create a new Shape and fill it with the picture
Microsoft.Office.Interop.Word.Shape newShape = doc.Shapes.AddShape(1, 0, 0, scaledWidth, scaledHeight);
newShape.Fill.UserPicture(imagePath);
// Convert the Shape to an InlineShape and optional disable Border
Microsoft.Office.Interop.Word.InlineShape finalInlineShape = newShape.ConvertToInlineShape();
//finalInlineShape.Line.Visible = Microsoft.Office.Core.MsoTriState.msoFalse;
// Cut the range of the InlineShape to clipboard
finalInlineShape.Range.Cut();
// And paste it to the target Range
rng.Paste();
}
My Version of Office which works in any case:
And the server's (Windows Server 2016) Office version, which doesn't work in the case of large text:
Thanks in advance.
The Cut method may cause security issues related to Clipboard access.
Try this instead:
rng.FormattedText = finalInlineShape.Range.FormattedText;
finalInlineShape.Delete();
and comment out the clipboard calls:
//finalInlineShape.Range.Cut();
//rng.Paste();

Unable to merge 2 PDFs using MemoryStream

I have a c# class that takes an HTML and converts it to PDF using wkhtmltopdf.
As you will see below, I am generating 3 PDFs - Landscape, Portrait, and combined of the two.
The properties object contains the html as a string, and the argument for landscape/portrait.
System.IO.MemoryStream PDF = new WkHtmlToPdfConverter().GetPdfStream(properties);
System.IO.FileStream file = new System.IO.FileStream("abc_landscape.pdf", System.IO.FileMode.Create);
PDF.Position = 0;
properties.IsHorizontalOrientation = false;
System.IO.MemoryStream PDF_portrait = new WkHtmlToPdfConverter().GetPdfStream(properties);
System.IO.FileStream file_portrait = new System.IO.FileStream("abc_portrait.pdf", System.IO.FileMode.Create);
PDF_portrait.Position = 0;
System.IO.MemoryStream finalStream = new System.IO.MemoryStream();
PDF.CopyTo(finalStream);
PDF_portrait.CopyTo(finalStream);
System.IO.FileStream file_combined = new System.IO.FileStream("abc_combined.pdf", System.IO.FileMode.Create);
try
{
PDF.WriteTo(file);
PDF.Flush();
PDF_portrait.WriteTo(file_portrait);
PDF_portrait.Flush();
finalStream.WriteTo(file_combined);
finalStream.Flush();
}
catch (Exception)
{
throw;
}
finally
{
PDF.Close();
file.Close();
PDF_portrait.Close();
file_portrait.Close();
finalStream.Close();
file_combined.Close();
}
The PDFs "abc_landscape.pdf" and "abc_portrait.pdf" generate correctly, as expected, but the operation fails when I try to combine the two in a third pdf (abc_combined.pdf).
I am using MemoryStream to perform the merge, and while debugging I can see that finalStream.Length is equal to the sum of the previous two PDFs. But when I try to open the PDF, I see the content of just one of the two PDFs.
The same can be seen below:
Additionally, when I try to close the "abc_combined.pdf", I am prompted to save it, which does not happen with the other 2 PDFs.
Below are a few things that I have tried out already, to no avail:
Change CopyTo() to WriteTo()
Merge the same PDF (either Landscape or Portrait one) with itself
In case it is required, below is the elaboration of the GetPdfStream() method.
var htmlStream = new MemoryStream();
var writer = new StreamWriter(htmlStream);
writer.Write(htmlString);
writer.Flush();
htmlStream.Position = 0;
return htmlStream;
Process process = Process.Start(psi);
process.EnableRaisingEvents = true;
try
{
process.Start();
process.BeginErrorReadLine();
var inputTask = Task.Run(() =>
{
htmlStream.CopyTo(process.StandardInput.BaseStream);
process.StandardInput.Close();
});
// Copy the output to a memorystream
MemoryStream pdf = new MemoryStream();
var outputTask = Task.Run(() =>
{
process.StandardOutput.BaseStream.CopyTo(pdf);
});
Task.WaitAll(inputTask, outputTask);
process.WaitForExit();
// Reset memorystream read position
pdf.Position = 0;
return pdf;
}
catch (Exception ex)
{
throw; // rethrow without resetting the stack trace
}
finally
{
process.Dispose();
}
Merging PDFs in C# or any other language is not straightforward without a third-party library.
I assume your reason for not using a library is that most free libraries and NuGet packages have limitations and/or cost money for commercial use.
I did some research and found an open source library called PdfClown, available as a NuGet package and also for Java. It is free without limitations (donate if you like). The library has a lot of features, one of which is merging two or more documents into one.
Here is my example, which takes a folder with multiple PDF files, merges them and saves the result to the same or another folder. It is also possible to use MemoryStream, but I did not find it necessary in this case.
The code is self-explanatory; the key point here is using SerializationModeEnum.Incremental:
public static void MergePdf(string srcPath, string destFile)
{
var list = Directory.GetFiles(Path.GetFullPath(srcPath));
if (string.IsNullOrWhiteSpace(srcPath) || string.IsNullOrWhiteSpace(destFile) || list.Length <= 1)
return;
var files = list.Select(File.ReadAllBytes).ToList();
using (var dest = new org.pdfclown.files.File(new org.pdfclown.bytes.Buffer(files[0])))
{
var document = dest.Document;
var builder = new org.pdfclown.tools.PageManager(document);
foreach (var file in files.Skip(1))
{
using (var src = new org.pdfclown.files.File(new org.pdfclown.bytes.Buffer(file)))
{ builder.Add(src.Document); }
}
dest.Save(destFile, SerializationModeEnum.Incremental);
}
}
To test it
var srcPath = @"C:\temp\pdf\input";
var destFile = @"c:\temp\pdf\output\merged.pdf";
MergePdf(srcPath, destFile);
Input examples
PDF doc A and PDF doc B
Output example
Links to my research:
https://csharp-source.net/open-source/pdf-libraries
https://sourceforge.net/projects/clown/
https://www.oipapio.com/question-3526089
Disclaimer: Part of this answer is taken from my personal web site https://itbackyard.com/merge-multiple-pdf-files-to-one-pdf-file-in-c/ with the source code on GitHub.
This answer from Stack Overflow (Combine two (or more) PDF's) by Andrew Burns works for me:
using (PdfDocument one = PdfReader.Open("pdf 1.pdf", PdfDocumentOpenMode.Import))
using (PdfDocument two = PdfReader.Open("pdf 2.pdf", PdfDocumentOpenMode.Import))
using (PdfDocument outPdf = new PdfDocument())
{
CopyPages(one, outPdf);
CopyPages(two, outPdf);
outPdf.Save("file1and2.pdf");
}
void CopyPages(PdfDocument from, PdfDocument to)
{
for (int i = 0; i < from.PageCount; i++)
{
to.AddPage(from.Pages[i]);
}
}
That's not quite how PDFs work. PDFs are structured files in a specific format.
You can't just append the bytes of one to the other and expect the result to be a valid document.
You're going to have to use a library that understands the format and can do the operation for you, or develop your own solution.
PDF files aren't just text and images. Behind the scenes there is a strict file format that describes things like PDF version, the objects contained in the file and where to find them.
In order to merge 2 PDFs you'll need to manipulate the streams.
First you'll need to keep the header from only one of the files. This is pretty easy since it's just the first line.
Then you can write the body of the first page, and then the second.
Now the hard part, and likely the part that will convince you to use a library, is that you have to rebuild the xref table. The xref table is a cross-reference table that describes the content of the document and, more importantly, where to find each element. You'd have to calculate the byte offset of the second page, shift all of the elements in its xref table by that much, and then add its xref table to the first. You'll also need to ensure you create objects in the xref table for the page break.
Once that's done, you need to re-build the document trailer which tells an application where the various sections of the document are among other things.
See https://resources.infosecinstitute.com/pdf-file-format-basic-structure/
This is not trivial and you'll end up re-writing lots of code that already exists.
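To make the point about structure concrete, here is a minimal sketch that appends one buffer to another byte for byte, the way the two CopyTo calls in the question do, and then counts the structural markers in the result. The class and method names are made up for illustration, and the sample bytes below are stand-ins rather than a valid PDF:

```csharp
using System;
using System.IO;
using System.Text;

public static class PdfConcatCheck
{
    // Counts non-overlapping occurrences of an ASCII marker in a byte buffer.
    public static int CountMarker(byte[] data, string marker)
    {
        byte[] needle = Encoding.ASCII.GetBytes(marker);
        int count = 0;
        for (int i = 0; i + needle.Length <= data.Length; i++)
        {
            bool match = true;
            for (int j = 0; j < needle.Length; j++)
            {
                if (data[i + j] != needle[j]) { match = false; break; }
            }
            if (match)
            {
                count++;
                i += needle.Length - 1; // skip past this occurrence
            }
        }
        return count;
    }

    // Appends b to a byte for byte, with no knowledge of the PDF format.
    public static byte[] Concatenate(byte[] a, byte[] b)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(a, 0, a.Length);
            ms.Write(b, 0, b.Length);
            return ms.ToArray();
        }
    }
}
```

Concatenating two buffers that each start with "%PDF-" and end with "%%EOF" gives a result with two headers and two end-of-file markers, and the byte offsets stored in the second file's xref table still assume that file starts at position 0. That mismatch is a likely reason a viewer shows only one document and offers to save a repaired copy on close.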

Which line of code is still using the file and not releasing it?

I am using this code to create an Excel file and populate it with data:
using (ExcelPackage package = new ExcelPackage(fileInfo))
{
ExcelWorksheet ws = package.Workbook.Worksheets.Add("Deltas");
ExcelWorksheet ws2 = package.Workbook.Worksheets.Add("Images");
ExcelWorksheet ws3 = package.Workbook.Worksheets.Add("Data Points");
GenerateDataSheet(ws, true);
GenerateDataSheet(ws3, false);
// populate second worksheet with images
var imagesLocations = SelectedSession.GetTests().Where(t => t.IsReference).Select(t => t.Location).OrderBy(t => t.DateCreated).ThenBy(t => t.Name).ToList();
ws2.Column(2).Width = 58;
for (int i = 0; i < imagesLocations.Count; i++)
{
ws2.Row(i + 1).Height = 305;
ws2.Cells[i + 1, 1].Value = imagesLocations[i].Name;
ws2.Cells[i + 1, 1].Style.VerticalAlignment = ExcelVerticalAlignment.Top;
ws2.Cells[i + 1, 1].Style.HorizontalAlignment = ExcelHorizontalAlignment.Right;
var imagePath = imagesLocations[i].Tests.FirstOrDefault(t => t.IsReference).ImagePath;
if (File.Exists(imagePath))
{
var ImageToPutInReport = Image.FromFile(imagePath);
var image = ws2.Drawings.AddPicture(imagesLocations[i].Name, ImageToPutInReport);
image.SetSize(375, 375);
image.SetPosition(i, 0, 1, 0);
}
}
package.SaveAs(fileInfo);
}
After it finishes, I call a function to delete the Images Folder. The "delete()" function pops an error:
image is still in use
When I comment the above code the error does not occur. Currently I am using this hack to fix my problem:
public static void DeleteSessionFolder(string session)
{
try
{
if (Directory.Exists(baseSessionPath + session))
Directory.Delete(baseSessionPath + session, true); // error pops here
} catch (Exception e)
{
DeleteSessionFolder(session); // call it again
}
}
So I am giving a chance to keep trying again and again. But this is taking like 15 seconds till "that thing" releases the images and the application is able to delete the folder, while the whole application is frozen. Which line of code is keeping a hold of the images(an image)?
Rather than using Image.FromFile to get the image you should use a stream to read in a copy of the image and then operate using this copy. The issue with Image.FromFile is that it opens your image file by reference so that no other operation can write to it until your application stops using it. It is basically a stream that never closes until it is completely out of scope or you manually .Dispose() of the image object.
So, change this line:
var ImageToPutInReport = Image.FromFile(imagePath);
To this line:
Image ImageToPutInReport;
using (FileStream stream = File.OpenRead(imagePath))
{
ImageToPutInReport = Image.FromStream(stream);
}
One caveat with this version: GDI+ expects the stream to stay open for the lifetime of the Image, so if you see odd errors later on, copy the image into a new Bitmap before the using block ends.
The documentation for Image.FromFile(string filename) says "The file remains locked until the Image is disposed", and you don't dispose the image anywhere.
That's what's causing your issue.
This line:
var ImageToPutInReport = Image.FromFile(imagePath);
Will keep your image opened and locked until you dispose it (which, since you are not doing explicitly, won't happen till the garbage collector decides to: those should be those 15 seconds you are observing when retrying)... so I'd change that block to:
using(var ImageToPutInReport = Image.FromFile(imagePath))
{
var image = ws2.Drawings.AddPicture(imagesLocations[i].Name, ImageToPutInReport);
image.SetSize(375, 375);
image.SetPosition(i, 0, 1, 0);
}
You'll need to make sure that the Image object is copied (and not linked) into the spreadsheet, otherwise, you'll need to dispose them later, but it should be the case, taking the GC behaviour you are seeing.
PS: As I mentioned above, and as @ScottChamberlain noted in the comments, the image may be added as a reference (and not as a copy), so you'd be disposing an image that is still referenced in the collection. If this is the case, we can unlock the file by creating a copy (this should free the file), and then dispose the copy later, after we're done with our package... something like:
var imageList = new List<Image>();
using (ExcelPackage package = new ExcelPackage(fileInfo))
{
/* ... */
if (File.Exists(imagePath))
{
Image ImageToPutInReport;
// Make a copy of the loaded image and dispose the original
// so the file is freed
using(var tempImage = Image.FromFile(imagePath))
ImageToPutInReport = new Bitmap(tempImage);
// Add to the list of images we'll dispose later
// after we're done
imageList.Add(ImageToPutInReport);
var image = ws2.Drawings.AddPicture(imagesLocations[i].Name, ImageToPutInReport);
image.SetSize(375, 375);
image.SetPosition(i, 0, 1, 0);
}
/* ... */
package.SaveAs(fileInfo);
}
foreach(var img in imageList)
img.Dispose();
imageList.Clear();
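Separately from the disposal fix, the recursive retry in the question never gives up and swallows every exception, so a file that stays locked would eventually overflow the stack. If a retry is still wanted as a safety net, a bounded version might look like the sketch below. The helper name and its parameters are made up for illustration, and it assumes the failure surfaces as an IOException:

```csharp
using System;
using System.IO;
using System.Threading;

public static class Retry
{
    // Runs action up to maxAttempts times, sleeping between failed attempts.
    // Returns the number of attempts used. If every attempt throws an
    // IOException, the last one propagates to the caller.
    public static int OnIOException(Action action, int maxAttempts, TimeSpan delay)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return attempt;
            }
            catch (IOException) when (attempt < maxAttempts)
            {
                // Still attempts left: wait a little and try again.
                Thread.Sleep(delay);
            }
        }
    }
}
```

It could be called as Retry.OnIOException(() => Directory.Delete(baseSessionPath + session, true), 5, TimeSpan.FromMilliseconds(200)). With the image disposed properly, the first attempt should normally succeed.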

XPS Print Quality C# vs. XPS viewer

I'm having a somewhat odd print quality problem in my C# application. I have an XPS file (it's basically just a 1 page image, that was originally a scanned black and white image) that I'm trying to print to an IBM InfoPrint Mainframe driver via a C# application. I've printed to numerous other print drivers and never had a problem, but this driver gives me terrible quality with the AFP file it creates. If I open the same file in the Microsoft XPS viewer application and print to the same driver, the quality looks fine.
Trying to work through the problem, I've tried 3 or 4 different approaches to printing in the C# app. The original code did something like this (trimmed for brevity):
System.Windows.Xps.XpsDocumentWriter writer = PrintQueue.CreateXpsDocumentWriter(mPrintQueue);
mCollator = writer.CreateVisualsCollator();
mCollator.BeginBatchWrite();
ContainerVisual v = getContainerVisual(xpsFilePath);
//tried all sorts of different options on the print ticket, no effect
mCollator.Write(v,mDefaultTicket);
That code (which I've truncated) certainly could have had some weird issues in it, so I tried something much simpler:
LocalPrintServer localPrintServer = new LocalPrintServer();
PrintQueue defaultPrintQueue = LocalPrintServer.GetDefaultPrintQueue();
PrintSystemJobInfo xpsPrintJob = defaultPrintQueue.AddJob("title", xpsDocPath, false);
Same results.
I even tried using the WPF print dialog, same poor quality (http://msdn.microsoft.com/en-us/library/ms742418.aspx).
One area I haven't tried yet is using the old-school underlying print APIs, but I'm not sure why that would behave differently. One other option I have: my original document is a PDF, and I have a good 3rd party library that can make me an EMF file instead. However, every time I try to stream that EMF file to my printer, I get garbled text.
Any ideas on why this quality is lost, how to fix it, or how to stream an EMF file to a print driver would be much appreciated!
UPDATE:
One other note. This nice sample app: http://wrb.home.xs4all.nl/Articles_2010/Article_XPSViewer_01.htm experiences the same quality loss. I've also now performed tests where I open the PDF directly and render the Bitmaps to a Print Document, same fuzziness of the resulting images. If I open the PDFs in Acrobat and print they look fine.
So to close this issue, it seems that the IBM Infoprint driver (at least the way it's being used here) has quite different quality depending on how you print in C#.
In this question I was using:
System.Windows.Documents.Serialization.Write(Visual, PrintTicket);
I completely changed my approach, removing XPS entirely, and obtained an EMF (Windows Enhanced Metafile) rendition of my document, then sent that EMF file to the Windows printer using the PrintDocument PrintPage event handler:
using (PrintDocument pd = new PrintDocument())
{
pd.DocumentName = this.mJobName;
pd.PrinterSettings.PrinterName = this.mPrinterName;
pd.PrintController = new StandardPrintController();
pd.PrintPage += new PrintPageEventHandler(DoPrintPage);
pd.Print();
}
(I've obviously omitted a lot of code here, but you can find examples of how to use this approach relatively easily)
In my testing, most print drivers were equally happy with either printing approach, but the IBM Infoprint driver was EXTREMELY sensitive to the quality. One possible explanation is that the Infoprint printer was required to be configured with a weird fixed DPI and it may be doing a relatively poor job converting.
EDIT: More detailed sample code was requested, so here ya go. Note that getting an EMF file is a pre-req for this approach. In this case I'm using ABC PDF, which lets you generate an EMF file from your PDF with a relatively simple call.
class AbcPrintEmf
{
private Doc mDoc;
private string mJobName;
private string mPrinterName;
private string mTempFilePath;
private bool mRenderTextAsPolygon;
public AbcPrintEmf(Doc printMe, string jobName, string printerName, bool debug, string tempFilePath, bool renderTextAsPolygon)
{
mDoc = printMe;
mDoc.PageNumber = 1;
mJobName = jobName;
mPrinterName = printerName;
mRenderTextAsPolygon = renderTextAsPolygon;
if (debug)
mTempFilePath = tempFilePath;
}
public void print()
{
using (PrintDocument pd = new PrintDocument())
{
pd.DocumentName = this.mJobName;
pd.PrinterSettings.PrinterName = this.mPrinterName;
pd.PrintController = new StandardPrintController();
pd.PrintPage += new PrintPageEventHandler(DoPrintPage);
pd.Print();
}
}
private void DoPrintPage(object sender, PrintPageEventArgs e)
{
using (Graphics g = e.Graphics)
{
if (mDoc.PageCount == 0) return;
if (mDoc.Page == 0) return;
XRect cropBox = mDoc.CropBox;
double srcWidth = (cropBox.Width / 72) * 100;
double srcHeight = (cropBox.Height / 72) * 100;
double pageWidth = e.PageBounds.Width;
double pageHeight = e.PageBounds.Height;
double marginX = e.PageSettings.HardMarginX;
double marginY = e.PageSettings.HardMarginY;
double dstWidth = pageWidth - (marginX * 2);
double dstHeight = pageHeight - (marginY * 2);
// if source bigger than destination then scale
if ((srcWidth > dstWidth) || (srcHeight > dstHeight))
{
double sx = dstWidth / srcWidth;
double sy = dstHeight / srcHeight;
double s = Math.Min(sx, sy);
srcWidth *= s;
srcHeight *= s;
}
// now center
double x = (pageWidth - srcWidth) / 2;
double y = (pageHeight - srcHeight) / 2;
// save state
RectangleF theRect = new RectangleF((float)x, (float)y, (float)srcWidth, (float)srcHeight);
int theRez = e.PageSettings.PrinterResolution.X;
// draw content
mDoc.Rect.SetRect(cropBox);
mDoc.Rendering.DotsPerInch = theRez;
mDoc.Rendering.ColorSpace = "RGB";
mDoc.Rendering.BitsPerChannel = 8;
if (mRenderTextAsPolygon)
{
//i.e. render text as polygon (non default)
mDoc.SetInfo(0, "RenderTextAsText", "0");
}
byte[] theData = mDoc.Rendering.GetData(".emf");
if (mTempFilePath != null)
{
File.WriteAllBytes(mTempFilePath + @"\" + mDoc.PageNumber + ".emf", theData);
}
using (MemoryStream theStream = new MemoryStream(theData))
{
using (Metafile theEMF = new Metafile(theStream))
{
g.DrawImage(theEMF, theRect);
}
}
e.HasMorePages = mDoc.PageNumber < mDoc.PageCount;
if (!e.HasMorePages) return;
//increment to next page, corrupted PDF's have occasionally failed to increment
//which would otherwise put us in a spooling infinite loop, which is bad, so this check avoids it
int oldPageNumber = mDoc.PageNumber;
++mDoc.PageNumber;
int newPageNumber = mDoc.PageNumber;
if ((oldPageNumber + 1) != newPageNumber)
{
throw new Exception("PDF cannot be printed as it is corrupt, pageNumbers will not increment properly.");
}
}
}
}
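The scaling and centering math in DoPrintPage is easier to check when pulled out of the print handler. A minimal sketch of the same arithmetic follows. The class and method names are made up for illustration, and every value is assumed to be in hundredths of an inch, the unit used for PageBounds and the hard margins:

```csharp
using System;

public static class PageFit
{
    // Converts PDF points (1/72 inch) to hundredths of an inch.
    public static double PointsToHundredthsOfInch(double points)
    {
        return points / 72 * 100;
    }

    // Shrinks the source to fit inside the printable area if it is too big
    // (it is never enlarged), then centers it on the page.
    public static (double X, double Y, double Width, double Height) FitAndCenter(
        double srcWidth, double srcHeight,
        double pageWidth, double pageHeight,
        double marginX, double marginY)
    {
        double dstWidth = pageWidth - marginX * 2;
        double dstHeight = pageHeight - marginY * 2;

        if (srcWidth > dstWidth || srcHeight > dstHeight)
        {
            // Use the smaller ratio so both dimensions fit.
            double s = Math.Min(dstWidth / srcWidth, dstHeight / srcHeight);
            srcWidth *= s;
            srcHeight *= s;
        }

        double x = (pageWidth - srcWidth) / 2;
        double y = (pageHeight - srcHeight) / 2;
        return (x, y, srcWidth, srcHeight);
    }
}
```

For a US Letter crop box (612 x 792 points, which is 850 x 1100 in hundredths of an inch) on an 850 x 1100 page with hard margins of 25, the width is the limiting dimension: the content is scaled to about 800 x 1035.3 and placed at roughly (25, 32.4).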

Programmatically (C#) convert Excel to an image

I want to convert an Excel file to an image (any format is OK) programmatically (C#). Currently I'm using Microsoft Interop Libraries & Office 2007, but it does not support saving to an image by default.
So my current work-around is as follows:
Open Excel file using Microsoft Interop;
Find out the max range (that contains data);
Use the CopyPicture() on that range, which will copy the data to the Clipboard.
Now the tricky part (and my problems):
Problem 1:
Using the .NET Clipboard class, I'm not able to get the EXACT copied data from the clipboard: the data is the same, but somehow the formatting is distorted (the font of the whole document seems to become bold and a little bit more unreadable while they were not); If I paste from the clipboard using mspaint.exe, the pasted image is correct (and just as I want it to be).
I disassembled mspaint.exe and found a function that it is using (OleGetClipboard) to get data from the clipboard, but I cannot seem to get it working in C# / .NET.
Other things I tried were the Clipboard WINAPI's (OpenClipboard, GetClipboardData, CF_ENHMETAFILE), but the results were the same as using the .NET versions.
Problem 2:
Using the range and CopyPicture, if there are any images in the excel sheet, those images are not copied along with the surrounding data to the clipboard.
Some of the source code
Excel.Application app = new Excel.Application();
app.Visible = app.ScreenUpdating = app.DisplayAlerts = false;
app.CopyObjectsWithCells = true;
app.CutCopyMode = Excel.XlCutCopyMode.xlCopy;
app.DisplayClipboardWindow = false;
try {
Excel.Workbooks workbooks = null;
Excel.Workbook book = null;
Excel.Sheets sheets = null;
try {
workbooks = app.Workbooks;
book = workbooks.Open(inputFile, false, false, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing,
Type.Missing, Type.Missing);
sheets = book.Worksheets;
} catch {
Cleanup(workbooks, book, sheets); //Cleanup function calls Marshal.ReleaseComObject for all passed objects
throw;
}
for (int i = 0; i < sheets.Count; i++) {
Excel.Worksheet sheet = (Excel.Worksheet)sheets.get_Item(i + 1);
Excel.Range myrange = sheet.UsedRange;
Excel.Range rowRange = myrange.Rows;
Excel.Range colRange = myrange.Columns;
int rows = rowRange.Count;
int cols = colRange.Count;
//Following is used to find range with data
string startRange = "A1";
string endRange = ExcelColumnFromNumber(cols) + rows.ToString();
//Skip "empty" excel sheets
if (startRange == endRange) {
Excel.Range firstRange = sheet.get_Range(startRange, endRange);
Excel.Range cellRange = firstRange.Cells;
object text = cellRange.Text;
string strText = text.ToString();
string trimmed = strText.Trim();
if (trimmed == "") {
Cleanup(trimmed, strText, text, cellRange, firstRange, myrange, rowRange, colRange, sheet);
continue;
}
Cleanup(trimmed, strText, text, cellRange, firstRange);
}
Excel.Range range = sheet.get_Range(startRange, endRange);
try {
range.CopyPicture(Excel.XlPictureAppearance.xlScreen, Excel.XlCopyPictureFormat.xlPicture);
//Problem here <-------------
//Every attempt to get data from Clipboard fails
} finally {
Cleanup(range);
Cleanup(myrange, rowRange, colRange, sheet);
}
} //end for loop
book.Close(false, Type.Missing, Type.Missing);
workbooks.Close();
Cleanup(book, sheets, workbooks);
} finally {
app.Quit();
Cleanup(app);
GC.Collect();
}
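The code above calls ExcelColumnFromNumber without showing it. A minimal sketch of what such a helper might look like, assuming it maps a 1-based column index to its letter name (1 to "A", 27 to "AA"). The containing class name is made up, and the question's real implementation may differ:

```csharp
using System;
using System.Text;

public static class ExcelColumns
{
    // Converts a 1-based column number to its Excel column letters.
    public static string ExcelColumnFromNumber(int column)
    {
        if (column < 1) throw new ArgumentOutOfRangeException(nameof(column));

        var name = new StringBuilder();
        while (column > 0)
        {
            // Column names have no zero digit, so shift to 0-based
            // before taking each base-26 digit.
            column--;
            name.Insert(0, (char)('A' + column % 26));
            column /= 26;
        }
        return name.ToString();
    }
}
```

With this version, a used range of 28 columns and 40 rows would give an endRange of "AB40".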
Getting data from the clipboard using WINAPI succeeds, but with bad quality. Source:
protected virtual void ClipboardToPNG(string filename) {
if (OpenClipboard(IntPtr.Zero)) {
if (IsClipboardFormatAvailable((int)CLIPFORMAT.CF_ENHMETAFILE)) {
int hEmfClp = GetClipboardDataA((int)CLIPFORMAT.CF_ENHMETAFILE);
if (hEmfClp != 0) {
int hEmfCopy = CopyEnhMetaFileA(hEmfClp, null);
if (hEmfCopy != 0) {
Metafile metafile = new Metafile(new IntPtr(hEmfCopy), true);
metafile.Save(filename, ImageFormat.Png);
}
}
}
CloseClipboard();
}
}
Anyone got a solution? (I'm using .NET 2.0 btw)
From what I understand from your question I am not able to reproduce the problem.
I selected a range manually in Excel, chose Copy As Picture with the options as shown on screen and Bitmap selected, then I used the following code to save the clipboard data:
using System;
using System.IO;
using System.Windows;
using System.Windows.Media.Imaging;
using System.Drawing.Imaging;
using Excel = Microsoft.Office.Interop.Excel;
public class Program
{
[STAThread]
static void Main(string[] args)
{
Excel.Application excel = new Excel.Application();
Excel.Workbook wkb = excel.Workbooks.Add(Type.Missing);
Excel.Worksheet sheet = wkb.Worksheets[1] as Excel.Worksheet;
Excel.Range range = sheet.Cells[1, 1] as Excel.Range;
range.Formula = "Hello World";
// copy as seen when printed
range.CopyPicture(Excel.XlPictureAppearance.xlPrinter, Excel.XlCopyPictureFormat.xlPicture);
// uncomment to copy as seen on screen
//range.CopyPicture(Excel.XlPictureAppearance.xlScreen, Excel.XlCopyPictureFormat.xlBitmap);
Console.WriteLine("Please enter a full file name to save the image from the Clipboard:");
string fileName = Console.ReadLine();
using (FileStream fileStream = new FileStream(fileName, FileMode.Create))
{
if (Clipboard.ContainsData(System.Windows.DataFormats.EnhancedMetafile))
{
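Metafile metafile = Clipboard.GetData(System.Windows.DataFormats.EnhancedMetafile) as Metafile;
// Write through the already-open stream; saving by file name would collide with the FileStream above
if (metafile != null) metafile.Save(fileStream, ImageFormat.Png);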
}
else if (Clipboard.ContainsData(System.Windows.DataFormats.Bitmap))
{
BitmapSource bitmapSource = Clipboard.GetData(System.Windows.DataFormats.Bitmap) as BitmapSource;
JpegBitmapEncoder encoder = new JpegBitmapEncoder();
encoder.Frames.Add(BitmapFrame.Create(bitmapSource));
encoder.QualityLevel = 100;
encoder.Save(fileStream);
}
}
object objFalse = false;
wkb.Close(objFalse, Type.Missing, Type.Missing);
excel.Quit();
}
}
Regarding your second problem: As far as I know it is not possible in Excel to select both a cell range and an image at the same time. If you want to get both in an image at the same time you might have to print the Excel sheet to an image/PDF/XPS file.
SpreadsheetGear for .NET will do it.
You can see our ASP.NET (C# and VB) "Excel Chart and Range Imaging Samples" here and download a free trial here if you want to try it out.
SpreadsheetGear also works with Windows Forms, console applications, etc... (you did not specify what type of application you are creating). There is also a Windows Forms control to display a workbook in your application if that is what you are really after.
Disclaimer: I own SpreadsheetGear LLC
This is a bug in GDI+ when it comes to converting metafiles to a bitmap format.
It happens for many EMFs that display charts with text. To reproduce it, create a chart in Excel that displays data on its X and Y axes. Copy the chart as a picture and paste it into Word as a metafile. Open the docx as an archive and you will see an EMF in the media folder. If you now open that EMF in any Windows-based paint program that converts it to a bitmap, you will see distortions; in particular, text and lines become larger and distorted. Unfortunately, it is one of those issues that Microsoft is not likely to admit or do anything about. Let's hope Google and Apple take over the office/word processing world soon as well.
An ASP.NET thread does not have the right ApartmentState to access the Clipboard class, so you must write the code that accesses the Clipboard in a new thread. For example:
private void AccessClipboardThread()
{
// access clipboard here normally
}
in main thread:
....
Excel.Range range = sheet.get_Range(startRange, endRange); //Save range image to clipboard
Thread thread = new Thread(new ThreadStart(AccessClipboardThread));
thread.ApartmentState = ApartmentState.STA;
thread.Start();
thread.Join(); //main thread will wait until AccessClipboardThread finishes.
....
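The pattern above can be wrapped in a small reusable helper. This is only a sketch: `StaRunner` and `Run` are hypothetical names, not part of any answer here, and it assumes a framework version that has `Func<T>` (.NET 3.5 or later, so not the asker's .NET 2.0). It uses `Thread.SetApartmentState`, which is the non-deprecated way to request an STA thread, and it passes any exception from the worker back to the caller instead of letting it go unhandled on the worker thread.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs a delegate on a dedicated STA thread
// and hands its result back to the calling thread.
public static class StaRunner
{
    public static T Run<T>(Func<T> work)
    {
        T result = default(T);
        Exception error = null;

        Thread thread = new Thread(delegate()
        {
            try { result = work(); }
            catch (Exception ex) { error = ex; }
        });

        // The apartment state must be set before Start().
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join(); // block until the STA work has finished

        if (error != null)
            throw new InvalidOperationException("Work on the STA thread failed.", error);
        return result;
    }
}
```

With a reference to System.Windows.Forms, a call such as `StaRunner.Run(() => Clipboard.GetImage())` would then fetch the picture on an STA thread. A null result is still possible, so check for it.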
Interestingly, I have been doing this in an STA apartment for some while with success. I wrote an app that runs on a weekly basis and mails out project status reports, including some graphs I generate programmatically using Excel.
Last night this failed: the graphs all came back null. I'm debugging today and can find no explanation, just that Clipboard.GetImage() suddenly returns null, which it did not do before. By setting a breakpoint at this call, I can demonstrate (by pressing CTRL+V in MS Word) that the image IS indeed in the clipboard. Alas, on continuing, Clipboard.GetImage() returns null (whether I'm snooping like this or not).
My code runs as a console app and the Main method has the [STAThread] attribute. I debug it as a Windows Forms app (all my code is in a library and I simply have two front ends to that).
Both return null today.
Out of interest I spun off the chart fetcher into a thread as outlined (and note that thread.ApartmentState is deprecated), and it runs sweet, but still, nulls are returned.
Well, I gave up and did what any IT geek would do: rebooted.
Voila ... all was good. Go figure ... is this why we all loathe computers, Microsoft Windows and Microsoft Office? Hmmmm ... There is something, something entirely transient, that can happen to your PC that makes Clipboard.GetImage() fail!
If you don't mind a Linux-style approach, you can use OpenOffice (or LibreOffice) to convert the xls to pdf first, then use ImageMagick to convert the pdf to an image. A basic guide can be found at http://www.novell.com/communities/node/5744/c-linux-thumbnail-generation-pdfdocpptxlsimages .
Also, there seems to be .Net APIs for both of the programs mentioned above. See:
http://www.opendocument4all.com/download/OpenOffice.net.pdf
and
http://imagemagick.net/script/api.php#dot-net
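As a rough sketch of the command-line route (not taken from the linked guide), the two steps could be driven from C# by starting the tools as external processes. It assumes `soffice` (LibreOffice) and ImageMagick's `convert` are installed and on the PATH, and that ImageMagick can read PDFs (it normally needs Ghostscript for that). `SheetToImage`, `RunTool` and `PdfPathFor` are made-up names for illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class SheetToImage
{
    // Starts an external tool, waits for it, and throws on a non-zero exit code.
    static void RunTool(string exe, string args)
    {
        ProcessStartInfo info = new ProcessStartInfo(exe, args);
        info.UseShellExecute = false;
        using (Process p = Process.Start(info))
        {
            p.WaitForExit();
            if (p.ExitCode != 0)
                throw new InvalidOperationException(exe + " exited with code " + p.ExitCode);
        }
    }

    // LibreOffice names the PDF after the input file and writes it into outDir.
    public static string PdfPathFor(string xlsPath, string outDir)
    {
        return Path.Combine(outDir, Path.GetFileNameWithoutExtension(xlsPath) + ".pdf");
    }

    public static void Convert(string xlsPath, string outDir, string pngPath)
    {
        // Step 1: spreadsheet -> PDF, without opening a window.
        RunTool("soffice", "--headless --convert-to pdf --outdir \"" + outDir + "\" \"" + xlsPath + "\"");
        // Step 2: PDF -> PNG; -density sets the rasterisation resolution in DPI.
        RunTool("convert", "-density 150 \"" + PdfPathFor(xlsPath, outDir) + "\" \"" + pngPath + "\"");
    }
}
```

A multi-page PDF will produce several numbered PNG files rather than one. On Windows the name `convert` clashes with a system utility, so the ImageMagick executable may need to be given by its full path (or as `magick` in ImageMagick 7).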
