I would like to display images in an Excel file that is generated by my C# code. Currently I am generating the .xls file using the following code:
StringBuilder sb = new StringBuilder();
string imageInitialPath = Request.UrlReferrer.ToString().Substring(0, Request.UrlReferrer.ToString().LastIndexOf('/'));
sb.Append("<table border='1'><tr><td colspan='3' rowspan='5' style='text-align: left;'><img src='" + imageInitialPath + "/images/abc.JPG'/></td></tr></table>");
Here I am giving the path of an image in a folder. If the image is deleted from that folder, it disappears from the Excel file as well. Is there a way to keep the image in the Excel file without depending on the source file?
Thanks
Because the generated file is really HTML, the image in the resulting Excel file is most likely still referenced via the link rather than embedded.
I would suggest creating a real Excel file with this free library: http://epplus.codeplex.com/
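As a rough illustration of that suggestion, here is a minimal sketch of embedding a picture with EPPlus. It assumes the EPPlus 4.x API from the NuGet package. The class name, sheet name, picture name and both path parameters are hypothetical and only there for illustration.

```csharp
using System.IO;
using OfficeOpenXml;          // EPPlus NuGet package (4.x API assumed)
using OfficeOpenXml.Drawing;

public static class ExcelImageExport
{
    // Writes an .xlsx file with the picture stored inside the workbook itself.
    // imagePath and outputPath are hypothetical parameters for illustration.
    public static void Write(string imagePath, string outputPath)
    {
        using (var package = new ExcelPackage())
        {
            ExcelWorksheet sheet = package.Workbook.Worksheets.Add("Report");

            // AddPicture copies the image bytes into the package, so the
            // workbook no longer depends on the source file afterwards.
            ExcelPicture picture = sheet.Drawings.AddPicture("Logo", new FileInfo(imagePath));
            picture.SetPosition(0, 0, 0, 0); // top-left cell, no pixel offset

            package.SaveAs(new FileInfo(outputPath));
        }
    }
}
```

In a web page, the bytes returned by package.GetAsByteArray() could be written to the response instead of saving to disk.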
I've been working on an application to read images from multiple Word files and store them in a single Word file using Microsoft.Office.Interop.Word in C#.
EDIT: I also need to save a copy of the images on the file system, so I need the image in a Bitmap or similar object.
This is my implementation so far, which works fine:
foreach (InlineShape shape in doc.InlineShapes)
{
shape.Range.Select();
if (shape.Type == WdInlineShapeType.wdInlineShapePicture)
{
doc.ActiveWindow.Selection.Range.CopyAsPicture();
ImageData = Clipboard.GetDataObject();
object _ob1 = ImageData.GetData(DataFormats.Bitmap);
bmp = (Bitmap)_ob1;
images[i++] = bmp;
/*
bmp.Save("C:\\Users\\Akshay\\Pictures\\bitmaps\\test" + i.ToString() + ".bmp");
*/
}
}
I have:
Selected the images as InlineShapes
Copied the shape into Clipboard
Stored the shape in the Clipboard in a DataObject
Extracted the shape from the DataObject in Bitmap format and stored in a Bitmap object.
I've been told to refrain from using Clipboard in Word automation and use the Word APIs instead.
I've read up on it and found an SO answer stating the same.
I looked up many implementations of reading images from Word files on MSDN, SO etc. but could not find any without using clipboard.
How do I read images from Word files using only the Word APIs in the Microsoft.Office.Interop.Word namespace, without using the Clipboard?
Word documents in the Office Open XML file format can be represented as a single XML file (Flat OPC), in which images are stored as Base64. So it should be possible to extract that information and convert/stream it to a file. You can access this representation while the document is open in the Word application using the Range.WordOpenXML property.
string shapeBase64 = shape.Range.WordOpenXML;
This will return the entire Word Open XML in the Flat OPC format. In other words, it won't contain only the picture in Base64, but the entire zip package definition as XML that surrounds it. In my quick test, the tag that contains the actual Base64 is
<pkg:binaryData>
That's a child element of
<pkg:part pkg:name="/word/media/image1.jpg" pkg:contentType="image/jpeg" pkg:compression="store">
Note that it would also be possible for you to get the entire document's WordOpenXML in one step:
document.Content.WordOpenXML
but you might then need to understand how the InlineShapes in the document body are linked to the actual data in the "media" part.
And it would be possible, of course, to work directly with the Zip Package (using the Open XML SDK, perhaps) instead of opening the document in the Word.Application.
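To make the WordOpenXML route more concrete, here is a minimal sketch that pulls the image parts out of a Flat OPC string with LINQ to XML. The class and method names are hypothetical. It assumes the image parts look like the pkg:part element shown above, with a content type starting with "image/".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FlatOpcImages
{
    // Namespace bound to the "pkg:" prefix in Flat OPC documents.
    private static readonly XNamespace Pkg =
        "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Returns part name -> decoded bytes for every image part in the XML.
    public static Dictionary<string, byte[]> Extract(string flatOpcXml)
    {
        XDocument doc = XDocument.Parse(flatOpcXml);
        return doc.Descendants(Pkg + "part")
            .Where(p => ((string)p.Attribute(Pkg + "contentType") ?? "")
                .StartsWith("image/", StringComparison.OrdinalIgnoreCase))
            .Where(p => p.Element(Pkg + "binaryData") != null)
            .ToDictionary(
                p => (string)p.Attribute(Pkg + "name"),
                // FromBase64String ignores the line breaks inside the element.
                p => Convert.FromBase64String(p.Element(Pkg + "binaryData").Value));
    }
}
```

Each byte array can then be written with File.WriteAllBytes, or wrapped in a MemoryStream and loaded into a Bitmap, without touching the Clipboard.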
In my software I make 2 PDF files from 1 input file using iTextSharp. I'd like to convert these files into 2 different PNG images using GS, but something strange happens. I use this code for the conversion:
GhostscriptRasterizer rasterizer = new GhostscriptRasterizer();
rasterizer.Open(newFilePath1, gsInfo, false);
Image image = rasterizer.GetPage(300, 300, 1);
image.Save(subDirPath + serCod + "_S1.png");
rasterizer.Close();
rasterizer.Open(newFilePath2, gsInfo, false);
image = rasterizer.GetPage(300, 300, 1);
image.Save(subDirPath + serCod + "_S2.png");
rasterizer.Close();
When I save the first image, it shows a blank page, and the file name is the same as newFilePath1 but with .png instead of .pdf.
The second image has the same file name as newFilePath2, with .png instead of .pdf, but it shows the content of the newFilePath1 PDF file.
How can I solve this problem?
I'd suggest you try the same operation using Ghostscript from the command line (instead of through Ghostscript.NET). If you get the same result then you can open a bug report at bugs.ghostscript.com and someone can look at it (remember to include the PDF file(s) and command lines).
Otherwise you'll have to contact jhabjan (the author of Ghostscript.NET) and have him investigate it.
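For the command-line check, a minimal sketch of the equivalent Ghostscript call is shown here, started from C# so that it can be compared with the Ghostscript.NET result. The class name and the gsExecutable parameter are assumptions (for example, the path to gswin64c.exe on Windows). The same arguments can also be typed directly at a command prompt.

```csharp
using System.Diagnostics;

public static class GhostscriptCli
{
    // Builds the arguments for rendering page 1 of a PDF to a 24-bit PNG.
    public static string BuildArguments(string pdfPath, string pngPath, int dpi)
    {
        return "-dNOPAUSE -dBATCH -dSAFER -sDEVICE=png16m " +
               "-dFirstPage=1 -dLastPage=1 " +
               "-r" + dpi + " " +
               "-sOutputFile=\"" + pngPath + "\" \"" + pdfPath + "\"";
    }

    // gsExecutable is an assumption: the path to the Ghostscript console binary.
    public static int Render(string gsExecutable, string pdfPath, string pngPath, int dpi)
    {
        var startInfo = new ProcessStartInfo(gsExecutable, BuildArguments(pdfPath, pngPath, dpi))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode; // 0 means Ghostscript reported success
        }
    }
}
```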
I am trying to get the content of an attachment. It may be an Excel file, a Word document or a text file; whatever it is, I want to store it in a database, so I am using this code:
foreach (FileAttachment file in em.Attachments)// Here em is type of EmailMessage class
{
Console.Write("Hello friends" + file.Name);
file.Load();
var stream = new System.IO.MemoryStream(file.Content);
var reader = new System.IO.StreamReader(stream, UTF8Encoding.UTF8);
var text = reader.ReadToEnd();
reader.Close();
Console.Write("Text Document" + text);
}
Printing file.Name shows the attachment's file name. Printing 'text' to the console works if the attachment is a .txt file, but for a .doc or .xls file it shows only symbols, not readable text. Am I doing something wrong or missing something? I want the text of any kind of file attachment. Please help me; I am a beginner in C#.
What you are seeing is what is actually in the file. Try opening one with Notepad.
There is no built-in way in .NET to show the "text contents" of arbitrary file formats. You'll have to create (preferably using third-party libraries that already solve this problem) some kind of logic that extracts plaintext from rich text documents.
See for example How to extract text from Pdf, Word and Excel documents?, Extract text from pdf and word files, and so on.
First, what do you expect when reading a binary file?
Your result is exactly what is expected. A text file can be shown as a string, but a doc or xls file is a binary file. You will see the binary content of the file. You will need to use a tool/lib to get the text/content from a binary file in human readable format.
The TXT type is simple; DOC and XLS are much more complex. You can read a TXT file because it is just text; DOC, XLS, PPT and other formats need to be interpreted by some other mechanism.
For example, a Word document has different colors and font sizes, and an Excel document may contain a chart. How can you show that in a simple TextBox or RichTextBox? Short answer: you can't.
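To illustrate the distinction the answers draw, here is a minimal sketch that decodes an attachment as text only when it is a plain-text format and otherwise signals that a dedicated parser is needed. The class name, method name and the list of extensions are assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class AttachmentText
{
    // Returns the decoded text for plain-text attachments, or null for
    // formats (.doc, .xls, ...) that need a dedicated parser.
    public static string TryDecode(string fileName, byte[] content)
    {
        string extension = Path.GetExtension(fileName) ?? "";
        bool isPlainText =
            extension.Equals(".txt", StringComparison.OrdinalIgnoreCase) ||
            extension.Equals(".csv", StringComparison.OrdinalIgnoreCase);
        if (!isPlainText) return null;

        // Same decoding as in the question: UTF-8 through a StreamReader.
        using (var reader = new StreamReader(new MemoryStream(content), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For storage, the raw byte array (file.Content in the question) can be saved unchanged in a binary column regardless of the file type; only display or search needs the extracted text.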
I have to build an application that writes the list of fonts used in a PDF and an .indd file to an Excel sheet. After a lot of research I came to know that this is not possible with C#. I came across the Indesign Navigator API, which can be integrated into the Visual Studio IDE. I know C# and JavaScript. Is there any way to build this so that it runs on both Mac and Windows? Thank you!
One way you could do this is by saving a text file with the font information out of InDesign and Acrobat. You could probably use ExtendScript to do this. The text file can then be imported easily into Excel as a CSV or text file (whitespace delimited).
You weren't very clear about what your intentions are, but here's an example of a JavaScript snippet that pulls font information out of InDesign and saves a list of fonts for a document.
var doc = app.activeDocument;
// Collect every font used by the active document.
var docFonts = doc.fonts.everyItem().getElements();
var fileContents = "";
for (var i = 0; i < docFonts.length; i++) {
    var font = docFonts[i];
    fileContents += font.name + "\n";
}
// Write the list next to the document, with "_fonts.txt" appended to its name.
var newFilePath = doc.filePath + "/" + doc.name.replace(/\.indd$/i, '') + "_fonts.txt";
var newFile = File(newFilePath);
newFile.open('w');
newFile.write(fileContents);
newFile.close();
Here is a possible approach...
It is possible to write out an XML representation of an InDesign file...
To generate IDML, choose File > Export and select the format InDesign Markup (IDML)...
The result is a zip archive containing all the information.
It contains a Resources folder, which holds Fonts.xml (Resources/Fonts.xml).
This can be parsed cross-platform because it is just XML...
Here you can find a description of the anatomy of an IDML InDesign document...
http://www.indesignsecrets.com/downloads/Anatomy_of_IDML.pdf
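As a rough sketch of that approach in C#, the font list can be read from the package with the built-in zip and XML classes. The class and method names are hypothetical. The "FontFamily" element and its "Name" attribute are assumptions based on typical IDML exports, so check them against your own Fonts.xml.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class IdmlFonts
{
    // Reads Resources/Fonts.xml from an IDML package and returns the
    // distinct font family names found in it.
    public static List<string> ReadFamilies(Stream idmlStream)
    {
        using (var archive = new ZipArchive(idmlStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry("Resources/Fonts.xml");
            if (entry == null) return new List<string>();

            using (Stream xml = entry.Open())
            {
                return XDocument.Load(xml)
                    .Descendants("FontFamily")              // assumed element name
                    .Select(f => (string)f.Attribute("Name")) // assumed attribute name
                    .Where(name => !string.IsNullOrEmpty(name))
                    .Distinct()
                    .ToList();
            }
        }
    }
}
```

The resulting names could be written one per line to a CSV file, which Excel opens directly.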
Hope this helps...
Currently I am using the following code, with some DLL files from PDFBox:
FileInfo file = new FileInfo("c://aa.pdf");
PDDocument doc = PDDocument.load(file.FullName);
PDFTextStripper pdfStripper = new PDFTextStripper();
string text = pdfStripper.getText (doc);
richTextBox1.Text = text;
With this code I am able to get the text, but not in the correct format. Please give me some ideas.
Extracting the text from a pdf file is anything but trivial.
To quote from the iTextSharp tutorial:
"The pdf format is just a canvas where
text and graphics are placed without
any structure information. As such
there aren't any 'iText-objects' in a
PDF file. In each page there will
probably be a number of 'Strings', but
you can't reconstruct a phrase or a
paragraph using these strings. There
are probably a number of lines drawn,
but you can't retrieve a Table-object
based on these lines. In short:
parsing the content of a PDF-file is
NOT POSSIBLE with iText."
There are several commercial applications which claim to be able to do it. Caveat Emptor.
There is also a free software library called Poppler (http://poppler.freedesktop.org/), which is used by the PDF viewers of GNOME and KDE. It ships a command-line utility called pdftotext, but I have no experience with it. It may be your best free option.
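Since pdftotext is a command-line tool, one way to try it from C# is to start it as a process and read its output. This is only a minimal sketch: the class name and the pdftotextExecutable parameter are assumptions, and the -layout option asks the tool to keep the physical layout of the page as far as it can.

```csharp
using System.Diagnostics;
using System.Text;

public static class PdfToTextCli
{
    // "-layout" keeps the physical layout; "-" sends the text to stdout.
    public static string BuildArguments(string pdfPath)
    {
        return "-layout \"" + pdfPath + "\" -";
    }

    // pdftotextExecutable is an assumption: the path to the Poppler utility.
    public static string Extract(string pdftotextExecutable, string pdfPath)
    {
        var startInfo = new ProcessStartInfo(pdftotextExecutable, BuildArguments(pdfPath))
        {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            StandardOutputEncoding = Encoding.UTF8, // pdftotext writes UTF-8 by default
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            // Read before waiting so a full output pipe cannot block the tool.
            string text = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return text;
        }
    }
}
```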
There is a blog article explaining the issues with PDF text extraction in general at http://pdf.jpedal.org/java-pdf-blog/bid/12670/PDF-text