I want to export a GridView to Excel format.
The simplest and most straightforward solution I found is from Matt Berseth:
http://mattberseth.com/blog/2007/04/export_gridview_to_excel_1.html
This solution works fine and was accepted by the client. But now, after some months, a new feature has been requested: "Just put a logo image in the Excel file."
This is driving me crazy. I can't put a System.Drawing.Image into a System.Web.UI.WebControls.Image because they are completely different types, and I can't just use a file path because the generated Excel file will be sent by e-mail, so no directory structure can be assumed.
So, can I put images loaded from bytes into the GridView and export them with Matt's approach, or is there some other way?
Edit:
I have made some progress, but I'm still far from my goal.
I can embed images in HTML files using a Base64 string.
Something like:
private string MakeImageSrcData(string filename) {
    // File.ReadAllBytes opens, fully reads and closes the file,
    // so no stream is left open and no partial read can occur.
    byte[] filebytes = File.ReadAllBytes(filename);
    return "data:image/png;base64," + Convert.ToBase64String(filebytes, Base64FormattingOptions.None);
}
...
string base64 = MakeImageSrcData("D:\\Proj\\top_title.png");
TableRow tr = new TableRow();
TableCell tc = new TableCell();
Image logoEmpresa = new Image();
logoEmpresa.ImageUrl = base64;
tc.Controls.Add(logoEmpresa);
tr.Cells.Add(tc);
table.Rows.Add(tr);
This works fine in IE and Firefox, but not in Excel :/
I tried SpreadsheetML (spreadsheet XML), but as MSDN describes here, there is no support for an image type.
Any other ideas?
This method works by outputting the gridview as text containing an HTML table. I would imagine you could prepend the string with a string containing an <img> tag pointing to your logo somewhere on an accessible web site.
Give it a try.
Comment
I added the following before line 61 in the GridViewExportUtil.cs file in the referenced demo:
HttpContext.Current.Response.Write("<img src='http://localhost/WebApplication2/wand.gif' />");
The image was available at the specified URL, and rendered correctly in Excel.
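The approach from the answer and comment above can be sketched as a small helper. This is only a sketch: `ExcelHtmlBuilder` and `BuildExcelHtml` are hypothetical names that are not part of GridViewExportUtil.cs, and it assumes the logo is hosted at an absolute URL that every recipient of the e-mailed file can reach.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper, not part of the referenced demo.
public static class ExcelHtmlBuilder
{
    // Returns the markup to write to the response: a logo <img>
    // followed by the already-rendered HTML table.
    public static string BuildExcelHtml(string logoUrl, string tableHtml)
    {
        // Excel fetches the image when the file is opened, so a relative
        // URL would not survive the file being e-mailed.
        if (!Uri.IsWellFormedUriString(logoUrl, UriKind.Absolute))
            throw new ArgumentException("logoUrl must be an absolute URL", nameof(logoUrl));

        var sb = new StringBuilder();
        sb.Append("<img src='")
          .Append(WebUtility.HtmlEncode(logoUrl))
          .Append("' />");
        sb.Append(tableHtml);
        return sb.ToString();
    }
}
```

In the export utility, the result would be passed to `HttpContext.Current.Response.Write(...)` in place of writing the table markup alone.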
Related
I need to read text from a PDF and process the extracted text. I'm using iTextSharp to read the PDF. The problem is that PdfTextExtractor.GetTextFromPage doesn't give me all the contents of the page. For example:
In the above PDF I'm unable to read the text highlighted in blue. I can read the rest of the characters. Below is the code that does this:
`string filePath = "myFile path";
PdfReader pdfReader = new PdfReader(filePath);
for (int page = 1; page<=1; page++)
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentPageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
}`
Any suggestions?
I have gone through many questions and answers on SO, but none is specific to this problem.
The reason for text extraction not extracting those texts is pretty simple: Those texts are not part of the static page content but form fields! But "Text extraction" in iText (and other PDF libraries I know, too) is considered to mean "extraction of the text of the static page content". Thus, those texts you miss simply are not subject to text extraction.
If you want to make form field values subject to your text extraction code, too, you first have to flatten the form field visualizations. "Flattening" here means making them part of the static page content and dropping all their form field dynamics.
You can do that by adding after reading the PDF in this line
PdfReader pdfReader = new PdfReader(filePath);
code to flatten this PDF and loading the flattened PDF into the pdfReader, e.g. like this:
MemoryStream memoryStream = new MemoryStream();
PdfStamper pdfStamper = new PdfStamper(pdfReader, memoryStream);
pdfStamper.FormFlattening = true;
pdfStamper.Writer.CloseStream = false;
pdfStamper.Close();
memoryStream.Position = 0;
pdfReader = new PdfReader(memoryStream);
Extracting the text from this re-initialized pdfReader will give you the text from the form fields, too.
Unfortunately, the flattened form text is added at the end of the content stream. As your chosen text extraction strategy SimpleTextExtractionStrategy simply returns the text in the order it is drawn, the former form fields contents all are extracted at the end.
You can change this by using a different text extraction strategy, i.e. by replacing this line:
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
Using the LocationTextExtractionStrategy (which is part of the iText distribution) already returns a better result; unfortunately the form field values are not exactly on the same base line as the static contents we perceive to be on the same line, so there are some unexpected line breaks.
ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
Using the HorizontalTextExtractionStrategy (from this answer which contains both a Java and a C# version thereof) the result is even better. Beware, though, this strategy is not universally better, read the warnings in the answer text.
ITextExtractionStrategy strategy = new HorizontalTextExtractionStrategy();
I want to export a report with an image. The image is stored in the DB as a URL.
My alternative is to download the file to disk and pass the file path in to show the image, but this seems redundant. Sample from here
What would be the best way to do this? Many thanks!
I ended up using the code below to download the image and convert it to Base64 before displaying it in the RDLC.
var webClient = new WebClient();
byte[] imageBytes = webClient.DownloadData(urlimg);
DataRow drow = table.NewRow();
drow["filepath"] = Convert.ToBase64String(imageBytes);
The RDLC is configured as shown in the image below.
How can I read pdf files and save contents to a text file using Spire.PDF?
For example: Here is a pdf file and here is the desired text file from that pdf
I tried the below code to read the file and save it to a text file
PdfDocument doc = new PdfDocument();
doc.LoadFromFile(@"C:\Users\Tamal\Desktop\101395a.pdf");
StringBuilder buffer = new StringBuilder();
foreach (PdfPageBase page in doc.Pages)
{
buffer.Append(page.ExtractText());
}
doc.Close();
String fileName = @"C:\Users\Tamal\Desktop\101395a.txt";
File.WriteAllText(fileName, buffer.ToString());
System.Diagnostics.Process.Start(fileName);
But the output text file is not properly formatted. It contains unnecessary whitespace, and a complete paragraph is broken into multiple lines, etc.
How do I get the desired result as in the desired text file?
Additionally, is it possible to detect and mark (e.g. add a tag to) text that is bold, italic or underlined? Things also get more problematic for pages that have multiple columns of text.
Using iText
File inputFile = new File("input.pdf");
PdfDocument pdfDocument = new PdfDocument(new PdfReader(inputFile));
SimpleTextExtractionStrategy stes = new SimpleTextExtractionStrategy();
PdfCanvasProcessor canvasProcessor = new PdfCanvasProcessor(stes);
canvasProcessor.processPageContent(pdfDocument.getPage(1));
System.out.println(stes.getResultantText());
This is (as the code says) a basic/simple text extraction strategy.
More advanced examples can be found in the documentation.
Use IronOCR
var Ocr = new IronOcr.AutoOcr();
// Verbatim strings (@"...") keep the backslashes in the paths literal.
var Results = Ocr.ReadPdf(@"E:\Demo.pdf");
File.WriteAllText(@"E:\Demo.txt", Convert.ToString(Results));
For reference: https://ironsoftware.com/csharp/ocr/
With this you should get formatted text output, but not exactly the output you want.
If you want exact, pre-interpreted output, you should look at paid OCR products such as the OmniPage Capture SDK and the ABBYY FineReader SDK.
That is the nature of PDF. It basically says "go to this location on a page and place this character there." I'm not familiar at all with Spire.PDF; I work with Java and the PDFBox library, but any attempt to extract text from a PDF is heuristic and hence imperfect. This problem has received considerable attention and some applications give better results than others, so you may want to survey all available options. Still, I think you'll have to clean up the result.
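A clean-up pass of the kind mentioned above might look like the following. It is a rough heuristic sketch, not a general solution: it assumes that blank lines in the extracted text mark paragraph boundaries, which will not hold for every PDF, and `TextCleanup` and `Normalize` are hypothetical names. It does not rejoin hyphenated words or handle multi-column pages.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for post-processing extracted PDF text.
public static class TextCleanup
{
    public static string Normalize(string extracted)
    {
        // Use a single newline convention first.
        string text = extracted.Replace("\r\n", "\n").Replace('\r', '\n');

        var paragraphs = new List<string>();
        // Assumption: one or more blank (or whitespace-only) lines
        // separate paragraphs.
        foreach (string block in Regex.Split(text, @"\n(?:[ \t]*\n)+"))
        {
            // Collapse every run of whitespace, line breaks included,
            // into one space; this rejoins lines broken mid-paragraph.
            string joined = Regex.Replace(block, @"\s+", " ").Trim();
            if (joined.Length > 0)
                paragraphs.Add(joined);
        }
        return string.Join("\n\n", paragraphs);
    }
}
```

Applied to the question's code, this would be called as `TextCleanup.Normalize(buffer.ToString())` before writing the text file.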
I have a PDF produced by SSRS. I need to get this PDF as a byte array and then save the whole PDF as A4 landscape.
I tried:
string say = "hello world";
byte[] pdfArr = Encoding.UTF8.GetBytes(say);
// A4 rotated by 90 degrees gives a landscape page.
var doc = new Document(iTextSharp.text.PageSize.A4.Rotate());
string path = Environment.CurrentDirectory;
PdfWriter.GetInstance(doc, new FileStream(path + "/pdfdoc.pdf", FileMode.Create));
doc.Open();
doc.Add(new Paragraph(Encoding.UTF8.GetString(pdfArr)));
doc.Close();
Process.Start(path+"/pdfdoc.pdf");
When I create a new PDF with iTextSharp, the code above works fine, but when I try it with the SSRS PDF, the resulting PDF is filled with meaningless characters.
I also know that I can read and rotate the document page by page via PdfReader, but I don't want to read the pages. The report's table is very long, so it is split across pages, and I don't know how many pages one table will span. My main aim is to show it horizontally (landscape) as one table.
Any suggestions or code snippets are welcome.
Thanks anyway.
Edit: As I explained in the paragraph above, I can't process the pages with PdfReader or something similar, because I don't want to (and can't) change every page to landscape. That doesn't serve my aim. I just want to create the PDF in landscape so that all the long tables and data can be seen on one page.
I have a byte array that contains the data of an uploaded file which happens to be a Resume of an employee(.doc file). I did it with the help of the following lines of code
AppSettingsReader rd = new AppSettingsReader();
FileUpload arr = (FileUpload)upresume;
Byte[] arrByte = null;
if (arr.HasFile && arr.PostedFile != null)
{
//To create a PostedFile
HttpPostedFile File = upresume.PostedFile;
//Create byte Array with file len
arrByte = new Byte[File.ContentLength];
//force the control to load data in array
File.InputStream.Read(arrByte, 0, File.ContentLength);
}
Now, I would like to get the contents of the uploaded file(resume) in string format either from the byte array or any other methods.
PS: 'contents' literally refers to the contents of the resume; for example if the resume(uploaded file) contains a word 'programming', I would like to have the same word contained in the string.
Please help me to solve this.
I worked on a similar project a few years ago. Long story short, I ended up reconstructing the file and saving it on the server, then programmatically converting it to PDF and indexing the contents of the PDF. This proved much easier in practice at the time.
Alternatively, if you can restrict resume uploads to the docx file format, you can use Microsoft's OpenXML library to parse and index the content very easily. But in practice this may cause usability issues for users of the web site.
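For the docx route, a minimal sketch using the Open XML SDK could look like this. It assumes the DocumentFormat.OpenXml NuGet package is referenced and that the byte array really holds a .docx file (a legacy .doc file will not open this way); `ResumeText` and `FromDocxBytes` are hypothetical names.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: extracts plain text from an uploaded .docx.
public static class ResumeText
{
    public static string FromDocxBytes(byte[] docxBytes)
    {
        using (var stream = new MemoryStream(docxBytes))
        // Open read-only (isEditable: false); nothing is written back.
        using (var doc = WordprocessingDocument.Open(stream, false))
        {
            var body = doc.MainDocumentPart.Document.Body;
            // One output line per paragraph, including paragraphs
            // nested inside table cells.
            var paragraphs = body.Descendants<Paragraph>()
                                 .Select(p => p.InnerText);
            return string.Join("\n", paragraphs);
        }
    }
}
```

With the upload code from the question, this would be called as `ResumeText.FromDocxBytes(arrByte)`, and the returned string can then be searched for words such as 'programming'.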