I am using code from another question and I am getting the following errors:
Error 1 The non-generic type
'iTextSharp.text.List' cannot be used
with type arguments
Error 2 The name 'HTMLWorker' does not
exist in the current context
Error 3 The type or namespace name
'HTMLWorker' could not be found (are
you missing a using directive or an
assembly reference?)
My code so far is as follows:
protected void Button2_Click(object sender, EventArgs e)
{
//Extract data from Page (pd).
Label16.Text = Editor1.Content; // Attribute
// Prepare the HttpContext response
HttpContext.Current.Response.Clear();
HttpContext.Current.Response.ContentType = "application/pdf";
// Create PDF document
Document pdfDocument = new Document(PageSize.A4, 80, 50, 30, 65);
//PdfWriter pw = PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);
PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);
pdfDocument.Open();
//WebClient wc = new WebClient();
string htmlText = Editor1.Content;
List<IElement> htmlarraylist = HTMLWorker.ParseToList(new StringReader(htmlText), null);
for (int k = 0; k < htmlarraylist.Count; k++)
{
pdfDocument.Add((IElement)htmlarraylist[k]);
}
//pdfDocument.Add(new Paragraph(IElement));
pdfDocument.Close();
HttpContext.Current.Response.End();
}
Please help me resolve these errors. I am trying to take the contents (non-HTML) of the HTML editor and display them in a PDF file. Please confirm whether my approach is correct.
1. Prefix your List with its namespace, like
System.Collections.Generic.List<IElement> htmlarraylist
2. It looks like you didn't import the namespace of HTMLWorker.
EDIT: I searched for it; the namespace could be any of these three. I suspect it is the last one, but I am not sure.
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;
There's a name conflict in this code - you are using iTextSharp.text namespace and trying to use standard System.Collections.Generic.List<T> class.
Either you need to remove using iTextSharp.text and use its classes with explicit namespace or use explicit namespace for List<T>.
System.Collections.Generic.List<IElement> htmlarraylist = HTMLWorker.ParseToList(new StringReader(htmlText), null);
The third solution is to use aliases.
And for the second error, you need to import HTMLWorker namespace. Put
using iTextSharp.text.html.simpleparser;
at the top.
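The alias option mentioned above can be sketched without iTextSharp at all. In the example below, Vendor.Text and its List class are hypothetical stand-ins for a library namespace that declares its own non-generic List (as iTextSharp.text does), and Generic is an alias name chosen only for illustration.

```csharp
using System;
using Vendor.Text;                           // hypothetical namespace that declares its own List
using Generic = System.Collections.Generic;  // namespace alias for the generic collections

namespace Vendor.Text
{
    // Stand-in for a non-generic library type such as iTextSharp.text.List.
    public class List { }
}

public static class AliasDemo
{
    public static void Main()
    {
        // A bare List<string> would not compile here, because the only List
        // in scope is the non-generic Vendor.Text.List.
        Generic.List<string> items = new Generic.List<string> { "a", "b" };
        Console.WriteLine(items.Count); // prints 2
    }
}
```

The alias keeps the library's using directive in place, so the rest of the file can still refer to the library types by their short names.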
'PdfTextExtractor' does not contain definition for 'GetTextFromPage', it throws Compiler Error CS0117
This is my code, which I copied just to check how iText7 works:
using System;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser.Listener;
namespace PdfParser
{
public static class PdfTextExtractor
{
public static void ExtractTextFromPDF(string filePath)
{
PdfReader pdfReader = new PdfReader(filePath);
PdfDocument pdfDoc = new PdfDocument(pdfReader);
for (int page = 1; page <= pdfDoc.GetNumberOfPages(); page++)
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
// the line below throws the exception
string pageContent = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(page), strategy);
}
pdfDoc.Close();
pdfReader.Close();
}
}
}
I tried using iTextSharp, but it was written there that iText7 is the newer version.
I am working on "Console Application", maybe this is the problem? Should I use another framework?
The problem is that your class is also called PdfTextExtractor. Please rename your static class and the issue will be solved.
For future issues, you can jump to the definition (via F12 or similar, depending on your IDE and shortcuts) and check where it takes you.
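As a sketch of the rename, assuming the iText 7 package is referenced: PageTextReader is a hypothetical class name, and the extra using directive for iText.Kernel.Pdf.Canvas.Parser is added because that is the namespace that declares the library's PdfTextExtractor. Without it, the call would not resolve once the local class is renamed.

```csharp
using System;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;          // declares the library's PdfTextExtractor
using iText.Kernel.Pdf.Canvas.Parser.Listener; // declares the extraction strategies

namespace PdfParser
{
    // Hypothetical name; anything other than PdfTextExtractor avoids the clash.
    public static class PageTextReader
    {
        public static void ExtractTextFromPDF(string filePath)
        {
            PdfReader pdfReader = new PdfReader(filePath);
            PdfDocument pdfDoc = new PdfDocument(pdfReader);
            for (int page = 1; page <= pdfDoc.GetNumberOfPages(); page++)
            {
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                // Now resolves to the library class rather than the enclosing class.
                string pageContent = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(page), strategy);
                Console.WriteLine(pageContent);
            }
            pdfDoc.Close();
            pdfReader.Close();
        }
    }
}
```

If the class name has to stay, writing the fully qualified iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor at the call site should have the same effect.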
I am parsing an HTML string using HTMLworker in C#.
These are the libraries I am using:
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;
This is how I am parsing the data:
Document pdfDoc = new Document(PageSize.A4);
HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
sb1.Append(#"<img src='data:image/png;charset=utf-8;base64, iVBORw0KGgoAAAANSUhEUgAAAIAAAACACAYAAADDPmHLAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAX1SURBVHhe7ZJBjuQ4DAT7/5+eHd0CBmOWhGXaDTGAuKSyRKvAnz/D0cwCHM4swOHMAhzOLMDhzAIczizA4cwCHM4swOHMAhzOLMDhzAIczizA4cwCHM4swOHMAhzOLMDhzAIczizA4cwCHM4swOHMAhzOLMDhzAIczizA4cwCHE7bAvz8/GyXWE7YyZgh+t1dO5kF+IcZot/dtZNZgH+YIfrdXTt5ZQHukLmHHXoHu8fyKrvuqdI2bdcDM/ewQ+9g91heZdc9Vdqm7Xpg5h526B3sHsur7LqnSts0eyBzk1huWJ85JdH5kmRyk1j+NG3T7IHMTWK5YX3mlETnS5LJTWL507RNswcyN4nlhvWZUxKdL0kmN4nlT9M2zR7I3CSWV+E9lETnS5LJTWL507RNswcyN4nlVXgPJdH5kmRyk1j+NG3T7IHMTWJ5Fd5DSXS+JJncJJY/Tds0eyBzk1hOrJPJKbmTm8Typ2mbZg9kbhLLiXUyOSV3cpNY/jRt0+yBzE1iObFOJqfkTm4Sy5+mbdquB1bvsb7lhvUtr7Lrnipt03Y9sHqP9S03rG95lV33VGmbtuuB1Xusb7lhfcur7LqnSts0PnCX5Av5LjtpmxY99K7kC/kuO2mbFj30ruQL+S476Z32IPYHMqfEcpLp/EZmAf5iOcl0fiOzAH+xnGQ6v5G21/APpCQ6XxLLCTsZSXS+NDKdDLvuqdI2jQ+kJDpfEssJOxlJdL40Mp0Mu+6p0jaND6QkOl8Sywk7GUl0vjQynQy77qnSO+1/4J9ADetYTqzDnGaIfnf1a3zqi6I/bGlYx3JiHeY0Q/S7q1/jU18U/WFLwzqWE+swpxmi3139Gq9/UfQnLUl0viSZnJLo/OrTdM4ivdMC+HBKovMlyeSUROdXn6ZzFumdFsCHUxKdL0kmpyQ6v/o0nbNI77QAPtzMEP3uXxpR92qV6I7lF5gFuBB1r1aJ7lh+gVmAC1H3apXojuUXeOUrMn8CO6aR6WTI3LOr8xavfFHmD2HHNDKdDJl7dnXe4pUvqv6xppHpZMjcs6vzFm1fxD+BZrC+5Qb7uzSi7vJrtH1R9GcsM1jfcoP9XRpRd/k12r4o+jOWGaxvucH+Lo2ou/wabV8U/RlXjai7JNH50oi6S5LJTSPqXu2kbVr00KtG1F2S6HxpRN0lyeSmEXWvdtI2LXroVSPqLkl0vjSi7pJkctOIulc7aZtmD2SekVj+BezbqvnTtE2zBzLPSCz/AvZt1fxp2qbZA5lnJJZ/Afu2av40bdPsgZYb1md+RxKdP+VbtE22x1puWJ/5HUl0/pRv0TbZHmu5YX3mdyTR+VO+xXuTN2N/puWG9S032KckOl920jvtQewPtNywvuUG+5RE58tOeqc9iP2BlhvWt9xgn5LofNlJ27TooXetYr+1nFinmhvV/i7apvGBu6xiv7WcWKeaG9X+Ltqm8YG7rGK/tZxYp5ob1f4u2qbtemD1Huszr5oh+t2SWN5J2+Rdj63eY33mVTNEv1sSyztpm7zrsdV7rM+8aobod0tieSdtk+2xzE1yJzdJdP6UX6DtK+zhzE1yJzdJdP6UX6DtK+zhzE1yJzdJdP6UX6DtK+zhzE2SyTMaUXdJovOrRtS92knbNHsgc5Nk8oxG1F2S6PyqEXWvdtI2zR7I3CSZPKMRdZckOr9qRN2rnbRNswcyN4nlGaq/tX4mp0bUXXbSNs0eyNwklmeo/tb6mZwaUXfZSds0eyBzk1ieofpb62dyakTdZSdt03Y90O6p5ob1qzlhJ2MnbdN2PdDuqeaG9as5YSdjJ23Tdj3Q7qnm
hvWrOWEnYydt06KH3pVUc8P6mdzMUO3vom0aH7hLUs0N62dyM0O1v4u2aXzgLkk1N6yfyc0M1f4ueqcNn2MW4HBmAQ5nFuBwZgEOZxbgcGYBDmcW4HBmAQ5nFuBwZgEOZxbgcGYBDmcW4HBmAQ5nFuBwZgEOZxbgcGYBDmcW4HBmAQ5nFuBwZgEOZxbgcGYBDmcW4HBmAY7mz5//AJt02kiYlE4XAAAAAElFTkSuQmCC' runat='server' alt='myimage'>");
WebRequest.RegisterPrefix("data", new DataWebRequestFactory()); // this is to register new prefix
using (StringReader srb = new StringReader(sb1.ToString()))
{
htmlparser.Parse(srb); //here I am getting an exception
}
Here I register new prefix 'data' to get rid of the 'uri prefix not recognized' exception, as explained here: Getting exception while using itextsharp to convert from html to pdf in web api
But now I am getting a new exception that 'Input is not a valid base64 string'.
Please suggest how to solve this. Thank you.
You should remove everything in front of and including the first comma.
So change your
sb1.Append(@"<img src='data:image/png;charset=utf-8;base64, iVBORw0KGgoAAAANSUhEUgAAAIAAAACACAYAAADDPmHLAAAAAXNSR0IArs4c6QAAAARn.....);
to
sb1.Append(@"<img src='iVBORw0KGgoAAAANSUhEUgAAAIAAAACACAYAAADDPmHLAAAAAXNSR0IArs4c6QAAAARn.....);
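A small sketch of the string operation this answer describes, using only the .NET base class library. The helper name StripDataUriPrefix and the sample text/plain data URI are made up for illustration; the point is that only the part after the first comma is base64, so the "data:...;base64," prefix has to be cut off before decoding.

```csharp
using System;
using System.Text;

public static class DataUriDemo
{
    // Returns the payload that follows the first comma of a data URI,
    // or the input unchanged if it contains no comma.
    public static string StripDataUriPrefix(string dataUri)
    {
        int comma = dataUri.IndexOf(',');
        return comma >= 0 ? dataUri.Substring(comma + 1).Trim() : dataUri;
    }

    public static void Main()
    {
        // Hypothetical sample with the same shape as the img src above.
        string dataUri = "data:text/plain;base64, aGVsbG8=";
        string payload = StripDataUriPrefix(dataUri);

        // Decoding the full data URI would throw a FormatException;
        // decoding the payload alone succeeds.
        byte[] bytes = Convert.FromBase64String(payload);
        Console.WriteLine(Encoding.UTF8.GetString(bytes)); // prints "hello"
    }
}
```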
I'm building a Console application in VS2010 ASP.NET C# using iTextSharp ver 5.5.2. I have all of the DLLs in the iTextSharp distribution referenced and have the following Using statements:
using System;
using System.IO;
using System.Net;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
I'm looking at an iTextSupport.com posting as an example for the application that contains the following code segment:
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("results/loremipsum.pdf"));
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream("/html/loremipsum.html"));
document.close();
On the second line, creating an instance of a PDFWriter, appears the instantiation of "new FileOutputStream" which is throwing an error indicating that a Using statement or a Reference is required. Searching for FileOutputStream in the object browser for both my application AND .NET Framework 4 return no results.
Where is the class containing FileOutputStream to be found?
Maybe you've looked at some Java examples, there's no such beast as a FileOutputStream in .NET. In .NET you could use a System.IO.FileStream:
Document document = new Document();
using (var output = File.Create("results/loremipsum.pdf"))
using (var input = File.OpenRead("html/loremipsum.html"))
{
PdfWriter writer = PdfWriter.GetInstance(document, output);
document.Open();
XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, input);
document.Close();
}
Alternatively, write the PDF to HttpContext.Current.Response.OutputStream.
When I export Arabic data to PDF, Adobe Reader shows an error saying it could not open the file because it is not a supported file type. My ASP.NET C# code follows. Please guide me.
protected void btnExport_Click(object sender, EventArgs e)
{
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;filename=TestPage.pdf");
Document doc = new Document(PageSize.LETTER);
doc.Open();
//Sample HTML
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.Append(@"<p>This is a test: <strong>مسندم</strong></p>");
//Path to our font
string arialuniTff = Server.MapPath("~/tradbdo.TTF");
//Register the font with iTextSharp
iTextSharp.text.FontFactory.Register(arialuniTff);
//Create a new stylesheet
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
//Set the default body font to our registered font's internal name
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.FACE, "Traditional Arabic Bold");
//Set the default encoding to support Unicode characters
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, BaseFont.IDENTITY_H);
//Parse our HTML using the stylesheet created above
List<IElement> list = HTMLWorker.ParseToList(new StringReader(stringBuilder.ToString()), ST);
//Loop through each element, don't bother wrapping in P tags
foreach (var element in list)
{
doc.Add(element);
}
doc.Close();
Response.Write(doc);
Response.End();
}
I found the following article which shows how to correctly export and display Arabic content via the iTextSharp library: http://geekswithblogs.net/JaydPage/archive/2011/11/02/using-itextsharp-to-correctly-display-hebrew--arabic-text-right.aspx.
Here is the code sample that you can try:
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.Text.RegularExpressions;
using System.IO;
using System.Diagnostics;
public void WriteDocument()
{
//Declare an iTextSharp document
Document document = new Document(PageSize.A4);
//Create our file stream and bind the writer to the document and the stream
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(@"C:\Test.Pdf", FileMode.Create));
//Open the document for writing
document.Open();
//Add a new page
document.NewPage();
//Reference a Unicode font to be sure that the symbols are present.
BaseFont bfArialUniCode = BaseFont.CreateFont(@"C:\ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
//Create a font from the base font
Font font = new Font(bfArialUniCode, 12);
//Use a table so that we can set the text direction
PdfPTable table = new PdfPTable(1);
//Ensure that wrapping is on, otherwise Right to Left text will not display
table.DefaultCell.NoWrap = false;
//Create a regex expression to detect hebrew or arabic code points
const string regex_match_arabic_hebrew = @"[\u0600-\u06FF,\u0590-\u05FF]+";
if (Regex.IsMatch("مسندم", regex_match_arabic_hebrew, RegexOptions.IgnoreCase))
{
table.RunDirection = PdfWriter.RUN_DIRECTION_RTL;
}
//Create a cell and add text to it
PdfPCell text = new PdfPCell(new Phrase("مسندم", font));
//Ensure that wrapping is on, otherwise Right to Left text will not display
text.NoWrap = false;
//Add the cell to the table
table.AddCell(text);
//Add the table to the document
document.Add(table);
//Close the document
document.Close();
//Launch the document if you have a file association set for PDF's
Process AcrobatReader = new Process();
AcrobatReader.StartInfo.FileName = @"C:\Test.Pdf";
AcrobatReader.Start();
}
The iTextSharp.text.Document is a class used to help bridge human concepts like Paragraph and Margin into PDF concepts. The bridge part is important. It is not a PDF file in any way so it should never be treated as a PDF. Doing so would be like treating System.Drawing.Graphics as if it were an image. This leads to one of your problems on the second to last line of code that tries to treat the Document as if it were a PDF by sending it directly to the output stream:
//This won't work
Response.Write(doc);
You will find many, many tutorials out there that do this and they are all wrong. Fortunately (or unfortunately), PDF is forgiving and allows junk data at the end, so only a handful of PDFs fail and people assume there was some other problem.
Your other problem is that you are missing a PdfWriter. If Document is the bridge, PdfWriter is the actual construction worker that puts that PDF together. It, however, is also not a PDF. Instead, it needs to be bound to a stream like a file, in-memory or the HttpResponse.OutputStream.
Below is some code that shows this off. I very strongly recommend separating your PDF logic from your ASPX logic. Do all of your PDF work first and get an actual "something" that represents a PDF, then do something with it.
At the beginning we declare a byte array that we'll fill in later. Next we create a System.IO.MemoryStream that will be used to write the PDF to. After creating the Document we then create a PdfWriter that's bound to the Document and our stream. Your internal code is the same and although I didn't test it it appears correct. Right before we're done with our MemoryStream we grab the active bytes into our byte array. Lastly we use the BinaryWrite() method to send our raw binary PDF to the requesting client.
//At the end of this bytes will hold a byte array representing an actual PDF file
Byte[] bytes;
//Create a simple in-memory stream
using (var ms = new MemoryStream()){
using (var doc = new Document()) {
//Create a new PdfWriter bound to our document and the stream
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
//This is unchanged from the OP's code
//Sample HTML
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.Append(@"<p>This is a test: <strong>مسندم</strong></p>");
//Path to our font
string arialuniTff = Server.MapPath("~/tradbdo.TTF");
//Register the font with iTextSharp
iTextSharp.text.FontFactory.Register(arialuniTff);
//Create a new stylesheet
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
//Set the default body font to our registered font's internal name
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.FACE, "Traditional Arabic Bold");
//Set the default encoding to support Unicode characters
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, BaseFont.IDENTITY_H);
//Parse our HTML using the stylesheet created above
List<IElement> list = HTMLWorker.ParseToList(new StringReader(stringBuilder.ToString()), ST);
//Loop through each element, don't bother wrapping in P tags
foreach (var element in list) {
doc.Add(element);
}
doc.Close();
}
}
//Right before closing the MemoryStream grab all of the active bytes
bytes = ms.ToArray();
}
//We now have a valid PDF and can do whatever we want with it
//In this case, use BinaryWrite to send it directly to the requesting client
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;filename=TestPage.pdf");
Response.BinaryWrite(bytes);
Response.End();
I use this code snippet to create an Excel sheet from a dataset (I parse an .rtf file and build a list of datasets), but I get an error on line 3 of the snippet:
Error: The type or namespace name 'WorkbookEngine' could not be found
(are you missing a using directive or an assembly reference?)
XmlDataDocument xmlDataDoc = new XmlDataDocument(DS);
XslCompiledTransform xt = new XslCompiledTransform();
StreamReader reader = new StreamReader(typeof(WorkbookEngine).Assembly.GetManifestResourceStream(typeof(WorkbookEngine), "ValidationReport1.xls"));
XmlTextReader xRdr = new XmlTextReader(reader);
xt.Load(xRdr, null, null);
StringWriter sw = new StringWriter();
xt.Transform(xmlDataDoc, null, sw, null);
StreamWriter myWriter = new StreamWriter(System.Windows.Forms.Application.StartupPath + "\\Reports\\ValidationReport1.xls");
myWriter.Write(sw.ToString());
myWriter.Close();
Your code has not imported the namespace that contains the WorkbookEngine type.
At the top of your source file you should have a list of using directives. Add the appropriate one for your type.