Create HTML Document and place table in it using C# (Windows Application) - c#

I want to create a HTML document and create table init using C#. I don't want to use ASP or any thing like that. I want to do this by using C# Windows Application.
The created document should not use MS Word or may not depend on any other app and save it to any folder (C:\) etc. It is totally independent of any other MS product and can run in any PC

Something like this :)
String exportdirectory = "c:";
StreamWriter sw;
sw = File.CreateText(exportDirectory + "filename.html");
sw.WriteLine("<table>");
sw.WriteLine("<tr>");
sw.WriteLine("<td>");
sw.WriteLine("contents of table!");
sw.WriteLine("</td>");
sw.WriteLine("</tr>");
sw.WriteLine("</table>");
sw.Close();

This is just creating a string ( maybe using a StringBuilder ) and appending data, then save it with a StreamWriter. If you want something more versatile, try to use some sort of stringtemplate, http://www.antlr.org/wiki/display/ST/StringTemplate+Documentation for example, this could allow you to isolate the "view" portion of your application allowing some sort of configuration to easily drive the output generation.

Well its not clear why you have such requirement or why are u mentioning MS Word... it doesn't make sense...
But what you want is your programme to Create a HTMl File.. (you dont even need ASP.NEt for that.. its for web server programming...)
My best guess is you want something to emit data in HTML format according to some user requirements.
For that..
Simply just create a file with HTML Extension and start Emitting HTML specific tags to it.
Few guys have already mentioned how to do that.
A quick and Dirty way would be somthing like this...
public void CreateFile()
{
StringBuilder sb = new StringBuilder();
sb.Append("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd \"> \r\n ");
sb.Append("<html xmlns=\"http://www.w3.org/1999/xhtml\">\r\n");
sb.Append("<head>\r\n");
sb.Append("<meta content=\"text/html; charset=utf-8\" http-equiv=\"Content-Type\" />\r\n");
sb.Append("<title>My HTMl Page</title>\r\n");
sb.Append("</head>\r\n");
sb.Append("<body>\r\n");
sb.Append("<table>\r\n");
//If you are emiting some data like a list of items
foreach (var item in Items)
{
sb.Append("<tr>\r\n");
sb.Append("<td>\r\n");
sb.Append(item.ToString());
sb.Append("</td>\r\n");
sb.Append("</tr>\r\n");
}
sb.Append("</table>\r\n");
sb.Append("</body>\r\n");
sb.Append("</html>\r\n");
System.IO.StreamWriter sr = new System.IO.StreamWriter("Your file path .html");
sr.Write(sb.ToString());
}
There is also a System.Web.UI.HTML32TextWriter class that is specially ment for this purpose but you didn't wanted anything from ASP, so...

Related

scraping data from website with a C# console application

I'm trying to learn Spanish and making some flash cards (for my personal use) to help me learn the verbs.
Here is an example, page example. So near the top of the page you will see the past participle: bloqueado & gerund: bloqueando. It is these two values that I wish to obtain in my code and use for my flash cards.
If this is possible I will use a C# console application. I am aware that scraping data from a website is not ideal however this is a once off.
Any guidance on how to start something like this and pitfalls to avoid would be very helpful!
I know this isn't an exact answer, but here is the process I would suggest.
https://www.gnu.org/software/wget/ and mirror the website to a
folder. Wget is a web spider and will follow the links on the site until it has downloaded everything. You'll have to run it with a few different parameters until you figure out the correct settings you want.
Use C# to run through each file in the folder and extract the
words from <section class="verb-mood-section"> in each file. It's your choosing of whether you want to output them to the console or store them in a database or flat file.
Should be that easy, in theory.
Use SGMLReader. SGMLReader is a versatile and robust component that will stream HTML to an XMLReader:
XmlDocument FromHtml(TextReader reader) {
// setup SgmlReader
Sgml.SgmlReader sgmlReader = new Sgml.SgmlReader();
sgmlReader.DocType = "HTML";
sgmlReader.WhitespaceHandling = WhitespaceHandling.All;
sgmlReader.CaseFolding = Sgml.CaseFolding.ToLower;
sgmlReader.InputStream = reader;
// create document
XmlDocument doc = new XmlDocument();
doc.PreserveWhitespace = true;
doc.XmlResolver = null;
doc.Load(sgmlReader);
return doc;
}
You can see that you need to create a TextReader first. TThis would in reality be a StreamReader as a TextReader is an abstract class.
Then you create the XMLDocument over that. Once you've got it into the XMLDocument you can use the various methods supported by XMLDocument to isolate and extract the nodes you need. I'll leave you to explore that aspect of it.
You might try using the XDocument class as it's a lot easier to handle than the XMLDocument, especially if you're a newbie. It also supports LINQ.

C# - create and write HTML file with variables

Basiclly I'm trying to create an HTML, I already have it written but I want the user to be able to put some text on the textboxes and saving it into strings and use later when creating the HTML file.
I tried playing abit with StreamWriter but I don't think that will be the best idea.
Also I want it to open on the default web browser , or just on IE if it's easier after the file is created.
I really need help as I'm struggling especially with the creating part.
Thanks for reading!
You can also do this without external libraries.
Set up your HTML file as follows:
<!DOCTYPE html>
<html>
<header>
<title>{MY_TITLE}</title>
</header>
<body></body>
</html>
Then edit and save the HTML from C#:
const string fileName = "Foobar.html";
//Read HTML from file
var content = File.ReadAllText(fileName);
//Replace all values in the HTML
content = content.Replace("{MY_TITLE}", titleTextBox.Text);
//Write new HTML string to file
File.WriteAllText(fileName, content);
//Show it in the default application for handling .html files
Process.Start(fileName);
If you already have the HTML you want to export (just not customized), you could manually add format strings to it (like {0}, {1}, {2}) where you want to substitute text from your app, then embed it as a resource, load it in at runtime, substitute the TextBox text using string.Format, and finally write it out again. This is admittedly a really fragile way to do it, as you need to make sure the number of parameters agrees between the resource file and your call to string.Format. In fact, this is a horrible way to do it. Actually, you should do it the way #EmilePels suggests, which is basically a less fragile version of this answer.

XSL + XML -> PDF for C#

I know several people asked questions like this, but no answer helped to solve my problem.
Well, I have xsl and xml and want to generate pdf with a processor like Apache.FOP.
I am not able to use any JAVA programms like that. Just able to use C# libraries / exe.
I tried to use nFop:
Version 1.x uses Java.io and..
Version 2.0 doesn't have the ability to set XsltSettings
My current Software uses XSL + XML -> HTML (using standard Stystm.Xml.Xsl on C#) and wktmltopdf to generate PDF from created HTML.
But tables got split when they are too long for the page, and on the next page you don't have any column headers (this is very important for my problem).
I think there are no Free FO-Processor for pure C
Have a look at FoNET.
public static bool XMLToPDF(string pXmlFile, string pXslFile, string pFoFile, string pPdfFile)
{
string lBaseDir = System.IO.Path.GetDirectoryName(pXslFile);
XslCompiledTransform lXslt = new XslCompiledTransform();
lXslt.Load(pXslFile);
lXslt.Transform(pXmlFile, pFoFile);
FileStream lFileInputStreamFo = new FileStream(pFoFile, FileMode.Open);
FileStream lFileOutputStreamPDF = new FileStream(pPdfFile, FileMode.Create);
FonetDriver lDriver = FonetDriver.Make();
lDriver.BaseDirectory = new DirectoryInfo(lBaseDir);
lDriver.CloseOnExit = true;
lDriver.Render(lFileInputStreamFo, lFileOutputStreamPDF);
lFileInputStreamFo.Close();
lFileOutputStreamPDF.Close();
return System.IO.File.Exists(pPdfFile);
}

Using the OpenXml SDK 2.0 to insert tables in a word document

I am just starting out with the OpenXML SDK 2.0 in Visual Studio 2010 (C#). I have automated office programs before using COM automation, which was painful.
I have a template made by one of our graphic designers, which will provide the foundation for my reports. In order to automate the simple things (plaintext items) I have added content controls to the template and bound a custom XML part to the doc. The content controls are as follows:
DayCount
AlternateJobTitle
Date
SignatureName
After making a copy of the template, I then edit the content controls and save the file with the following code:
//stand up object that reads the Word doc package
using (WordprocessingDocument doc = WordprocessingDocument.Open(docOutputPath, true))
{
//create XML string matching custom XML part
string newXml = "<root>" +
"<DayCount>42</DayCount>" +
"<AlternateJobTitle>Supervisor</AlternateJobTitle>" +
"<Date>9/24/2012</Date>" +
"<SignatureName>John Doe</SignatureName>" +
"</root>";
MainDocumentPart main = doc.MainDocumentPart;
main.DeleteParts<CustomXmlPart>(main.CustomXmlParts);
//add and write new XML part
CustomXmlPart customXml = main.AddCustomXmlPart(CustomXmlPartType.CustomXml);
using (StreamWriter ts = new StreamWriter(customXml.GetStream()))
{
ts.Write(newXml);
}
}
This all works well. However, my document is not made up solely of standard text and plaintext updates. The real meat of the report is in a number of tables that need to be added to each report as well. I have been searching like crazy for a good description on how this is done, but have really not found anything. Is there some way to delineate where to place a table using the same content control logic used for plaintext controls? Any code samples I have found of creating a table using OpenXML have just assumed that you want to append it to the end of the main document part. I would like to specify where the tables need to go in the template, generate the tables and place them in the specified regions of the template. Is this possible?
Any help is greatly appreciated.
There are a lot of OpenXml creation questions. But if you decide to take this path - answer is general - examine OpenXml Productivity Tool. At my PC it could be found at "C:\Program Files (x86)\Open XML SDK\V2.0\tool\OpenXmlSdkTool.exe". Just create in MsWord document which you want to create using OpenXml and reflect document's code using this tool. Good luck!
If you need to display tabled data, so far, the best thing I found is Word Document Generator at http://worddocgenerator.codeplex.com/.

Need Alternative to EO.Pdf for Converting HTML to PDF in C#, wkhtmltopdf?

I am creating a HTML catalog of movies, and then converting it to PDF. I was using EO.Pdf, and it worked for my small test sample. However, when I run it against the entire list of movies, the resulting HTML file is nearly 8000 lines, and 7MB. EO.Pdf times out when attempting to convert it. I believe it is a limitation of the free version, as I can copy the entire HTML and paste it into their online demo and it works.
I am looking for an alternative to use. I am not good with command line, or running external programs, so I would prefer something I can add to the .NET library and use easily. I will admit that the use of EO.Pdf was easy, once I added the dll to the libarary and added the namespace, it took one line of code to convert either the HTML Code, or the HTML file into a PDF. The downsides I ran into were that they had a stamp on every page (in 16pt font) with their website on it. It also wouldn't pick up half of my images, not sure why. I used a relative URL in the HTML file to the images, and I created the PDF in the same dir as the HTML file.
I do not want to re-create the layout in a PDF, so I think something like iTextSharp is out. I've read a bit about something called like wkhtmltopdf or something strange like that. It sounded good, but needed a wrapper, and I have no clue how to accomplish that, or use it.
I would appreciate suggestions with basic instructions how to use them. Either a library and a couple lines on how to use it. Or if you can tell me how to setup/use the wkhtmltopdf I would be extremely greatful!
Thanks in advance!
I'm using wkhtmltopdf and I'm very happy with it. One way of using it:
Process p = new Process();
p.StartInfo.CreateNoWindow = true;
p.StartInfo.UseShellExecute = false;
p.StartInfo.RedirectStandardOutput = true;
p.StartInfo.FileName = "wkhtmltopdf.exe";
p.StartInfo.Arguments = "-O landscape <<URL>> -";
p.Start();
and then you can get a stream:
p.StandardOutput.BaseStream
I used this because I needed a stream, of course you can invoke it differently.
Here is also a discussion about invoking wkhtmltopdf
I just saw, that someone is implementing a c# wrapper for wkhtmltopdf. I haven't tested it, but may be worth a look.
After much searching I decided to use HiQPdf. It was simple to use, fast enough for my needs and the price point was acceptable to me.
var converter = new HiQPdf.HtmlToPdf();
converter.Document.PageSize = PdfPageSize.Letter;
converter.Document.PageOrientation = PdfPageOrientation.Portrait;
converter.Document.Margins = new PdfMargins(15); // Unit = Points
converter.ConvertHtmlToFile(htmlText, null, fileName);
It even includes a free version if you can keep it to 3 pages.
http://www.hiqpdf.com/free-html-to-pdf-converter.aspx
And no, I am in no way affiliated with them.
I recommend ExpertPdf.
ExpertPdf Html To Pdf Converter is very easy to use and it supports the latest html5/css3. You can either convert an entire url to pdf:
using ExpertPdf.HtmlToPdf;
byte[] pdfBytes = new PdfConverter().GetPdfBytesFromUrl(url);
or a html string:
using ExpertPdf.HtmlToPdf;
byte[] pdfBytes = new PdfConverter().GetPdfBytesFromHtmlString(html, baseUrl);
You also have the alternative to directly save the generated pdf document to a Stream of file on the disk.

Categories

Resources