wkhtmltopdf Header, Image & Pagenumbers - c#

I want to add a header to my PDF with this:
--header-center TEST
and it works fine, but if I want to include whitespace:
--header-center TEST test
it won't be displayed. Do I have to write something instead of " "?
Another question is how to insert page numbers into the footer. I found this code snippet, but I'm new to this and have no idea how to implement it:
var pdfInfo = {};
var x = document.location.search.substring(1).split('&');
for (var i in x) { var z = x[i].split('=',2); pdfInfo[z[0]] = unescape(z[1]); }
function getPdfInfo() {
var page = pdfInfo.page || 1;
var pageCount = pdfInfo.topage || 1;
document.getElementById('pdfkit_page_current').textContent = page;
document.getElementById('pdfkit_page_count').textContent = pageCount;
}
And my last question is how to insert images into the footer with --header-html ~\image.html.
I passed a path to a simple HTML file containing a picture, but it won't be displayed.
I know... many questions. This topic is very tricky for me.
Thanks in advance!
Regards, FG

As in my comment, the whitespace in the text header should work if you surround it in quotes, e.g. --header-center "TEST test"
Okay, so I played around and found how to get the page numbers and image to work. Your header.html should look something like (notice how the image URL is the absolute path) :
<html>
<head>
<script type="text/javascript">
var pdfInfo = {};
var x = document.location.search.substring(1).split('&');
for (var i in x) { var z = x[i].split('=',2); pdfInfo[z[0]] = unescape(z[1]); }
function getPdfInfo() {
var page = pdfInfo.page || 1;
var pageCount = pdfInfo.topage || 1;
document.getElementById('pdfkit_page_current').textContent = page;
document.getElementById('pdfkit_page_count').textContent = pageCount;
}
</script>
</head>
<body onload="getPdfInfo()">
<img src="/var/sites/mysite/htdocs/images/logo.jpg" />
<br />Page <span id="pdfkit_page_current"></span> Of <span id="pdfkit_page_count"></span>
</body>
</html>
Then generate the PDF with something like wkhtmltopdf --margin-top 40mm --header-html /var/sites/mysite/pdf/header.html content.html output.pdf
You'll have to play with --margin-top to get the right spacing. The same procedure should work for footers as well.
My source for this was http://metaskills.net/2011/03/20/pdfkit-overview-and-advanced-usage/ (PDFkit is a wrapper for wkhtmltopdf)
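For anyone wondering what the script in header.html is doing: wkhtmltopdf appends values such as page and topage to the header URL as a query string, and the script just reads them back. Here is a minimal sketch of that parsing step as standalone functions. The names parsePdfInfo and pageLabel are my own (hypothetical), and it uses decodeURIComponent in place of the deprecated unescape.

```javascript
// Sketch only: parsePdfInfo and pageLabel are hypothetical helper names,
// not part of wkhtmltopdf or PDFKit.

// Turn a query string such as "?page=3&topage=12" into a plain object.
function parsePdfInfo(search) {
  var info = {};
  var query = search.charAt(0) === '?' ? search.substring(1) : search;
  if (query === '') {
    return info;
  }
  var pairs = query.split('&');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    var key = eq === -1 ? pairs[i] : pairs[i].substring(0, eq);
    var value = eq === -1 ? '' : pairs[i].substring(eq + 1);
    info[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return info;
}

// Build the text shown in the header, falling back to 1 when a value is missing.
function pageLabel(info) {
  var page = info.page || 1;
  var pageCount = info.topage || 1;
  return 'Page ' + page + ' of ' + pageCount;
}
```

Inside the header page you would call parsePdfInfo(document.location.search) and write the result of pageLabel into an element of your choice.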

Related

Html Agility Pack - Select Divs inside Div

fairly new to the HTML Agility Pack. I've been searching and trying many examples but didn't get to a conclusion yet.. must be doing something wrong.. hope you can assist me.
My goal is to parse the latest news from a website, including image, title and date - pretty simple. I managed to get the image (background attribute) from the div but the divs are nested and for some reason I can't access their values. Here is my code
using System;
using HtmlAgilityPack;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
var html = @"https://pristontale.eu/";
HtmlWeb web = new HtmlWeb();
var doc = web.Load(html);
var news = doc.DocumentNode.SelectNodes("//div[contains(@class,'index-article-wrapper')]");
foreach (var item in news){
var image = Regex.Match(item.GetAttributeValue("style", ""), @"(?<=url\()(.*)(?=\))").Groups[1].Value;
var title = item.SelectSingleNode("//div[@class='article-title']").InnerText;
var date = item.SelectSingleNode("//div[@class='article-date']").InnerText;
Console.WriteLine(image, title, date);
}
}
}
This is what the HTML looks like
<div class="index-article-wrapper" onclick="location.href='article.php?id=2';" style="background-image: url(https://cdn.discordapp.com/attachments/765749063621935104/884439050562461696/1_1.png)">
<div class="meta-wrapper">
<div class="article-date">5 Sep, 2021</div>
<div class="article-title">Server merge v1.264 update</div>
</div>
</div>
Currently it correctly grabs all 4 news articles, but only the image. How do I get the title and date of each? I have a fiddle here https://dotnetfiddle.net/BVcAmH
I appreciate the help.
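I just realized that the only flaw hiding the output was the Console.WriteLine call. When several arguments are passed, the first one is treated as a format string, and since it contains no {0} placeholders, only image was printed.
Wrong
Console.WriteLine(image, title, date);
Correct
Console.WriteLine(image + " " + title + " " + date);
One more thing to watch: an XPath that starts with // searches the whole document even when it is called on item, so use .//div[@class='article-title'] (with a leading dot) to stay inside the current article.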

CefSharp Executescriptasync Return Loop Value

I'm trying to pull specific tag values from a page I created as an experiment. I'm new to CefSharp and am experimenting to improve my skills. I have been stuck on EvaluateScriptAsync for about two days.
I am trying to capture the values of the buttons on the page I prepared. I run the following code by pressing a button. My page has 3 buttons with the same tag, but only one of them is printed.
<input type="button" id ="button1" value="First Button">
<input type="button" id ="button2" value="Second Button">
<input type="button" id ="button3" value="Third Button">
These are the buttons I'm trying to find.
string script = @"(function() { " +
"var button = document.querySelectorAll('input[type = \"button\"]'); " +
"if(button != null) {for (var i = 0; i < button.length; i++) { return button[i].value; " +
"}}else{alert('not found!');}" +
"})();";
chrome.EvaluateScriptAsync(script).ContinueWith(a =>{
var response = a.Result;
if (response.Success && response.Result != null)
{
string print = (string)response.Result;
MessageBox.Show(print.ToString());
}
}, TaskScheduler.FromCurrentSynchronizationContext());
I have tried many things. I think I'm making a mistake in the JavaScript part. I've read most of the similar topics, but I could not find a solution.
output : First Button
This worked for me. The EvaluateScriptAsync function returns a single result, so I made sure to convert the results in JavaScript to a JSON string.
Then when you retrieve the result back in C# land, you can then use JSON to convert it back to an object (in this case a list of strings) and perform any operations you need on the data.
// Step 01: Generate a HTML page
var htmlPage = @"
<html>
<body>
<p>Hello!!</p>
<input type='button' id='button1' value='First Button'>
<input type='button' id='button2' value='Second Button'>
<input type='button' id='button3' value='Third Button'>
</body>
</html>";
// Step 02: Load the Page
m_chromeBrowser.LoadHtml(htmlPage, "http://customrendering/");
// Step 03: Get list of buttons on page from C# land
var jsScript = @"
// define a temp function to retrieve button text
function tempFunction() {
var result = [];
var list = document.querySelectorAll('input[type=button]');
for(var i = 0, len = list.length; i < len; i++) {
result.push(list[i].value);
}
// Important: convert object to json string before returning to C#
return JSON.stringify(result);
}
// Now execute the temp function and returns result back to C#
tempFunction();";
var task = m_chromeBrowser.EvaluateScriptAsync(jsScript);
task.ContinueWith(t =>
{
if (!t.IsFaulted)
{
var response = t.Result;
if (response.Success == true)
{
// Use JSON.net to convert to object;
MessageBox.Show(response.Result.ToString());
}
}
}, TaskScheduler.FromCurrentSynchronizationContext());
Looking at your JavaScript code sample, the problem is that your loop contains a return statement, which exits the function and returns the first button value it comes across.
If you want to interact with the resulting list in C# land you will need to convert it back from a JSON string. Just go to nuget and install the 'Newtonsoft.Json' package into your project.
Then you can write something like:
// C# land
var list = new List<string>();
list = JsonConvert.DeserializeObject<List<string>>(response.Result.ToString());
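To make the difference visible without a browser, here is a small sketch that runs both versions of the loop against a plain array standing in for the querySelectorAll result. The function names and the fakeButtons array are made up for illustration.

```javascript
// Sketch only: firstValueOnly, allValuesAsJson and fakeButtons are
// hypothetical names; the array stands in for a NodeList of inputs.

// Mirrors the question's loop: the return exits on the first iteration.
function firstValueOnly(buttons) {
  for (var i = 0; i < buttons.length; i++) {
    return buttons[i].value;
  }
  return null;
}

// Mirrors the answer's loop: collect every value, then return once.
function allValuesAsJson(buttons) {
  var result = [];
  for (var i = 0; i < buttons.length; i++) {
    result.push(buttons[i].value);
  }
  return JSON.stringify(result);
}

var fakeButtons = [
  { value: 'First Button' },
  { value: 'Second Button' },
  { value: 'Third Button' }
];
console.log(firstValueOnly(fakeButtons));
console.log(allValuesAsJson(fakeButtons));
```

The second function returns one JSON string containing all three values, which is the shape the C# side then deserializes.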

HtmlAgilityPack search url link

I am creating a Windows Forms application for a group of friends, and I'm using HtmlAgilityPack in it.
I need to find all versions of the TacO add-on, like this:
<li><a href='https://www.dropbox.com/s/nks140nf794tx77/GW2TacO_034r.zip?dl=0'>Download Build 034.1866r</a></li>
Additionally, I need to check the latest version for downloading the file with the url as in the code below:
public static bool Tacoisuptodate(string Version)
{
// Load HtmlDocuments
var doc = new HtmlWeb().Load("http://www.gw2taco.com/");
var body = doc.DocumentNode.SelectNodes("//body").Single();
// Narrow the document down to the part that interests us
//SelectNodes("//div"))
foreach (var node in doc.DocumentNode.SelectNodes("//div"))
{
// Check for null value
var classeValue = node.Attributes["class"]?.Value;
var idValue = node.Attributes["id"]?.Value;
var hrefValue = node.Attributes["href"]?.Value;
// We search <div class="widget LinkList" id="LinkList1" into home page >
if (classeValue == "widget LinkList" && idValue == "LinkList1")
{
foreach(HtmlNode content in node.SelectNodes("//li"))
{
Debug.Write(content.GetAttributeValue("href", false));
}
}
}
return false;
}
If somebody could help me, I would really appreciate it.
A single xpath is enough.
var xpath = "//h2[text()='Downloads']/following-sibling::div[@class='widget-content']/ul/li/a";
var doc = new HtmlAgilityPack.HtmlWeb().Load("http://www.gw2taco.com/");
var downloads = doc.DocumentNode.SelectNodes(xpath)
.Select(li => new
{
href = li.Attributes["href"].Value,
name = li.InnerText
})
.ToList();

ASP.NET Generated Output Not Affected By JQuery/Javascript

I have the code below, which dynamically generates a directory tree in HTML list format. When I try to manipulate the list items with JavaScript to add a '+' to the end of an item, it doesn't work. I know the jQuery is correct, because I have used it on another page on the same server. Is jQuery not able to manipulate markup that is generated server-side with ASP.NET?
<script language="C#" runat="server">
string output;
protected void Page_Load(object sender, EventArgs e) {
getDirectoryTree(Request.QueryString["path"]);
itemWrapper.InnerHtml = output;
}
private void getDirectoryTree(string dirPath) {
try {
System.IO.DirectoryInfo rootDirectory = new System.IO.DirectoryInfo(dirPath);
foreach (System.IO.DirectoryInfo subDirectory in rootDirectory.GetDirectories()) {
output = output + "<ul><li>" + subDirectory.Name + "</li>";
getDirectoryTree(subDirectory.FullName);
if (subDirectory.GetFiles().Length != 0) {
output = output + "<ul>";
foreach (System.IO.FileInfo file in subDirectory.GetFiles()) {
output = output + "<li><a href='" + file.FullName + "'>" + file.Name + "</a></li>";
}
}
output = output + "</ul>";
}
} catch (System.UnauthorizedAccessException) {
//This throws when we don't have access, do nothing and move on.
}
}
</script>
I then try to manipulate the output with the following:
<script language="javascript">
$('li > ul').not('li > ul > li > ul').prev().append('+');
</script>
Just an FYI the code for the div is below:
<div id="itemWrapper" runat="server">
</div>
Have you tried execute your JS after the page loads?
Something like this ...
$(function(){
$('li > ul').not('li > ul > li > ul').prev().append('+');
});
It looks like you have a couple of problems here. First you should put your jQuery code inside of $(document).ready. That ensures that the DOM has fully loaded before you try to mess with it. Secondly, your selector is looking for ul elements that are direct children of li elements. Your code does not generate any such HTML. You have li's inside of ul's but not the other way around. Also, if your directory has files in it, you are going to leave some ul elements unclosed which will mess up your HTML and Javascript.
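To illustrate the nesting point, here is a rough sketch of markup generation that puts each child ul inside its parent li and closes every ul it opens. It is written in JavaScript only to keep it short; the same logic applies to the C# method. The tree shape ({ name, dirs, files }) and the function names are hypothetical, and the directory name is wrapped in a span so that .prev() has an element to select.

```javascript
// Sketch only: renderTree, countMatches and the tree shape are hypothetical.
// Names are assumed to contain no HTML special characters.

// Render one directory as an <li>, with any children in a nested <ul>.
function renderTree(dir) {
  var html = '<li><span>' + dir.name + '</span>';
  if (dir.dirs.length > 0 || dir.files.length > 0) {
    html += '<ul>';
    for (var i = 0; i < dir.dirs.length; i++) {
      html += renderTree(dir.dirs[i]);
    }
    for (var j = 0; j < dir.files.length; j++) {
      html += '<li>' + dir.files[j] + '</li>';
    }
    html += '</ul>'; // every opened <ul> is closed here
  }
  return html + '</li>';
}

// Count occurrences of a token, handy for checking tag balance.
function countMatches(text, token) {
  return text.split(token).length - 1;
}

var tree = {
  name: 'root',
  dirs: [{ name: 'sub', dirs: [], files: ['a.txt'] }],
  files: []
};
var markup = '<ul>' + renderTree(tree) + '</ul>';
console.log(markup);
console.log(countMatches(markup, '<ul>') === countMatches(markup, '</ul>'));
```

With output shaped like this, the li > ul selector has something to match once the script runs after the DOM is ready.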

JavaScript document.write Chrome

I've got a simple spam killer I'm trying to put together, but the text is not showing up on my form.
The javascript is:
<script language="javascript" type="text/javascript">
document.write("SPAM Killer: What is " + GetDateMonth() + " + " + GetDateDay() + "?")
</script>
In my .js file, I have these two functions defined:
function GetDateMonth() {
return date1.getMonth() + 1;
}
function GetDateDay() {
return date1.getDay() + 1;
}
The text shows up under IE8, but not under Chrome.
As a bonus: My OnClick method of my Submit form has this bit of code that is incorrectly adding my month and date:
string spamError = "The SPAM Killer answer was incorrect. ";
char[] split = spamTest.ToCharArray();
for (int i = 0; i < split.Length; i++) {
if (char.IsLetter(split[i])) {
Ok = false;
txtMessage.Text = spamError + "Non-numeric data entered.";
return;
}
}
int nTestValue = Convert.ToInt32(spamTest, 10);
if (nTestValue < 1) {
Ok = false;
txtMessage.Text = spamError + "Negative or zero value found.";
}
DateTime dt = DateTime.Now;
int month = dt.Month;
int day = dt.Day;
int nCorrect = month + day;
if (nCorrect != nTestValue) {
Ok = false;
txtMessage.Text = spamError + string.Format("Expected {0}; Received {1}.", nCorrect, nTestValue);
return;
}
Using IE8, I see the following:
SPAM Killer: What is 2 + 3?
I enter 5, click Send, and get Expected 17; Received 5.
Don't reinvent the wheel, help read books with http://www.google.com/recaptcha
For C# code see http://code.google.com/apis/recaptcha/docs/aspnet.html
If you're adamant on sticking with your code, think about the problems around midnight, and users in other timezones. Also, a bot can very easily answer your anti-bot question, it would take me 45 seconds to code support for that, if I wrote bots.
If you're still adamant, you shouldn't use document.write anymore (not since 2002); instead, use the DOM to insert the text into an element selected by its ID, as described in the question "Change label text using Javascript".
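A minimal sketch of the DOM approach, assuming an element with a hypothetical ID such as spamQuestion exists on the page. It also uses getDate() (day of the month) where the question's GetDateDay uses getDay() (day of the week, 0 to 6). Since the server code adds dt.Month and dt.Day, that difference is a likely cause of the "Expected 17; Received 5" mismatch.

```javascript
// Sketch only: buildSpamQuestion, showSpamQuestion and the element ID
// 'spamQuestion' are hypothetical names.

// Build the question text from a Date.
function buildSpamQuestion(date) {
  var month = date.getMonth() + 1; // getMonth() is zero-based
  var day = date.getDate();        // day of the month, 1 to 31
  return 'SPAM Killer: What is ' + month + ' + ' + day + '?';
}

// Write the question into an element instead of calling document.write.
function showSpamQuestion(element, date) {
  element.textContent = buildSpamQuestion(date);
}
```

In the page you would call showSpamQuestion(document.getElementById('spamQuestion'), new Date()) once the DOM has loaded. The midnight and timezone concerns mentioned above still apply, because the browser and the server each read their own clock.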
The answer, it seems, was in using the document.write() function with appending strings.
I redesigned my HTML to be more like this:
<table>
<tr>
<td colspan="2">
<b>[Human Check]</b><br />
Enter the text to the left and below exactly as it appears:
</td>
</tr>
<tr>
<td>
<script language="javascript" type="text/javascript">
document.write(GetSpamText())
</script>
</td>
</tr>
</table>
@serverfault: Thanks for your suggestion about the date property, though. That would have been a problem.
The text returned by GetSpamText() can be static or coded to create a random value (another topic).
