C# web scraping Javascript - c#

Suppose my website has source code:
<!DOCTYPE html>
<html>
<head>
<title>Wow</title>
</head>
<body>
<div id="hello">
</div>
<script type="text/javascript">
function simple()
{
$("#hello").append("<p>Hello</p>");
}
</script>
</body>
</html>
I want a C#/ASP.NET method to extract its source code as follows:
<!DOCTYPE html>
<html>
<head>
<title>Wow</title>
</head>
<body>
<div id="hello">
</div>
<p>Hello</p>
</body>
</html>
string src = new WebClient().DownloadString("http://mywebsite.com");
doesn't help, because it returns the raw HTML along with the JavaScript, exactly as in the page source.

From the comments above, I understand that you want the scripts to be executed before you read the final HTML DOM.
What you need is a JavaScript engine to run the scripts, which is the basic machinery of a browser.
Depending on your needs, you can either implement this yourself on top of V8 (Chrome's JavaScript engine) or SpiderMonkey (Mozilla's JavaScript engine), or use one of the two popular headless browser frameworks: PhantomJS and CasperJS. Using one of them will also cover any future AJAX needs you may have.

Related

Create a partial view for the imports of css and js asp.net core 2.2

This is a question about design. In my ASP.NET Core MVC app I have two layouts: my Default Layout and my Admin Layout, which are self-explanatory from their names. I import the same JS and CSS in both layouts, e.g. Bootstrap, jQuery, and some more. I wonder whether I should create a partial view that contains these imports.
There might be other solutions that I do not know about.
Any help is appreciated.
You can use nested layouts to build a hierarchy of layouts. My scenario is similar to yours: I have a _MasterLayout.cshtml in Shared with the full set of CSS and JS that I pull in for all pages. I then create a separate layout file for each section of the site. Each one references the master layout at the top, adds the markup specific to that section, and passes the other sections through to the master.
So you could have a MasterLayout like:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta http-equiv="content-security-policy" content="upgrade-insecure-requests" />
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
<link rel="stylesheet" href="~/css/site.css" asp-append-version="true" />
@RenderSection("Styles", required: false)
<title>@ViewData["Title"]</title>
</head>
<body>
<div class="container body-content">
@RenderBody()
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous"></script>
<script src="~/js/site.js" asp-append-version="true"></script>
@RenderSection("Scripts", required: false)
</body>
</html>
And a separate nested layout like the following:
@{
Layout = "_MasterLayout";
}
@RenderSection("Styles", required: false)
<nav class="navbar navbar-expand-md navbar-dark fixed-top bg-dark">
@* custom navbar for anonymous users *@
</nav>
<div class="container body-content">
@RenderBody()
</div>
@section Scripts {
@RenderSection("Scripts", required: false)
}
There's not a lot of documentation for nested layouts. Here's one other article I found describing the approach.
I can't say whether creating a partial for both layouts is a good idea, but I would recommend a small, little-known tag helper attribute for the script tag that you might find useful: asp-src-include.
<script asp-src-include="/assets/js/*.js"></script>
is rendered to HTML as:
<script src="/assets/js/jquery.js"></script>
<script src="/assets/js/bootstrap.js"></script>
<script src="/assets/js/custom.js"></script>
The same functionality is available on the link tag through asp-href-include.
I think this might tidy up your layouts a bit. You can find detailed posts about these tag helpers here and here.

Print web page to PDF in C#

For a school project I have to print web pages to PDF in C#.
I can do this manually by pressing Ctrl+P in Chrome and choosing "Save as PDF".
I want to do it from my program because I have to print 30,000 web pages.
An example of a web page I have to print is "http://prf.icecat.biz/?shopname=openIcecat-url&smi=product&vendor=HP&prod_id=D8G49AAE&lang=nl"
I'm new here, so if I didn't give enough information, please ask.
The page source also looks rather unusual to me:
<!DOCTYPE html>
<html>
<body>
<div id="light_wrapp">
<div id="light_title_description">
This is a demo of a seamless insert of an Icecat LIVE product data-sheet in your website. Imagine that this
responsive data-sheet is included in the product page of your webshop. How to integrate Icecat LIVE JavaScript.
</div>
</div>
<div id="loadLiveIcecat"></div>
<input type="hidden" id="liveIcecatData"
data-icecat_id=""
data-brand="HP"
data-part_code="D8G49AAE"
data-ean_upc=""
data-language_code="nl"
datasignature="
gn4vxXU3uI7JPKzL1RNoPAFGPRYdNmwmWJ5FeQj1J45Ba2qiaVYgoE%2FEXwpjaBSdKTrk%2F
%2ByI0AiT5CW1bk3r6SmHY%2FdUuQN3fss8h0se8w9mSw7FDr8SWvqoawB3m69lt6Ske4n%2F01IpEFO9Y
QEVpW9kFZRPWH%2Fuy0yuIbDdm4pd7%2BT9XbPFEk0fKGRH57Nkj5%2FvNEiMW1JhdzV86rJR5ME11j3P
0PJhBMGT1tm2AA0uiDILSNuOwnTWc2WVFEHzC4xr8Q591rPC%2B%2Bue230VowhLLmZTczheEGWrTCA
Wl%2B5Fj4qeLjLe3qTq1MxQaqSIUJXG5rz0BdR%2Fe8ZwkMNAgQ%3D%3D"
data-timestamp="1453663924"
data-shopname="openIcecat-url"
>
</body>
</html>
<link rel="stylesheet" type="text/css" href="/themes/basic/css/icecat/demo.css" >
<script type="text/javascript" src="https://live.icecat.biz/js/live-current.js"></script>
<script type="text/javascript" src="http://prf.icecat.biz/themes/basic/js/icecat/live/init.js"></script>
You can try ABCpdf, a third-party library (see the ABCpdf API documentation).
It makes converting a web page to PDF easy:
Doc theDoc = new Doc();
theDoc.AddImageUrl("http://www.google.com/");
theDoc.Save(Server.MapPath("htmlimport.pdf"));
theDoc.Clear();

use DateTimePicker on aspx

I have a Windows Forms project that uses DateTimePicker, and I have to bring those components into an ASP.NET .aspx web app.
So I'm trying to put a DateTimePicker on an ASP.NET web form, but I can't.
I'm not an expert with .aspx, so this may be a really simple question.
I only see Calendar in the toolbox, not DateTimePicker.
The one I want to use is
System.Windows.Forms.DateTimePicker
Is it possible to include it in an ASP.NET project?
You can use something like the following in the Calendar's click handler:
DateTime date = calendar.SelectedDate;
Here calendar is the ID of your Calendar control; SelectedDate is already a DateTime, so no conversion is needed.
Hope this helps.
Because it's a web application, I could just use jQuery UI:
http://jqueryui.com/datepicker/#inline
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>jQuery UI Datepicker - Display inline</title>
<link rel="stylesheet" href="//code.jquery.com/ui/1.11.3/themes/smoothness/jquery-ui.css">
<script src="//code.jquery.com/jquery-1.10.2.js"></script>
<script src="//code.jquery.com/ui/1.11.3/jquery-ui.js"></script>
<link rel="stylesheet" href="/resources/demos/style.css">
<script>
$(function() {
$( "#datepicker" ).datepicker();
});
</script>
</head>
<body>
<form runat="server" id="form1">
Date: <div id="datepicker"></div>
</form>
</body>
</html>
Here is a full API description http://api.jqueryui.com/datepicker/
Maybe this can help you as an alternative.
There is also an Ajax library for .NET that you can try: http://www.codeproject.com/Tips/407460/How-to-use-ASP-NET-AJAX-Calender-Extender

facebook comment plugin asp.net

I want to include the Facebook comment box in a Facebook application built in ASP.NET.
I have already tried the Facebook comments plugin, but somehow it does not work and gives all.js errors. Can anyone suggest a working sample?
For example, my current page looks like this:
<html xmlns:fb="http://www.facebook.com/2008/fbml">
<head runat="server">
<title></title>
<%-- <script src="http://static.ak.connect.facebook.com/js/api_lib/v0.4/FeatureLoader.js.php"
type="text/javascript" />--%>
<%-- <script type="text/javascript">
FB.init("163659850382295", "xdreciever.htm");
</script>--%>
</head>
<body>
<form id="form1" runat="server">
<div id="fb-root">
</div>
<div class="fb-comments" data-href="http://aspspider.ws/prashantkurlekar/likeTest.aspx"
data-num-posts="10" data-width="500">
</div>
</form>
</body>
</html>
It looks like you are using the original Comments box, which has been replaced by a newer version. I would view the current documentation and see if the errors disappear.

How to avoid multiple $(document).ready()

I work on an ASP.NET application that uses jQuery. jQuery is really powerful and I use it a lot.
In my master page I include all the libraries I use, plus a JS file (Main.js) that contains the jQuery code shared by the whole application (interface interactions). That file does its work inside $(document).ready( ... etc .. ).
Some more complex pages need additional jQuery code, so I add head content with another script tag. This is the problem: I have to add the $(document).ready() call again.
Doing it this way causes a lot of problems with my ASP.NET controls: the AutoPostBack controls no longer post back. I think the multiple $(document).ready() declarations are the cause, because when I remove the second one (in the page, not in the master page) the controls work.
So how can I add JavaScript code to a specific page without declaring $(document).ready() more than once? (I don't want to embed all the jQuery code in every page.)
I hope I'm clear enough. Thanks for your responses.
Edit: here is the code.
Master page part
<%@ Master Language="C#" AutoEventWireup="true" CodeBehind="Site.master.cs" Inherits="SiteMaster" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head runat="server">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<link href="~/Styles/Site.css" rel="stylesheet" type="text/css" />
<link href="Styles/jquery-ui-1.8.9.custom.css" rel="stylesheet" type="text/css" />
<link href="Styles/menu.css" rel="stylesheet" type="text/css" />
<script src="/js/jquery-1.4.4.min.js" type="text/javascript"></script>
<script src="/js/jquery-ui-1.8.7.custom.min.js" type="text/javascript"></script>
<script src="/js/jquery.cookie.js" type="text/javascript"></script>
<script src="/js/jquery.ui.datepicker-fr.js" type="text/javascript"></script>
<script src="/js/jquery.color.js" type="text/javascript"></script>
<script src="/js/Menu.js" type="text/javascript"></script>
<script src="/js/iphone-style-checkboxes.js" type="text/javascript"></script>
<script src="/js/jquery.tools.min.js" type="text/javascript"></script>
<script src="/js/Main.js" type="text/javascript"></script>
<asp:ContentPlaceHolder ID="HeadContent" runat="server">
</asp:ContentPlaceHolder>
</head>
<body>
Some content....
</body>
</html>
Main.js
$(document).ready(function () {
/// <reference path="jquery-1.4.4-vsdoc.js" />
//There is a lot of content here......
});
And A page
<%@ Page MasterPageFile="~/Site.master" Language="C#" AutoEventWireup="true" CodeBehind="Dep.aspx.cs" Inherits="Dep" %>
<asp:Content ID="HeadContent1" ContentPlaceHolderID="HeadContent" runat="server">
<link href="../../Styles/nyroModal.css" rel="stylesheet" type="text/css" />
<script src="../../js/jquery.nyroModal.custom.min.js" type="text/javascript"></script>
<script type="text/javascript">
$(document).ready(function () {
$('#tbxDateDebut').datepicker();
$('#tbxDateFin').datepicker();
$('.nyroModal').nyroModal();
});
</script>
</asp:Content>
<asp:Content ID="Content1" ContentPlaceHolderID="MainContent" runat="server">
//Here comes the controls... (lot of code)
</asp:Content>
main.js
$(document).ready(function () {
//There is a lot of content here......
if ($.pageReady) $.pageReady($);
});
page.js
<script type="text/javascript">
$.pageReady = function() {
// fired on DOM ready
}
</script>
The answer is in your question:
But in some pages, who are more complex I need to use some other jquery code.. So I add some head Content with other script tag.. And this the problem, I have to add the $(document).ready() instruction again.
"But in some pages..."
Unfortunately, you have to add a reference to every JS file that deals with elements not shared across pages. That is, if page 1 has a <div id="element1"> and page 2 has a <div id="element676">, you would not want to put the jQuery handling for both into Main.js. In fact, that could cause errors on a page where those elements do not exist.
Damn, you guys are quick... as I was writing, @Raynos gave the correct answer.
Well, an alternative is to move all of your jQuery document-ready code to the bottom of the page:
<body>
... here goes your other html ....
<script>
//$(document).ready(function(){//this is not needed
... here goes the first ready...
//});//this is not needed
//$(document).ready(function(){//this is not needed
... here goes the second ready ... and so on...
//});//this is not needed
</script>
</body>
So in effect your code runs when all the other elements are already in place, and document.ready is not needed :)
I'm inclined to think that there's something at work here other than your multiple $(document).ready() calls. AFAIK, multiple calls against $(document).ready() (and even bubbled-up versions, like $('#someid').ready()) shouldn't cause issues as they are all aggregated together silently by jQuery for you.
I'd double-check my syntax for starters, and make sure that all your blocks are properly encapsulated, all statements end with a ;, and all that jazz.
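The aggregation of several ready registrations can be illustrated with a small stand-in for a ready queue. This is a simplified sketch, not jQuery's actual implementation; createReadyQueue, ready, and fire are hypothetical names used only for illustration, and fire() stands in for the browser signalling that the DOM is ready.

```javascript
// Minimal stand-in for a DOM-ready queue (illustrative only; not jQuery's source).
// createReadyQueue, ready, and fire are hypothetical names.
function createReadyQueue() {
  const callbacks = [];
  let fired = false;

  return {
    // Register a handler; if the "DOM" is already ready, run it immediately.
    ready(fn) {
      if (fired) {
        fn();
      } else {
        callbacks.push(fn);
      }
    },
    // Simulate the DOM becoming ready: run every queued handler in registration order.
    fire() {
      if (fired) return;
      fired = true;
      while (callbacks.length > 0) {
        callbacks.shift()();
      }
    },
  };
}

// Two independent registrations, as with Main.js and a page-level script block.
const queue = createReadyQueue();
const log = [];
queue.ready(() => log.push("Main.js handler"));
queue.ready(() => log.push("page handler"));
queue.fire();
console.log(log.join(", "));
```

In this sketch both handlers run, in the order they were registered. Note also that an exception thrown by one handler stops the handlers queued after it, so a script error in one block is worth ruling out before blaming the second registration.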
