Scrape a javascript-generated website in C# without installing a browser - c#

I am developing a website crawler API to scrape a JavaScript-generated website. The site we are crawling requires JavaScript to be enabled in order to fully render the HTML. I have tried several solutions such as HtmlAgilityPack and AngleSharp, but they are only HTML parsers and cannot render the page because they do not execute JavaScript.
I tried implementing a headless browser using Selenium.WebDriver.ChromeDriver, and it worked very well on my local machine. However, our production environment is very restricted: only Internet Explorer is available, and we are not allowed to install any other browser. So ChromeDriver did not work there either. Internet Explorer cannot even fully render the website when I open it in the browser directly, so IE is definitely out.
Is there a way to scrape a javascript-generated website without having to install a browser? Like implementing a headless browser on a server without that browser installed?
Or is this a dead end? Thanks!

You can try using a solution that uses a fully-functional built-in Chromium and doesn't require installing Google Chrome in the target environment. All the required Chromium binaries will be shipped with the solution.
There are many such solutions for .NET and C#:
CefSharp
An open source .NET wrapper around the Chromium Embedded Framework (CEF). It allows you to embed Chromium in .NET apps.
Supported by the community. If you need help using the library, read the docs or ask the community. If you need a feature or a bug fix, you will probably have to implement it yourself.
DotNetBrowser
A commercial library that allows integrating a Chromium-based browser with your .NET app to display and process HTML5, CSS3, JavaScript, etc.
It's a proprietary solution supported by a commercial company. If you need help using the library, read the docs or get help from the product's engineers. If you need a feature or a bug fix, the product team will do it as soon as possible. I know that because I know the engineers on the DotNetBrowser team.
WebView2
This control allows you to embed web technologies (HTML, CSS, and JavaScript) in your native apps. The WebView2 control uses Microsoft Edge (Chromium) as the rendering engine to display the web content in native apps. With WebView2, you can embed web code in different parts of your native app, or build all of the native app within a single WebView instance. Supported by Microsoft.
Note that WebView2 depends on the WebView2 Runtime. To avoid installing anything on the target machine, you would have to ship the Fixed Version runtime together with your app.
If you need help, you should contact the WebView2 team.
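As an illustration of the embedded-Chromium approach, here is a minimal sketch of headless scraping with CefSharp. It assumes a console app that references a recent CefSharp.OffScreen NuGet package, so the Chromium binaries are copied to the output folder with the app. The class name, the URL, and the two-second pause are placeholders, and real code would likely wait for a specific element instead of sleeping.

```csharp
using System;
using System.Threading.Tasks;
using CefSharp;
using CefSharp.OffScreen;

public static class RenderedPageScraper
{
    // Loads the page in an off-screen Chromium instance and returns the
    // current DOM serialized as HTML, i.e. after the page's scripts have run.
    public static async Task<string> GetRenderedHtmlAsync(string url)
    {
        using (var browser = new ChromiumWebBrowser(url))
        {
            var load = await browser.WaitForInitialLoadAsync();
            if (!load.Success)
            {
                throw new InvalidOperationException(
                    $"Load failed: {load.ErrorCode}, HTTP status {load.HttpStatusCode}");
            }

            // Assumption: a fixed pause is enough for the page's scripts to
            // finish building the DOM. Adjust or replace for the real site.
            await Task.Delay(TimeSpan.FromSeconds(2));

            JavascriptResponse response =
                await browser.EvaluateScriptAsync("document.documentElement.outerHTML");
            if (!response.Success)
            {
                throw new InvalidOperationException(response.Message);
            }

            return (string)response.Result;
        }
    }

    public static void Main()
    {
        // Initialize and shut down CEF on the same (main) thread.
        Cef.Initialize(new CefSettings());
        try
        {
            // Placeholder URL; substitute the site you need to scrape.
            string html = GetRenderedHtmlAsync("https://example.com/")
                .GetAwaiter().GetResult();
            Console.WriteLine(html.Length);
        }
        finally
        {
            Cef.Shutdown();
        }
    }
}
```

The returned string can then be passed to an HTML parser such as HtmlAgilityPack or AngleSharp, since at that point the markup already contains the script-generated content.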

Related

How do I embed Developer Tools into GeckoFx web browser?

I'm building a developer-themed browser in a C# .NET Windows Form Application and want to allow users to use Chrome or Firefox DevTools to edit/debug the current page they are viewing. I have found several repositories online but none of them seem to be what I want.
Example Chrome DevTools
Example Firefox DevTools
The project uses Geckofx60.64 to create an embedded Gecko web browser. I have already tried debugger.html on GitHub but that didn't help me. All I need is a simple way to show website developer tools for any link.
If this is not possible, are there some other developer tools that I might be able to use with this project?

How do I update WebBrowser for my C# Application?

When I try to open Google Maps in my C# application, I get this error:
You seem to be using an unsupported browser. Old browsers can put your
security at risk, are slow and don't work with newer Google Maps
features. To access Google Maps, you'll need to update to a modern
browser.
How can I upgrade the browser used by my application, or otherwise get rid of this error?
Take a look at the approaches for upgrading IE in WebBrowser in this thread: "Use latest version of IE in WebBrowser control".
Alternatively, you can go for Chromium-based browser controls, to avoid compatibility issues. For example, here is a tutorial on embedding Google Maps using DotNetBrowser: Embed Google Maps In .NET Desktop Application.
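For the first option, the commonly used approach is the FEATURE_BROWSER_EMULATION registry setting, which tells the WebBrowser control to use a newer IE document mode for a given executable. The sketch below assumes Windows, IE 11 installed on the machine, and a hypothetical helper class named BrowserEmulation; it does not guarantee that a particular site will accept the resulting engine.

```csharp
using Microsoft.Win32;

public static class BrowserEmulation
{
    // Per-user feature control key read by the WebBrowser control.
    public const string FeatureKeyPath =
        @"Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION";

    // 11001 = IE 11 edge mode, regardless of the page's DOCTYPE.
    public const int Ie11EdgeMode = 11001;

    // Registers the given executable name (for example "MyApp.exe", a
    // placeholder) so that WebBrowser controls hosted by it use the mode.
    public static void Enable(string exeName, int mode = Ie11EdgeMode)
    {
        using (RegistryKey key = Registry.CurrentUser.CreateSubKey(FeatureKeyPath))
        {
            key.SetValue(exeName, mode, RegistryValueKind.DWord);
        }
    }
}
```

Call `BrowserEmulation.Enable(...)` with the file name of your own executable before the first WebBrowser control is created, since the setting is read when the control is initialized.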

Custom C# Web Browser cannot support javascript

I'm creating a web browser that automatically loads specific web pages. The problem is that the browser I created using C# in Visual Studio won't run JavaScript: it only loads the HTML and does not support scripts. Can anyone help me add JavaScript support to my custom web browser?
WebKit .NET is the best port of the powerful WebKit browser engine to the .NET Framework.
It has nice and easy tutorials and properties and methods to customize.
It has JavaScript activated by default.
http://webkitdotnet.sourceforge.net/ is the official website.
http://webkitdotnet.sourceforge.net/basics.php is where you can find the basic tutorial
The WebBrowser control is really crappy. I'm assuming you're trying to scrape some kind of website. For this, use HttpWebRequest instead.
If you're trying to create your own web browser: don't, or use WebKit or Gecko instead.
If you're using the WebBrowser control, you will have to enable JavaScript in your Internet Explorer settings, because the WebBrowser control is Internet Explorer, or at least its engine. IE has local JavaScript disabled by default, so this could be your problem. As user #user3855678 said, I would also recommend using WebKit or a similar engine.

Uploading files using AJAX and JQuery in an ASP.NET application

I have several input type files in my asp.net Web Form.
How can I upload files to Server using Jquery, AJAX and C#?
The uploaders are generated programmatically so I cannot upload the files using code behind.
Also, many files must be uploaded at once.
Is there any way I can read the file with jQuery, send it to the server via AJAX, and upload it there?
Thanks
Have a look at Fine Uploader. It does not use flash or java. In fact, it does not have any required dependencies. An optional jQuery plug-in is provided, if you use jQuery though.
Support: IE 7-10, Chrome, Firefox, Safari (OS X), as well as Android tablets and phones, along with iOS 6 tablets and phones (iPhone & iPad). The Microsoft Surface tablet has also been tested.
There are many features to choose from. Have a look at the demos and, more importantly, the docs and associated blog posts for more details.
Furthermore, there are many server-side examples that may be helpful when integrating this library into your app. See the server directory in the GitHub project. ASP.NET is one of the many examples.

Is there a way to display HTML page that is not dependent on IE?

My client WPF application needs to display an HTML page. I understand that the webbrowser control uses the version of IE that is installed on the box.
Is there a control to render HTML that can be totally embedded into my application so that it is not dependent on the version of IE, that the user has installed?
What would happen if a user is using IE6?
Thanks
You could always use WebKit .NET instead. It allows you to embed a WebKit browser in your .NET application without having to have extra software installed on the machine.
There is also geckofx if you would rather go the Mozilla Gecko route.
