Generate Screenshots from URL using ASP.NET and C#

I want to generate screenshots of a website from its URL.
I want to build this with ASP.NET and C#, and I don't want to use any of the available tools and APIs (Url2Png, Wesnappr, Awesomium, etc.).
Which ASP.NET and C# classes should I explore for this? How should I get started?
Can someone please guide me on this?

Looks like a fun project to do by hand...
Read the W3C specifications for HTML and CSS (4 and 5 for HTML; 1, 2, and 3 for CSS)
Implement your own HTML engine
Read the ECMA specification to learn the inner workings of JavaScript. Also don't forget to check the specific implementations in the browser(s) most popular with your users.
Implement your own JavaScript engine
Tie the HTML and JavaScript engines together
Now that you have a way to safely render HTML on the server, the rest is easy: get your engines to render the page into a bitmap (you may also need to implement a custom graphics library) and you are done.
More seriously: use existing tools (make sure they are OK to use on a server; for example, I would not do it with the IE engine). Or, if you want to learn some particular part of the stack, scope the rest down (for example, just render the title of the page to a bitmap using System.Drawing) to see how the components work together.
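As a rough illustration of the scoped-down exercise suggested above, here is a minimal sketch that pulls the title out of an HTML string and draws it into a PNG with System.Drawing. The class and method names (TitleRenderer, ExtractTitle, RenderToPng), the font, and the image size are assumptions made for the example, and the regex is only a crude stand-in for real HTML parsing. Note that System.Drawing is not officially supported inside ASP.NET services and is Windows-only on recent .NET versions.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper: not a browser engine, just "title text -> bitmap".
static class TitleRenderer
{
    // Crude title extraction; a real HTML parser would be more robust.
    public static string ExtractTitle(string html)
    {
        var match = Regex.Match(html, @"<title[^>]*>(.*?)</title>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return match.Success
            ? WebUtility.HtmlDecode(match.Groups[1].Value.Trim())
            : string.Empty;
    }

    // Draws the title in black on a white canvas and saves it as a PNG file.
    public static void RenderToPng(string title, string path, int width = 800, int height = 100)
    {
        using (var bitmap = new Bitmap(width, height))
        using (var graphics = Graphics.FromImage(bitmap))
        using (var font = new Font("Arial", 20))
        {
            graphics.Clear(Color.White);
            graphics.DrawString(title, font, Brushes.Black, new PointF(10, 30));
            bitmap.Save(path, ImageFormat.Png);
        }
    }
}
```

The HTML itself could come from something like `new WebClient().DownloadString(url)`, after which `TitleRenderer.RenderToPng(TitleRenderer.ExtractTitle(html), "title.png")` would produce the image.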

Related

DISQUS comments in asp.net webforms

Hi, I want to use the DISQUS comment system, but I couldn't find any resources, examples, or source code for implementing it in C# ASP.NET WebForms. I have found projects on CodePlex and CodeProject, but the code seems to be for MVC, which I haven't used. Where can I find DISQUS implementation code for ASP.NET C# WebForms (not MVC)?
In most cases implementing Disqus into a website is really easy, since you're not actually building all the markup. At the minimum, you just need to add the Universal Code on the proper page templates, which links to the embed javascript file and the "disqus_thread" DIV.
At that point to make a "complete" integration, you just need to output some javascript configuration variables (using a unique identifier, URL, and title for each thread) and maybe the comment counting script and that should be it.
The only webforms-specific examples you might possibly need are how to output the article's unique ID, title and URL variables onto the page. So if you have a good idea of how to do that, you shouldn't need an existing integration.
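To make the last point concrete, here is a small sketch of a helper that builds the configuration script from server-side values. The variable names disqus_identifier, disqus_url, and disqus_title follow the classic Disqus embed configuration; the helper class DisqusConfig and its BuildScript method are hypothetical names for this example, and you should check the current Disqus documentation for the exact variables your embed code expects.

```csharp
using System.Text;
using System.Web;

// Hypothetical helper that emits the per-thread Disqus configuration variables.
static class DisqusConfig
{
    public static string BuildScript(string identifier, string url, string title)
    {
        var script = new StringBuilder();
        script.AppendLine("<script type=\"text/javascript\">");
        // JavaScriptStringEncode(value, true) escapes the value and wraps it in double quotes.
        script.AppendLine("var disqus_identifier = " + HttpUtility.JavaScriptStringEncode(identifier, true) + ";");
        script.AppendLine("var disqus_url = " + HttpUtility.JavaScriptStringEncode(url, true) + ";");
        script.AppendLine("var disqus_title = " + HttpUtility.JavaScriptStringEncode(title, true) + ";");
        script.AppendLine("</script>");
        return script.ToString();
    }
}
```

In a WebForms code-behind, the result could be assigned to a Literal control placed just above the Universal Code, for example `DisqusVars.Text = DisqusConfig.BuildScript(articleId, Request.Url.AbsoluteUri, articleTitle);`, where DisqusVars, articleId, and articleTitle are placeholders for your own page members.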

How to capture javascript HTTP redirection in c#?

I am working on a utility that tracks redirections. HTTP redirections are handled pretty well by the following C# classes:
HttpWebRequest
HttpWebResponse
WebHeaderCollection (Location field)
The next step is to include JavaScript redirections in a URL's journey to the final page.
How can I capture this type of redirection in C#?
Are there any other types of redirection that should be taken into consideration?
If you use an embedded JavaScript engine and create data models that mock up some of the more commonly accessed aspects of the DOM and the JavaScript APIs/prototypes, then you could load the page, execute any and all JavaScript code, and have your window.location property setter fire an event when it gets set. Then just follow that URL as normal. This allows you to handle computed values as well as your standard
window.location = "/home";
There is no shortage of embedded JavaScript engines for C#. Here are just a few that I find to be really good:
Javascript.Net - Uses Google's V8 engine. Really easy to integrate in an application. Only downside-ish is keeping an unmanaged DLL with your application.
https://github.com/JavascriptNet/Javascript.Net
Jint (Javascript Interpreter for .Net) - Really good. Fully managed code. Again, easy to integrate within an application.
http://jint.codeplex.com/
The real key here is mocking up what is normally created by a browser.
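A minimal sketch of that mocking idea using Jint might look like the following. FakeWindow and JsRedirectProbe are hypothetical names invented for this example, and the mock only covers direct assignments to window.location; real pages also use location.href, location.replace(), document.location, timers, and much more, all of which would need their own mocks.

```csharp
using System.Collections.Generic;
using Jint;

// Hypothetical stand-in for the browser's window object.
// The lowercase property name matches what scripts assign to.
public class FakeWindow
{
    private string _location = string.Empty;

    // Every value a script assigns to window.location, in order.
    public List<string> Redirects { get; } = new List<string>();

    public string location
    {
        get { return _location; }
        set { _location = value; Redirects.Add(value); }
    }
}

static class JsRedirectProbe
{
    // Runs the script against the mock and returns the captured redirect targets.
    public static List<string> Run(string script)
    {
        var window = new FakeWindow();
        var engine = new Engine();
        engine.SetValue("window", window); // expose the CLR object to the script
        engine.Execute(script);
        return window.Redirects;
    }
}
```

Because the script is actually executed, a computed target such as `var p = '/ho' + 'me'; window.location = p;` is captured the same way as a literal one.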
Your only practical option is to use a full-blown browser to do that. You can use the WebBrowser control for Windows Forms, automate a browser (directly for IE, or through something like Selenium or WebAII), or embed another browser engine (I think WebKit has C# bindings...). Using a full-blown browser engine will also take care of whatever other redirection mechanisms are out there.
The alternative would be to implement at least a partial HTML DOM and JavaScript engine, which is definitely an interesting project...
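For the plain HTTP part of the journey that the question mentions, a sketch along these lines records each hop by turning off automatic redirects and reading the Location header. RedirectTracer, NextHop, Trace, and the hop limit of 10 are assumptions made for the example; error handling (GetResponse throws a WebException on 4xx/5xx responses) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

static class RedirectTracer
{
    // Returns the next URL for a 3xx response with a Location header, or null otherwise.
    // Relative Location values are resolved against the current URL.
    public static Uri NextHop(Uri current, int statusCode, string location)
    {
        bool isRedirect = statusCode >= 300 && statusCode < 400;
        if (!isRedirect || string.IsNullOrEmpty(location)) return null;
        return new Uri(current, location);
    }

    // Follows HTTP redirects one hop at a time and returns every URL visited.
    public static List<Uri> Trace(Uri start, int maxHops = 10)
    {
        var chain = new List<Uri> { start };
        var current = start;
        for (int hop = 0; hop < maxHops; hop++)
        {
            var request = (HttpWebRequest)WebRequest.Create(current);
            request.AllowAutoRedirect = false; // we want to see each hop ourselves
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                var next = NextHop(current, (int)response.StatusCode,
                    response.Headers[HttpResponseHeader.Location]);
                if (next == null) break;
                chain.Add(next);
                current = next;
            }
        }
        return chain;
    }
}
```

Any JavaScript-driven hop found by a browser or script engine could then be appended to the same chain and traced further from there.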

Clone intranet site, but replace content pane with own content

I'm working at a small company within a rather large company, where I don't really have control over our intranet. I have built a little site/page, and I want it to style exactly like the intranet pages.
I know I can download the stylesheets and start hacking away, but I need the links and the menus to be up to date.
I'm working with asp.net mvc 2 here, but I've no idea how to go further from here. Thoughts?
You will need to copy the CSS etc.
For the menu, you will need to do the following:
Use WebRequest to fetch the current page, use Html Agility Pack to parse it, and use XPath to extract the relevant data. I recommend caching the result.
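The parsing step could be sketched roughly as follows with Html Agility Pack. MenuScraper and ExtractLinks are hypothetical names, and the XPath you pass in depends entirely on the intranet's actual markup, which is not known here.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

static class MenuScraper
{
    // Returns (link text, href) pairs for every node matched by the XPath expression.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html, string xpath)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var links = new List<KeyValuePair<string, string>>();
        var nodes = document.DocumentNode.SelectNodes(xpath);
        if (nodes == null) return links; // SelectNodes returns null when nothing matches

        foreach (var anchor in nodes)
        {
            links.Add(new KeyValuePair<string, string>(
                anchor.InnerText.Trim(),
                anchor.GetAttributeValue("href", string.Empty)));
        }
        return links;
    }
}
```

A call might look like `MenuScraper.ExtractLinks(html, "//div[@id='menu']//a[@href]")`, where the `menu` id is only a placeholder. Storing the returned list in a cache with an expiry keeps the intranet from being hit on every request.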

Creating a Chatbox with AJAX, HTML and C#?

I am using the Nancy Web Framework in my C# Console Application to basically create a Web Administration panel for my software. I have opted to use the Spark View Engine, as it is basically just HTML. I basically want to create a chatbox, except pull the data written to my application's console every X seconds and display it in a box instead.
I have very little experience with JQuery and AJAX, but they aren't overly complicated from the examples I have seen. The issue I am running into is that ALL of the chatbox and shoutbox examples use PHP.
I basically just need something like this...
The only difference is I need to pull the information from my application instead. I can use basic C# methods inside of the HTML (and probably inside of javascript but I haven't tried this). What would be the best way to do this, and are there any examples floating around that don't use PHP?
This was completed using AJAX and JSON.
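The answer gives no code, so the following is only a guess at what an AJAX and JSON setup might look like on the C# side. ConsoleLog and LogModule are hypothetical names, the /log/{since} route is an assumption, and the route syntax shown is the Nancy 1.x indexer style (Nancy 2.x uses `Get("/path", ...)` instead).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Nancy;

// Hypothetical in-memory copy of everything written to the console.
public static class ConsoleLog
{
    private static readonly object Gate = new object();
    private static readonly List<string> Lines = new List<string>();

    // Use this instead of Console.WriteLine so the web panel can see the line too.
    public static void Write(string line)
    {
        lock (Gate) { Lines.Add(line); }
        Console.WriteLine(line);
    }

    // Returns the lines added at or after the given index.
    public static string[] Since(int index)
    {
        lock (Gate) { return Lines.Skip(index).ToArray(); }
    }
}

// Nancy module that serves the new lines as a JSON array.
public class LogModule : NancyModule
{
    public LogModule()
    {
        Get["/log/{since:int}"] = p => Response.AsJson(ConsoleLog.Since((int)p.since));
    }
}
```

On the page, a jQuery timer could call `$.getJSON("/log/" + count, ...)` every X seconds, append the returned lines to the box, and increase `count` by the number of lines received.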
Well, using HTML for styling inside a desktop program is just not wise, since the desktop has much better UI engines. But for your information, here is a nice jQuery shoutbox tutorial. You only need to handle the data input and output with C#, so I actually see no problems. The engine you are using should have some kind of data stream or request handler (bla://program/???)

Web page crawling in C#

I have been given the task of crawling, parsing, and indexing the available books on many library web pages. I usually use HTML Agility Pack and C# to parse web site content. One of the sites is the following:
http://bibliotek.kristianstad.se/pls/bookit/pkg_www_misc.print_index?in_language_id=en_GB
If you search for a * (all books) it will return many lists of books, paginated by 10 books per page.
Typical web crawlers that I have found fail on this website. I have also tried to write my own crawler, which would go through all links on the page and generate POST/GET variables to dynamically generate results. I haven't been able to do this either, mostly due to some 404 errors that I get (although I am certain that the generated links are correct).
The site relies on javascript to generate content, and uses a mixed mode of GET and POST variable submission.
I'm going out on a limb, but try observing the JavaScript GETs and POSTs with Fiddler and then you can base your crawling off of those requests. Fiddler has FiddlerCore, which you can put in your own C# project. Using this, you could monitor requests made in the WebBrowser control and then save them for crawling or whatever, later.
Going down the C# JavaScript interpreter route sounds like the 'more correct' way of doing this, but I wager it will be much harder and fraught with errors and bugs unless you have the simplest of cases.
Good luck.
FWIW, the C# WebBrowser control is very, very slow. It also doesn't support more than two simultaneous requests.
Using SHDocVw is faster, but is also semaphore limited.
Faster still is using MSHTML. Working code here: https://svn.arachnode.net/svn/arachnodenet/trunk/Renderer/HtmlRenderer.cs Username/Password: Public (doesn't have the request/rendering limitations that the other two have when run out of process...)
This is headless, so none of the controls are rendered. (Faster).
Thanks,
Mike
If you use the WebBrowser control in a Windows Forms application to open the page then you should be able to access the DOM through the HtmlDocument. That would work for the HTML links.
As for the links that are generated through Javascript, you might look at the ObjectForScripting property which should allow you to interface with the HTML page through Javascript. The rest then becomes a Javascript problem, but it should (in theory) be solvable. I haven't tried this so I can't say.
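As a rough sketch of the first part (reading plain HTML links through HtmlDocument), the following hosts a WebBrowser control on its own STA thread and collects the href of every link once the page has loaded. BrowserLinks and Collect are hypothetical names; this only runs on Windows with Windows Forms available, it does not handle timeouts, and links created by scripts after the load event may be missed.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Windows.Forms;

static class BrowserLinks
{
    // Loads the URL in a WebBrowser control and returns the href of each link in the DOM.
    public static List<string> Collect(string url)
    {
        var links = new List<string>();
        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser { ScriptErrorsSuppressed = true })
            {
                browser.DocumentCompleted += (sender, e) =>
                {
                    // DocumentCompleted also fires for frames; wait for the top-level document.
                    if (e.Url != browser.Url) return;
                    foreach (HtmlElement link in browser.Document.Links)
                        links.Add(link.GetAttribute("href"));
                    Application.ExitThread(); // stop the message loop started below
                };
                browser.Navigate(url);
                Application.Run(); // the control needs a message loop to load the page
            }
        });
        thread.SetApartmentState(ApartmentState.STA); // WebBrowser requires an STA thread
        thread.Start();
        thread.Join();
        return links;
    }
}
```

For script-generated links, the same DocumentCompleted handler is where an ObjectForScripting callback or a delayed second read of `browser.Document.Links` could be tried.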
If the site generates content with JavaScript, then you are out of luck. You need a full JavaScript engine usable in C# so that you can actually execute the scripts and capture the output they generate.
Take a look at this question: Embedding JavaScript engine into .NET -- but know that it will take "serious" effort to do what you need.
AbotX does JavaScript rendering for you. It's not free, though.
