ASPX pages render fast with FF & Chrome but slow on IE - c#

I have a page with a few big tables. Loading this page or triggering an event on it is fast enough in Chrome, but when I run it in IE7 the page is slow.
Sometimes when I click a button it takes a few seconds before anything happens, instead of the instant response I get in Chrome or FF.
I Googled around a bit for a solution to this problem and tried an HTML validator. If I save the page as HTML and feed it to the validator I get 1K+ errors, most of them tags that are not closed.
If I check the ASP code, which is very limited because all the markup is generated dynamically from objects (I didn't write my own HTML), all my tags are closed and I don't get a single warning or error in Visual Studio.
On this page I use jQuery and some custom JavaScript (nothing too complex).
All my data comes from SQL Server. If I run all the queries at once it still takes less than one second, and I'm pretty sure the queries are written as well as they can be.
Any idea how I can make the website faster in IE?
(Unfortunately 90% of the users have IE7)

I would recommend installing the YSlow plugin for Firefox and checking what score it gives your site and what recommendations it makes for optimizing it.
Also, you should know that IE 6-8 is extremely slow at compiling JavaScript and at DOM manipulation. The crudest way I know of identifying JavaScript slowdowns is to simply comment out JavaScript functions on your page, one by one, until the site starts loading fast. Then you work on optimizing whichever function turned out to be slow.

Without seeing any code, it's hard to say why these performance issues arise. One thing I can think of is how jQuery works in IE7.
Simply put, when you use a selector in jQuery (like $(".some-class")), jQuery uses the native document.querySelectorAll function, which queries the DOM using CSS selectors (unless you're using jQuery-specific selectors like :animated). However, IE7 has no implementation of querySelectorAll, which forces jQuery to search the DOM in a much more iterative way. I'm not entirely sure exactly how that works, but I'm sure one can find out at sizzlejs.org
Now if you have a very large HTML document in IE7 and you are, for instance, attaching events to each row in your table like $(".some-class-that-marks-as-clickable").click(...), jQuery has to look up all of those rows before it can apply the handler. If this is the case, it can easily be remedied by using the onclick attribute on each clickable element instead.
Of course, since you have not posted any code I cannot guarantee that this is your problem. I only know I had that exact problem a few years back, which caused IE7 to render the page in ~45 seconds, while Firefox did it in less than one second.

Related

Execute script with HtmlAgilityPack [duplicate]

I'm trying to scrape a particular webpage which works as follows.
First the page loads, then it runs some sort of javascript to fetch the data it needs to populate the page. I'm interested in that data.
If I GET the page with HtmlAgilityPack, the script doesn't run, so I get what is essentially a mostly blank page.
Is there a way to force it to run a script, so I can get the data?
You are getting what the server is returning - the same as a web browser. A web browser, of course, then runs the scripts. Html Agility Pack is an HTML parser only - it has no way to interpret the javascript or bind it to its internal representation of the document. If you wanted to run the script you would need a web browser. The perfect answer to your problem would be a complete "headless" web browser. That is something that incorporates an HTML parser, a javascript interpreter, and a model that simulates the browser DOM, all working together. Basically, that's a web browser, except without the rendering part of it. At this time there isn't such a thing that works entirely within the .NET environment.
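To make that distinction concrete, here is a minimal sketch of fetching a page with Html Agility Pack (assuming the HtmlAgilityPack NuGet package; the URL and XPath are illustrative). It shows exactly what the parser gives you: the HTML as the server returned it, before any script has run.

    using System;
    using HtmlAgilityPack;

    class StaticScrape
    {
        static void Main()
        {
            var web = new HtmlWeb();
            // Hypothetical URL of a page whose content is filled in by JavaScript after load.
            HtmlDocument doc = web.Load("http://example.com/page-populated-by-script");

            // Nodes that JavaScript would have added later are simply absent here.
            var cells = doc.DocumentNode.SelectNodes("//table//td");
            if (cells == null)
            {
                Console.WriteLine("No table cells in the raw HTML - the data is added by script.");
                return;
            }
            foreach (HtmlNode cell in cells)
            {
                Console.WriteLine(cell.InnerText);
            }
        }
    }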
Your best bet is to use a WebBrowser control and actually load and run the page in Internet Explorer under programmatic control. This won't be fast or pretty, but it will do what you need to do.
Also see my answer to a similar question: Load a DOM and Execute javascript, server side, with .Net which discusses the available technology in .NET to do this. Most of the pieces exist right now but just aren't quite there yet or haven't been integrated in the right way, unfortunately.
You can use Awesomium for this, http://www.awesomium.com/. It works fairly well but has no x64 support and is not thread-safe. I'm using it to scan some web sites 24x7 and it runs fine for at least a couple of days at a time, but then it usually crashes.

Low Bandwidth Website Design

A little while back, one of the junior developers at our company was tasked with creating a website for users to enter timesheets offsite. Mostly this is used for staff that reside offshore and have limited bandwidth (it's satellite internet, so we're already looking at a 500ms - 600ms response time, typically with only 10KB/s or less, including 10% - 20% intermittent packet loss).
So it's a challenging situation...
Recently I've been tasked with helping the junior improve the speed and functionality of the website, mostly for my own benefit, since I'm usually a desktop dev. One thing I've noticed is that the website uses a MultiView, and I'm wondering if that's the best approach. I can see the reasoning: download the entire website once, then just make queries back and forth, showing/hiding the various views as necessary. Except it doesn't seem to work as smoothly as that.
95% of operations require a round trip to the server; e.g. to add a new timesheet you need to tell the server, which in turn creates a new entry in the database. When the server is done, the client seems to download the entire webpage again, which is obviously counterproductive.
So my questions are as follows:
Is this the expected behaviour, given the above situation? i.e. should the entire webpage be re-downloaded once the server has completed its actions?
If so, is this the best approach for the situation? Would it be better to have smaller, individual pages for the various features (timesheets/leave/etc.)?
I know this is probably a bit opinion based, but any ideas or assistance is greatly appreciated; for both our benefits.
Going from memory, MultiView only renders one of the views, not all of them, but since you mention the MultiView, that tells me you are using the older WebForms technology, which often carries a large amount of overhead saving and restoring state. You can try to optimize that, especially if you are using some kind of grid control.
A better approach may be to ditch WebForms and switch to a newer technology like MVC. Rewrite the application to use AJAX with a web service that returns JSON whenever possible, to reduce the amount of data that needs to be sent to and from the server. Using MVC will also reduce the number of resources required for a page load (no resource.axd, etc.), which will help page load times, especially over high-latency links.
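As a hedged illustration of that idea (the controller name, route, and hard-coded rows are mine, not from the answer), an MVC action that returns only the JSON the client needs might look like this:

    using System.Web.Mvc;

    public class TimesheetController : Controller
    {
        // Returns just the timesheet rows for one employee as JSON; the payload shape is
        // illustrative. There is no ViewState and no full-page markup in the response.
        public JsonResult List(int employeeId)
        {
            var rows = new[]
            {
                new { Id = 1, Date = "2014-01-06", Hours = 7.5 },
                new { Id = 2, Date = "2014-01-07", Hours = 8.0 }
            };
            return Json(rows, JsonRequestBehavior.AllowGet);
        }
    }

The client then calls a URL like /Timesheet/List?employeeId=42 with a small AJAX GET and renders the rows itself, so only a few hundred bytes cross the satellite link.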
Make sure the server is set to compress dynamic pages with GZIP.
Compress and minify your javascript and CSS.
Don't use inline styles (the style attribute) in your HTML; use classes or IDs plus child selectors, to reduce HTML size.
Bundle all your JavaScript and CSS (a bundling sketch follows this list).
Sprite your images in CSS where possible.
Run your images through a good image optimizer like http://kraken.io
Make sure you are caching whatever you can, and the cache duration is set properly.
Minify your HTML.
Stop using WebForms (or watch your page state and control state very closely).
Check into some of the SPA architectures out there -- you may be able to make the whole application "offline-able" with the exception of the calls to get/update/create data.
Ultimately, each page should only require 1 HTML file, 1 CSS file, 1 Javascript file, and 1 sprite sheet on the first page hit, and then every page after that should only require a single HTML file.
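For the bundling point above, here is a hedged sketch using the System.Web.Optimization package (the bundle names and file paths are illustrative); RegisterBundles is called once from Application_Start:

    using System.Web.Optimization;

    public class BundleConfig
    {
        public static void RegisterBundles(BundleCollection bundles)
        {
            // One request for all scripts, one for all styles, both minified.
            bundles.Add(new ScriptBundle("~/bundles/site").Include(
                "~/Scripts/jquery-{version}.js",
                "~/Scripts/site.js"));

            bundles.Add(new StyleBundle("~/Content/css").Include(
                "~/Content/site.css"));

            // Bundle and minify even when compilation debug="true".
            BundleTable.EnableOptimizations = true;
        }
    }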
You might also want to look into using a client side library like angular or knockout to handle rendering views. This can reduce the amount of traffic that needs to be sent (although it likely will increase the number of requests by one).
I think the best bet is a SPA (Single Page App) with AngularJS. Done right, it greatly reduces the number of HTTP requests. Navigation does not cause an entire page reload in any case. JavaScript files, CSS files, etc. are loaded just once, at app load time. Once the app is loaded in the browser, the traffic is mainly JSON being sent back and forth.
There are some tricks you should apply to reduce app load time:
Bundle javascript files into just one minified javascript file.
Bundle css files into just one css file.
Leverage the HTTP cache. You can use file versioning combined with a max-age header, so the browser does not even ask the server whether the file has changed (a sketch follows this list).
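A hedged sketch of that file-versioning idea (the helper class is mine, not from the answer): emit asset URLs whose query string changes whenever the file changes, so a far-future max-age header stays safe.

    using System.IO;
    using System.Web;
    using System.Web.Hosting;

    public static class StaticUrl
    {
        // Usage in markup (illustrative): <script src="<%= StaticUrl.Versioned("~/Scripts/app.js") %>"></script>
        public static string Versioned(string virtualPath)
        {
            string physicalPath = HostingEnvironment.MapPath(virtualPath);
            long stamp = File.GetLastWriteTimeUtc(physicalPath).Ticks;
            return VirtualPathUtility.ToAbsolute(virtualPath) + "?v=" + stamp;
        }
    }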
Some tools to help:
Fiddler, look at what is being cached and what isn't.
Facebook's Augmented Traffic Control, to simulate a slow, lossy connection while testing.
As I understand it, AJAX would be the best choice for you. If you go to the server for 95% of operations and reload the whole page with the new elements each time, performance will suffer.
So instead of doing that, do partial reloads with AJAX/jQuery. jQuery has plenty of functionality that uses AJAX to reload a specific portion of the webpage instead of the whole page, and that would increase performance a lot.
One more thing I would like to add: the response coming back from the server might be a huge chunk. So instead of sending the server's response as-is, enable GZIP compression on the website. It compresses the response so the page loads/reloads much faster.
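If IIS-level dynamic compression isn't available to you, a minimal sketch of wiring GZIP up yourself in Global.asax looks roughly like this (turning compression on in IIS is usually the simpler route):

    using System;
    using System.IO.Compression;
    using System.Web;

    public class Global : HttpApplication
    {
        protected void Application_BeginRequest(object sender, EventArgs e)
        {
            HttpContext context = HttpContext.Current;
            string acceptEncoding = context.Request.Headers["Accept-Encoding"] ?? string.Empty;

            if (acceptEncoding.Contains("gzip"))
            {
                // Compress the response stream and tell the browser how it is encoded.
                context.Response.Filter = new GZipStream(context.Response.Filter, CompressionMode.Compress);
                context.Response.AppendHeader("Content-Encoding", "gzip");
            }
        }
    }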
Other than that, put your CSS and JS in .css and .js files instead of inside the page itself (and reuse them across as many pages as possible). The browser will cache those files and reuse them instead of downloading them every time it connects to the server.
I believe you have already figured out what's wrong. No, MultiView is not good if it is implemented as-is, without tweaks. If your website uses ViewState and on top of that you have the MultiView, it is going to be a costly affair.
Here are your options.
To get the most out of the existing code, I would recommend converting your methods into HTTP GET/POST methods which can then be called individually from the relevant actions in the HTML (a sketch follows this list).
Don't re-render the entire page; render only the content that changes for each menu action.
Change the non-changing parts of your page/site to static content and apply compression to the static content.
Enable page caching.
Cache the data offline wherever possible. (Remember this comes with the overhead of syncing data.)
If you are considering a revamp, give a thought to the HTML5 offline features.
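For the first option above, here is a hedged sketch of exposing a page method that AJAX can call directly (the method name and parameters are illustrative, and persistence is elided), so only the new row's values travel over the wire instead of a full postback:

    using System.Web.Services;
    using System.Web.UI;

    public partial class Timesheets : Page
    {
        [WebMethod]
        public static int AddTimesheet(string employee, string date, double hours)
        {
            // Persistence is deliberately left out; the point is that the AJAX call carries
            // only these three values and the returned id, not the page and its ViewState.
            return 1;
        }
    }

The client posts JSON to Timesheets.aspx/AddTimesheet (via a ScriptManager with EnablePageMethods, or a plain jQuery POST with a JSON content type) and gets back just the new id.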

How can I scrape a page that contains data updated with JavaScript after page load?

I'm trying to scrape a page. Everything is OK, except that when values are updated, the source code of the page stays the same for a minute. Even when I refresh the page on a slow internet connection, I first see old data, and only after the page has fully loaded are the values current.
I guess JavaScript updates them, but it still has to download them from somewhere.
How can I get the current values?
I'm writing my program in C#, but if you have ideas/advice/examples, the language doesn't really matter.
Thank you.
You're right - JavaScript is probably updating the data after the page loads.
I can think of three ways to handle this:
Use a WebBrowser control - I guess you're using the HttpWebRequest object to retrieve values from the site. That won't work if you need to let the JavaScript run. You can use the WebBrowser control instead, let the JavaScript run, and retrieve values from the DOM. The only thing I don't like about this approach is that it feels like a hack and is probably too clunky for production applications. You also need to know when it is safe to read the contents of the DOM (an update might be in progress in the background). Google "C# WebBrowser Control Read DOM Programmatically" or you can read more about that here.
Call the background requests yourself - I personally prefer this over the previous approach, but it doesn't work all the time. First you need to inspect the website with Firebug or something similar and see which URLs are called in the background. Say, for example, the site is updating stock quotes using JavaScript. Most likely it's using an asynchronous request to retrieve the updated information from a web service. In Firebug you can see this under Net > XHR. Now the hard part: take a look at the request and the values returned. The idea is that you can retrieve the values yourself and parse the response - which can be a lot easier than scraping a page. The catch is that you may need to do a bit of reverse engineering to get it right, and you might also run into problems with authentication and/or encryption.
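A rough sketch of that second approach in C# (the endpoint URL and the shape of the response are assumptions; the real ones are whatever you find under Net > XHR):

    using System;
    using System.Net;

    class QuoteFetcher
    {
        static void Main()
        {
            using (var client = new WebClient())
            {
                client.Headers[HttpRequestHeader.Accept] = "application/json";
                // Hypothetical endpoint discovered by watching the page's own AJAX traffic.
                string json = client.DownloadString("http://example.com/api/quotes?symbol=MSFT");
                // Parse the JSON (e.g. with Json.NET) instead of scraping the rendered HTML.
                Console.WriteLine(json);
            }
        }
    }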
Lastly, and my most preferred solution: ask the owner of the site you are scraping directly.
I think the WebBrowser control approach is probably OK and doesn't depend on third party libraries. Here is what I intend to use and it solves the problem of waiting for the page to complete loading:
    private string ReadPage(string link)
    {
        // Navigate the WebBrowser control, then pump Windows messages until the page
        // (and the scripts it runs) has finished loading.
        this.wbrwPages.Navigate(link);
        while (this.wbrwPages.ReadyState != WebBrowserReadyState.Complete)
        {
            Application.DoEvents();
        }
        return this.wbrwPages.DocumentText;
    }
I will get information out of the HTML through some form of DOM or XPath treatment. I am curious if others will have comments about entering the 'while' loop and depending upon the 'complete' state to get me out of it. I may put a timer of some sort in there as well - just to be safe.

Using JavaScript template with ASP.NET

I've run into this problem multiple times in my career and never found an elegant solution for it. Imagine you have a simple page with a repeater. You populate that repeater on the server side through databinding. That's great, it works fast and does what it's supposed to. But now you want to add a paginator to that repeater, or otherwise change the output. Doing it through AJAX is the preferred way to enable rich client interaction.
So you create a web service that serves you the data as JSON, but now you are stuck... Either you have to write complicated client-side code to find each field that you need to modify in each repeater item, or you have to blow away the whole server-side output of the repeater and construct new HTML from scratch, or - the method I've been using lately - take the first repeated item, blow away everything else, clone the first item as many times as you need, and modify its fields.
None of the described methods is optimal, because no matter what, they require quite a bit of duplicated logic on the server side (i.e. the template in the repeater) and on the client side (JavaScript to display the JSON data). There's got to be a better, easier way to do this. The first thing that comes to mind is, instead of returning JSON from the web service, returning the HTML of the pre-populated repeater. But for something like that, I might as well use an ASP.NET AJAX UpdatePanel. The output isn't going to be any smaller with a stand-alone web service.
The next thing I thought of is JavaScript templates. What if there were some way to take the unprocessed repeater template on the server side and convert it to a JavaScript template that could either be embedded in the page at load time or served as part of the web-service response? However, I couldn't find any existing solutions for something like this, and I can't think of a simple way to do it myself. Any ideas?
P.S. Rendering the JavaScript template to the client on page load and using JavaScript to populate it, without the initial view being rendered on the server (no repeater and databinding), is out of the question. I care too much about performance.
Firstly, I don't believe that using a client template with JSON data, even on first load, will adversely affect performance unless we are talking about devices with different form factors such as phones.
However, if you must use server-side templating/rendering, then why not make the server return the HTML for the repeater? This can be done by putting the repeater logic into a separate user control/page and processing only that page on the AJAX request. And this is not at all equivalent to using an UpdatePanel (as you stated): an UpdatePanel posts the entire page data (including ViewState), making the request larger, and the response is also larger because it must contain the ViewState. On the server side, an UpdatePanel also means loading the complete control tree with state data and running post-back event processing. Sending only the requisite HTML is a much better approach and will fit your needs perfectly - the only issue is that the HTML will be larger than the equivalent JSON.
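A hedged sketch of that approach (the page, the user-control name, and the commented binding call are illustrative): a lightweight page loads the user control holding the repeater, renders it to a string, and writes only that HTML back to the AJAX caller.

    using System;
    using System.IO;
    using System.Text;
    using System.Web.UI;

    public partial class RepeaterHtml : Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            // Hypothetical user control that contains the repeater and its template.
            Control control = LoadControl("~/Controls/TimesheetRows.ascx");
            // ((TimesheetRows)control).BindTo(pageIndex);  // databind however the control expects

            var sb = new StringBuilder();
            using (var writer = new HtmlTextWriter(new StringWriter(sb)))
            {
                control.RenderControl(writer);
            }

            // Return only the rendered fragment, not a full page.
            Response.Clear();
            Response.ContentType = "text/html";
            Response.Write(sb.ToString());
            Response.End();
        }
    }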
Lastly, there are some interesting projects such as Script#, which converts C# code into JavaScript. You could build something similar (using Script# itself) to convert the server-side templating code into equivalent JS code. A more viable approach along similar lines could be to use T4 templating to convert a technology-agnostic template into both server-side code (markup + code, or pure code) and equivalent JS code.
After thinking about the pros and cons of the different approaches, I settled on the following method. I created a custom ASP.NET databound control that can render HTML; however, when the page is requested with query-string parameters, instead of doing the standard rendering it calls Response.Clear() and Response.End() and, between those two calls, outputs a JSON version of the data based on the query-string parameters. On the first rendering of the page it also outputs a JavaScript template, using reflection to read the names of the variables from the control's template area.
This method works great; all I have to do is drop my control on the page and data bind it, and it works as a true AJAX grid that supports pagination, sorting and filtering. It does have one limitation: in the control's template you can only specify variables, not expressions, because otherwise reflection can't convert them to JavaScript variables. But I can live with that.
Other possibilities I considered were a separate web service that takes the type of the page as a parameter and uses reflection to get the data-bound object as well as to create the template for the grid. I also thought about writing my own version of the UpdatePanel that would not use ViewState and would only send back part of the page.

Web page crawling in C#

I have been given a task to crawl/parse and index the available books on many library web pages. I usually use HTML Agility Pack and C# to parse web site content. One of them is the following:
http://bibliotek.kristianstad.se/pls/bookit/pkg_www_misc.print_index?in_language_id=en_GB
If you search for a * (all books) it will return many lists of books, paginated by 10 books per page.
Typical web crawlers that I have found fail on this website. I have also tried to write my own crawler, which would go through all the links on a page and generate POST/GET variables to dynamically produce results. I haven't been able to get this working either, mostly due to some 404 errors I get (although I am certain that the links generated are correct).
The site relies on JavaScript to generate content and uses a mixture of GET and POST variable submission.
I'm going out on a limb, but try observing the JavaScript GETs and POSTs with Fiddler; you can then base your crawling on those requests. Fiddler has FiddlerCore, which you can embed in your own C# project. Using it, you could monitor the requests made in a WebBrowser control and save them for crawling later.
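A rough sketch of the FiddlerCore idea (this assumes the classic FiddlerCore API; check the package you install, since the surface has changed between versions): log every request that passes through the local proxy so the same URLs can later be replayed by your crawler.

    using System;
    using Fiddler;

    class TrafficLogger
    {
        static void Main()
        {
            FiddlerApplication.BeforeRequest += delegate(Session session)
            {
                // Record each request the page (or the WebBrowser control) issues.
                Console.WriteLine(session.fullUrl);
            };

            FiddlerApplication.Startup(8877, FiddlerCoreStartupFlags.Default);
            Console.WriteLine("Proxying on port 8877 - press Enter to stop.");
            Console.ReadLine();
            FiddlerApplication.Shutdown();
        }
    }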
Going down the C# JavaScript-interpreter route sounds like the 'more correct' way of doing this, but I wager it will be much harder and fraught with errors and bugs unless you have the simplest of cases.
Good luck.
FWIW, the C# WebBrowser control is very, very slow. It also doesn't support more than two simultaneous requests.
Using SHDocVw is faster, but is also semaphore limited.
Faster still is using MSHTML. Working code here: https://svn.arachnode.net/svn/arachnodenet/trunk/Renderer/HtmlRenderer.cs Username/Password: Public (doesn't have the request/rendering limitations that the other two have when run out of process...)
This is headless, so none of the controls are rendered. (Faster).
Thanks,
Mike
If you use the WebBrowser control in a Windows Forms application to open the page, you should be able to access the DOM through its HtmlDocument. That would work for the plain HTML links.
As for the links that are generated through JavaScript, you might look at the ObjectForScripting property, which should allow you to interface with the HTML page through JavaScript. The rest then becomes a JavaScript problem, but it should (in theory) be solvable. I haven't tried this, so I can't say.
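A hedged sketch combining both suggestions (the form, the bridge class, and its method are illustrative): read anchors out of the loaded DOM via HtmlDocument, and expose a COM-visible object so page script can call back into the host through window.external.

    using System;
    using System.Runtime.InteropServices;
    using System.Windows.Forms;

    [ComVisible(true)]
    public class ScriptBridge
    {
        // Page JavaScript can invoke this as window.external.Report("...").
        public void Report(string message)
        {
            Console.WriteLine("From page script: " + message);
        }
    }

    public class CrawlerForm : Form
    {
        private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };

        public CrawlerForm()
        {
            Controls.Add(browser);
            browser.ObjectForScripting = new ScriptBridge();
            browser.DocumentCompleted += OnDocumentCompleted;
            browser.Navigate("http://bibliotek.kristianstad.se/pls/bookit/pkg_www_misc.print_index?in_language_id=en_GB");
        }

        private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // Plain HTML links are reachable straight from the DOM.
            foreach (HtmlElement anchor in browser.Document.GetElementsByTagName("a"))
            {
                Console.WriteLine(anchor.GetAttribute("href"));
            }
        }
    }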
If the site generates content with JavaScript, then you are out of luck. You need a full JavaScript engine usable in C# so that you can actually execute the scripts and capture the output they generate.
Take a look at this question: Embedding JavaScript engine into .NET -- but know that it will take "serious" effort to do what you need.
AbotX does JavaScript rendering for you. It's not free, though.
