Caching ASP.NET pages - C#

I am building an app and I tried YSlow and got Grade F on most of my practices. I have loads of JavaScript that I am working on reducing. I want to be able to cache some of these resources because the pages get called many times.
I have one master page and I wanted to cache the scripts and CSS files.
How do I achieve this?
Are there any recommended best practices?
Are there any other performance improvements that I can make?

Have you re-read RFC 2616 yet this year? If not, do. Trying to build websites without a strong familiarity with HTTP is like trying to seduce someone when you're extremely drunk; just because lots of other people do it doesn't mean you'll have good performance.
If a resource can be safely reused within a given time period (e.g. safe for the next hour/day/month), say so. Use the max-age component of the Cache-Control header as well as Expires (max-age is better than Expires, but doing both costs nothing).
If you know the time something last changed, say so in a Last-Modified header (see note below).
If you don't know when something last changed, but can add the ability to know, do so (e.g. timestamp database rows on UPDATE).
If you can keep a record of every time something changed, do so, and build an e-tag from it. While e-tags should not normally be based on times, an exception is when you know changes can't happen at a finer resolution (a time rounded to the nearest 0.5 second is fine if you can't have more than one change every 0.5 seconds, etc.).
If you receive a request with an If-Modified-Since header whose date matches the last change time, or an If-None-Match header matching the e-tag, send a 304 instead of the whole page (a sketch of this conditional handling follows this list).
Use gzip or deflate compression (deflate is slightly better when the client says it can handle both), but note that you must change the e-tag. Sending the correct Vary header for this breaks IE caching, so Vary on User-Agent instead (an imperfect solution for an imperfect world). If you roll your own compression in .NET, note that flushing the compression stream causes bugs; write a wrapper that only flushes the output on Flush() prior to the final flush on Close().
Don't defeat the caching done for you. Turning off e-tags on static files gives you a better YSlow rating and worse performance (except on web farms, where the more complicated solution recommended by YSlow should be used). Ignore what YSlow says about turning off e-tags (maybe they've fixed that bug now and don't say it any more) unless you are on a web farm where different server types can deal with the same request (e.g. IIS and Apache serving the same URI; Yahoo is, which is why this worked for them, but most people aren't).
Favour public over private unless inappropriate.
Avoid doing anything that depends on sessions. If you can turn off sessions, so much the better.
Avoid sending large amounts of viewstate. If you can do something without viewstate, so much the better.
Go into IIS and look at the HTTP Headers section. Set appropriate values for static files. Note that this can be done on a per-site, per-directory and per-file basis.
If you have a truly massive file (.js, .css), give it a version number and put that version in the URI used to access it (blah.js?version=1.1.2). Then you can set a really long expiry (a year) and/or a hard-coded e-tag and not worry about cache staleness, because you will change the version number next time and to the rest of the web it's a new resource rather than an updated one.
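
To make the header advice above concrete, here is a minimal sketch (not anything from the answer itself) of an IHttpHandler that sets Cache-Control/Expires/Last-Modified/ETag and answers conditional requests with a 304. The handler name, the one-hour lifetime and the GetLastModifiedUtc/WriteEntity helpers are hypothetical stand-ins for however you track changes and produce the response body:

    using System;
    using System.Globalization;
    using System.Web;

    // A sketch of the caching headers and conditional GET handling described above.
    public class CachedResourceHandler : IHttpHandler
    {
        public bool IsReusable { get { return true; } }

        public void ProcessRequest(HttpContext context)
        {
            DateTime lastModified = GetLastModifiedUtc();   // assumption: you can obtain this
            string etag = "\"" + lastModified.Ticks.ToString("x") + "\"";

            // Declare cacheability: public, reusable for an hour, with both
            // max-age and Expires (doing both costs nothing).
            HttpCachePolicy cache = context.Response.Cache;
            cache.SetCacheability(HttpCacheability.Public);
            cache.SetMaxAge(TimeSpan.FromHours(1));
            cache.SetExpires(DateTime.UtcNow.AddHours(1));
            cache.SetLastModified(lastModified);
            cache.SetETag(etag);

            // Conditional GET: send a 304 if the client's copy is still current.
            string ifNoneMatch = context.Request.Headers["If-None-Match"];
            string ifModifiedSince = context.Request.Headers["If-Modified-Since"];
            DateTime since;
            if (ifNoneMatch == etag ||
                (DateTime.TryParseExact(ifModifiedSince, "r", CultureInfo.InvariantCulture,
                     DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal, out since) &&
                 since >= TruncateToSeconds(lastModified)))
            {
                context.Response.StatusCode = 304;
                context.Response.SuppressContent = true;
                return;
            }

            WriteEntity(context.Response);                  // assumption: writes the full body
        }

        private static DateTime TruncateToSeconds(DateTime value)
        {
            // HTTP dates have one-second resolution, so compare at that resolution.
            return new DateTime(value.Year, value.Month, value.Day,
                                value.Hour, value.Minute, value.Second, DateTimeKind.Utc);
        }

        private DateTime GetLastModifiedUtc() { /* look up your change-tracking data */ return DateTime.UtcNow.AddMinutes(-5); }
        private void WriteEntity(HttpResponse response) { response.Write("..."); }
    }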
Edit:
I said "see note below" and didn't add the note.
The last modified time of any resource is the most recent of:
1. Anything (script, code-behind) used to create the entity sent.
2. Anything used as part of it.
3. Anything that was used as part of it, but has now been deleted.
Of these, number 3 can be the trickiest to work out, since it has, after all, been deleted. One solution is to keep track of changes to the resource itself and update that record whenever anything used to create it is deleted; another is a "soft delete" where you still have the item, but marked as deleted and not used in any other way. Which way is best to track this depends on the application.

You should just create separate .js and .css files and let the browser do the caching for you. It is also a good idea to use a JS minimizer that removes all the whitespace from the .js files.
If you have a huge ViewState (say > 100 KB), try to reduce it as well. If the ViewState is still huge, you can store the ViewState on the server as a file...
http://aspalliance.com/472
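
As a rough illustration of the server-side ViewState idea (the linked article stores it in a file; this sketch uses the built-in SessionPageStatePersister, which keeps it in session state and sends only a small identifier to the client), something like this base page could work:

    using System.Web.UI;

    // A minimal sketch: derive your pages from this class instead of Page.
    public class ServerSideViewStatePage : Page
    {
        private PageStatePersister _persister;

        protected override PageStatePersister PageStatePersister
        {
            get
            {
                // Keeps the ViewState on the server; requires session state to be enabled.
                return _persister ?? (_persister = new SessionPageStatePersister(this));
            }
        }
    }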
You might also use output caching on the page if the page is not too dynamic...
http://msdn.microsoft.com/en-us/library/06bh14hk.aspx

You can also reference common JS and CSS libraries from trusted public hosts. For example, if you add jQuery as <script src="http://code.jquery.com/jquery-latest.js"></script>, the jQuery file has probably already been cached by the client's browser, because another web site referenced the same address before yours did.
This approach has pros and cons, but it is an option.
I also don't know whether YSlow's score changes with this approach.

Related

RSS Feed Reader Update Interval

I have a feed reader running every minute (it's picking up a feed that gets updated often). But I seem to be running into getting blocked by Akamai when accessing a few websites. Perhaps they think I'm up to something, but I'm not - I just want to get the feed.
Any thoughts on how to either play nice with Akamai or code this differently? From what I know, I can't tell when the feed is updated other than by polling it - but is there a preferred way, like checking a cache? This is coded in C#, though I doubt that makes a difference.
Without more context it is hard to ascertain why you are being blocked. Is it because of rate limits or other access control measures?
Assuming it is rate limits, there is not much you can do. I would recommend first verifying that robots.txt allows you to crawl the URL and, if it does, using some sort of exponential back-off. It also helps to play nice by providing a meaningful User-Agent, so that when they do update their rules they might consider whitelisting legitimate clients such as yours.
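
A hedged sketch of what such a polite poller might look like, using a conditional GET (If-Modified-Since/If-None-Match) plus exponential back-off; the feed URL, User-Agent string and intervals are placeholders:

    using System;
    using System.IO;
    using System.Net;
    using System.Threading;

    // Poll a feed politely: descriptive User-Agent, conditional GET, back-off on errors.
    class FeedPoller
    {
        private DateTime _lastModified = DateTime.MinValue;
        private string _etag;
        private TimeSpan _delay = TimeSpan.FromMinutes(1);   // base polling interval

        public void PollForever(string feedUrl)
        {
            while (true)
            {
                try
                {
                    var request = (HttpWebRequest)WebRequest.Create(feedUrl);
                    request.UserAgent = "MyFeedReader/1.0 (+http://example.com/contact)"; // hypothetical
                    if (_lastModified > DateTime.MinValue) request.IfModifiedSince = _lastModified;
                    if (_etag != null) request.Headers["If-None-Match"] = _etag;

                    using (var response = (HttpWebResponse)request.GetResponse())
                    using (var reader = new StreamReader(response.GetResponseStream()))
                    {
                        _lastModified = response.LastModified;
                        _etag = response.Headers["ETag"];
                        ProcessFeed(reader.ReadToEnd());
                        _delay = TimeSpan.FromMinutes(1);      // success: reset back-off
                    }
                }
                catch (WebException ex)
                {
                    var http = ex.Response as HttpWebResponse;
                    if (http != null && http.StatusCode == HttpStatusCode.NotModified)
                    {
                        // Nothing new; not an error.
                        _delay = TimeSpan.FromMinutes(1);
                    }
                    else
                    {
                        // Blocked or failing: back off exponentially, capped at an hour.
                        _delay = TimeSpan.FromTicks(Math.Min(_delay.Ticks * 2,
                                                             TimeSpan.FromHours(1).Ticks));
                    }
                }
                Thread.Sleep(_delay);
            }
        }

        private void ProcessFeed(string xml) { /* parse and handle the feed */ }
    }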

What would be the "cheapest" way to check for updates periodically?

Basically I'm writing an ASP.NET MVC application that, through JavaScript, sends a GET request every 30 seconds, checking whether a certain row in a table in the database has changed.
I've been looking at the OutputCache attribute but it doesn't seem like it would work since it would just cache the content and not really check if an update was made.
What would be the "cheapest" way to do this? I mean the way that burdens the server the least?
A HEAD request may be faster, though that's not guaranteed; it is worth investigating.
If you can't use something to stream the change to you, the cheapest way is to use an API that takes a date and returns a boolean flag or an integer stating whether a change occurred. Essentially it's still polling, but it's about as minimal as polling gets because the request and response are tiny, assuming SignalR or some other push mechanism isn't possible.
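
For illustration, a minimal sketch of that "date in, flag out" endpoint as an ASP.NET MVC action; the controller/action names and GetLastChangeUtc are hypothetical:

    using System;
    using System.Web.Mvc;

    // The client sends the timestamp of the version it has; the server replies with
    // a tiny payload saying whether anything newer exists.
    public class ChangesController : Controller
    {
        [HttpGet]
        public JsonResult HasChanged(DateTime since)
        {
            bool changed = GetLastChangeUtc() > since;
            return Json(changed, JsonRequestBehavior.AllowGet);
        }

        private DateTime GetLastChangeUtc()
        {
            // e.g. SELECT MAX(UpdatedAt) FROM TheTable WHERE Id = @id
            return DateTime.UtcNow; // placeholder
        }
    }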
It depends what you want it to do. Have you considered long polling? E.g. make the GET/POST request using JavaScript and allow the server to withhold the reply until your 'event' happens.
OutputCache works perfectly, but its expiration time should be a divisor of your polling interval, e.g. 10 seconds for the 30-second client-side polling in this case.
I'm not an expert on EF, but if your database supports triggers, that would be an option: you can cache the result for a longer period (say 1 hour) until a trigger invalidates it.
But if your record is being updated very fast, a trigger would be costly.
In that case I would go with caching plus a time-stamp mechanism (like versions in a NoSQL db, or a time-stamp column in Oracle).
And remember that you are fetching the record every 30 seconds, not on every change on the record. That's a good thing, because it makes your solution much simpler.
Probably SignalR with a push notification when there's a change in the database (and that could be tracked either manually or by SqlDependency, depending on the database)...
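
If SqlDependency is available to you (SQL Server with Service Broker enabled), a rough sketch of wiring it up might look like this; the table and column names are made up, and the OnChange handler is where you would push to clients (for example via a SignalR hub) or invalidate a cache:

    using System.Data.SqlClient;

    // Watch a row and get notified when it changes instead of polling.
    public class RowWatcher
    {
        private readonly string _connectionString;

        public RowWatcher(string connectionString)
        {
            _connectionString = connectionString;
            SqlDependency.Start(_connectionString);  // call once per app domain
        }

        public void Watch()
        {
            using (var connection = new SqlConnection(_connectionString))
            using (var command = new SqlCommand(
                "SELECT Value, UpdatedAt FROM dbo.Settings WHERE Id = 1", connection)) // hypothetical table
            {
                var dependency = new SqlDependency(command);
                dependency.OnChange += (sender, e) =>
                {
                    // Notify connected clients here (SignalR, cache invalidation, etc.),
                    // then re-register by calling Watch() again: notifications are one-shot.
                    Watch();
                };

                connection.Open();
                using (SqlDataReader reader = command.ExecuteReader())
                {
                    while (reader.Read()) { /* read the current value */ }
                }
            }
        }
    }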
I used Wyatt Barnett's suggestion and it works great.
Thanks for the answers - appreciate it.
Btw I'm answering it since I can't mark his comment as answer.

Solution for popularity-based caching

I have a C# application that handles about 10,000 immutable objects, each 50 KB to 1 MB in size.
The application picks about 10-100 objects for every operation. Which objects are picked depends on circumstances and user choices, but there are a few that are used very frequently.
Keeping all objects in memory all the time is way too much, but disk access time is a problem. I would like to use a popularity-based cache to reduce disk activity. The cache would contain at most 300 objects. I expect usage patterns to decide which objects should be cached. I can easily add an access counter to each object. The more popular ones get in; less popular ones have to leave the cache. Is there an easy, ingenious way to do that without coding my butt off?
Well, you can use System.Runtime.Caching. Cache the objects that are constantly used; if the cached objects change after some time, you can specify how long the cache entry stays valid. Once the cache entry is invalidated, you can rebuild it in the event handler.
Make sure you use some thread synchronization mechanism when rebuilding cache.
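
A small sketch of that suggestion using MemoryCache from System.Runtime.Caching, with a sliding expiration so rarely used objects fall out on their own, and a simple lock for the thread synchronization mentioned above; the load-from-disk helper and path are placeholders:

    using System;
    using System.Runtime.Caching;

    // Popular objects keep sliding their expiry forward; unpopular ones expire.
    public static class ObjectCache300
    {
        private static readonly MemoryCache Cache = MemoryCache.Default;
        private static readonly object SyncRoot = new object();

        public static byte[] Get(string key)
        {
            var cached = (byte[])Cache.Get(key);
            if (cached != null) return cached;

            lock (SyncRoot)   // simple thread synchronization while (re)building an entry
            {
                cached = (byte[])Cache.Get(key);
                if (cached != null) return cached;

                byte[] value = LoadFromDisk(key);
                Cache.Set(key, value, new CacheItemPolicy
                {
                    // Entries expire ten minutes after their last use.
                    SlidingExpiration = TimeSpan.FromMinutes(10)
                });
                return value;
            }
        }

        private static byte[] LoadFromDisk(string key)
        {
            return System.IO.File.ReadAllBytes(@"D:\objects\" + key + ".bin"); // hypothetical path
        }
    }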
I'd go with WeakReferences: you can build a simple cache manager on top of that in a couple of minutes, and let .NET handle the actual memory management by itself.
It may not be the best solution if you need to limit the amount of memory you want your program to use, but otherwise it's definitely worth checking out.
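
For example, a bare-bones WeakReference-based cache manager along those lines might look like this (a sketch, not a drop-in component):

    using System;
    using System.Collections.Generic;

    // The cache holds weak references, so the GC is free to collect entries under
    // memory pressure, and anything still alive is returned without another disk read.
    public class WeakReferenceCache<TValue> where TValue : class
    {
        private readonly Dictionary<string, WeakReference> _entries =
            new Dictionary<string, WeakReference>();
        private readonly Func<string, TValue> _load;   // e.g. read the object from disk
        private readonly object _sync = new object();

        public WeakReferenceCache(Func<string, TValue> load) { _load = load; }

        public TValue Get(string key)
        {
            lock (_sync)
            {
                WeakReference reference;
                if (_entries.TryGetValue(key, out reference))
                {
                    var alive = reference.Target as TValue;
                    if (alive != null) return alive;   // still in memory, no disk hit
                }

                TValue value = _load(key);             // collected (or never loaded): reload
                _entries[key] = new WeakReference(value);
                return value;
            }
        }
    }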
One ready-made solution is to use ASP.NET caching's sliding expiration window.
Sounds like a job for MemCached! This is a free, open-source, high-performance and flexible caching solution. You can download it at http://www.memcached.org.
To get a broad overview, look at the Wikipedia page at https://en.wikipedia.org/wiki/Memcached.
Good luck!

ASP.NET caching w/file dependency: static var vs. AspNet cache vs. memcached

TL;DR: Which is likely faster: accessing a static member variable, accessing a variable stored in HttpRuntime.Cache, or accessing a variable stored in memcached?
At work, we get about 200,000 page views/day. On our homepage, we display a promotion. This promotion is different for different users, based on their country of origin and language.
All the different promotions are defined in an XML file on each web server. We have 12 web servers all serving the same site with the same XML file. There are about 50 different promotion combinations based on country/language. We imagine we'll never have more than 200 or so (if ever) promotions (combinations) total.
The XML file may be changed at any time, out of release cycle. When it's changed, the new definitions of promotions should immediately change on the live site. Implementing the functionality for this requirement is the responsibility of another developer and I.
Originally, I wrote the code so that the contents of the XML file were parsed and then stored in a static member of a class. A FileSystemWatcher monitored changes to the file, and whenever the file was changed, the XML would be reloaded/reparsed and the static member would be updated with the new contents. It seemed like a solid, simple solution to keeping the in-memory dictionary of promotions current with the XML file. (Each server does this independently with its local copy of the XML file; all XML files are the same and change at the same time.)
The other developer I was working with holds a Sr. position and decided that this was no good. Instead, we should store all the promotions in each server's HttpContext.Current.Cache with a CacheDependency file dependency that automatically monitored file changes, expunging the cached promotions when the file changed. While I liked that we no longer had to use a FileSystemWatcher, I worried a little that grabbing the promotions from the volatile cache instead of a static class member would be less performant.
(Care to comment on this concern? I already gave up trying to advocate not switching to HttpRuntime.Cache.)
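
For reference, the HttpRuntime.Cache + CacheDependency approach in question boils down to roughly the following sketch; the file path and parsing helper are placeholders:

    using System.Web;
    using System.Web.Caching;

    // The parsed promotions stay in the in-process cache and are evicted
    // automatically when the XML file changes.
    public static class PromotionStore
    {
        private const string CacheKey = "promotions";

        public static PromotionSet GetPromotions(HttpContext context)
        {
            var promotions = (PromotionSet)HttpRuntime.Cache[CacheKey];
            if (promotions == null)
            {
                string path = context.Server.MapPath("~/App_Data/promotions.xml"); // hypothetical location
                promotions = ParsePromotions(path);

                // The CacheDependency replaces the hand-rolled FileSystemWatcher:
                // when the file changes, the entry is removed and the next request reparses.
                HttpRuntime.Cache.Insert(CacheKey, promotions, new CacheDependency(path));
            }
            return promotions;
        }

        private static PromotionSet ParsePromotions(string path) { /* load and parse the XML */ return new PromotionSet(); }
        public class PromotionSet { /* country/language -> promotion lookup */ }
    }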
Later, after we began using HttpRuntime.Cache, we adopted memcached with Enyim as our .NET interface for other business problems (e.g. search results). When we did that, this Sr. Developer decided we should be using memcached instead of the HttpRuntime (HttpContext) Cache for storing promotions. Higher-ups said "yeah, sounds good", and gave him a dedicated server with memcached just for these promotions. Now he's currently implementing the changes to use memcached instead.
I'm skeptical that this is a good decision. Instead of staying in-process and grabbing this promotion data from the HttpRuntime.Cache, we're now opening a socket to a network memcached server and transmitting its value to our web server.
This has to be less performant, right? Even if the cache is memcached. (I haven't had the chance to compile any performance metrics yet.)
On top of that, he's going to have to engineer his own file dependency solution over memcached since it doesn't provide such a facility.
Wouldn't my original design be best? Does this strike you as overengineering? Is HttpRuntime.Cache caching or memcached caching even necessary?
Not knowing exactly how much data you are talking about (assuming it's not a lot), I tend to somewhat agree with you; raw-speed wise, a static member should be the 'fastest', then Cache. That doesn't necessarily mean it's the best option, of course. Scalability is not always about speed. In fact, the things we do for scalability often negatively (marginally) affect the speed of an application.
More specifically, I do tend to start with the Cache object myself, unless a bit of 'static' data is pretty darn small and pretty much guaranteed to be needed constantly (in which case I go for static members; don't forget thread synch too, of course!).
With a modest amount of data that won't change often at all, and can easily be modified when you need to, by altering the files as you note, the Cache object is probably a good solution. memcached may be overkill, and overly complex... but it should work, too.
I think the major possible 'negative' to the memcached solution is the single-point-of-failure issue; Using the local server's Cache keeps each server isolated.
It sounds like there may not really be any choice in your case, politically speaking. But I think your reasoning isn't necessarily all that bad, given what you've shared here.
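
To illustrate the static-member-plus-thread-synchronization option Andrew mentions, here is a minimal sketch using .NET 4's Lazy<T> instead of hand-rolled locking; LoadPromotions and the invalidation hook are hypothetical:

    using System;

    // Small, constantly-needed static data with thread-safe lazy initialization.
    public static class StaticPromotionHolder
    {
        private static Lazy<string[]> _promotions =
            new Lazy<string[]>(LoadPromotions, isThreadSafe: true);

        public static string[] Promotions
        {
            get { return _promotions.Value; }
        }

        // Call this from whatever change detection you use (FileSystemWatcher,
        // an admin page, etc.) to swap in a fresh lazy instance.
        public static void Invalidate()
        {
            _promotions = new Lazy<string[]>(LoadPromotions, isThreadSafe: true);
        }

        private static string[] LoadPromotions() { /* parse the XML into memory */ return new string[0]; }
    }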
Very much agree with Andrew here. Few additions/deviations:
For a small amount of rarely changing data, static fields would offer the best performance. When your caching happens outside the UI layer, it also avoids taking a dependency on the System.Web assembly (of course, you can achieve that by other means as well). However, in general, the ASP.NET Cache would also be a good bet (especially when the data is large, since cached data can expire if there is memory pressure, etc.).
For both speed and scalability, output caching (including browser and downstream caching) would be the best option and you should evaluate it. Even if data is changing frequently, output caching for 30-60 seconds can give a significant performance boost for a very large number of requests. If needed, you can do partial caching (user controls) and/or substitutions. Of course, this needs to be done in combination with data caching.

Long-term Static Page Caching

I maintain several client sites that have no dynamic data whatsoever; everything is static ASP.NET with C#.
Are there any pitfalls to caching the entire page for extreme periods of time, like a week?
Kibbee, we use a couple of controls (ad rotator, some of the AJAX extensions) on the sites. They could probably be written completely in HTML, but for convenience's sake I just stuck with what we use for every other site.
The only significant pitfall to long cache times occurs when you want to update that data. To be safe, you have to assume that it will take up to a week for the new version to become available. Intermediate hosts such as ISP-level proxy servers often cache aggressively, so this delay will happen.
If there are large files to be cached, I'd look at ensuring your content engine supports If-Modified-Since.
For smaller files (page content, CSS, images, etc), where reducing the number of round-trips is the key, having a long expiry time (a year?) and changing the URL when the content changes is the best. This lets you control when user agents will fetch the new content.
Yahoo! have published a two part article on reducing HTTP requests and browser cache usage. I won't repeat it all here, but these are good reads which will guide you on what to do.
My feeling is to pick a time period high enough to cover most users' single sessions but low enough not to cause too much inconvenience should you wish to update the content. Be sure to support If-Modified-Since if you have a Last-Modified for all your content.
Finally, if your content is cacheable at all and you need to push new content out now, you can always use a new URL. This final cacheable content URL can sit behind a fixed HTTP 302 redirect URL should you wish to publish a permanent link to the latest version.
We have a similar issue on a project I am working on. There is data that is pretty much static, but is open to change.
What I ended up doing is saving the data to a local file and then monitoring it for changes. The DB server is then never hit unless we remove the file, in which case the code will scoot off to the DB and regenerate the data file.
So what we basically have is a little bit of disk IO while loading/saving, no traffic to the DB server unless necessary, and we are still in control of it (we can either delete the file manually or script it, etc.).
I should also add that you could then tie this up with the actual web server's caching model if you wanted to reduce the disk IO (we didn't really need to in our case).
This could be totally the wrong way to go about it, but it seems to work quite nice for us :)
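
For what it's worth, the approach described above boils down to something like this sketch; the cache path and LoadFromDatabase are placeholders:

    using System;
    using System.IO;

    // Data lives in a local file; the DB is only touched when that file is missing,
    // and deleting the file is the "invalidate" operation.
    public static class FileBackedData
    {
        private static readonly string CachePath =
            Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "App_Data", "data-cache.json"); // hypothetical

        public static string GetData()
        {
            if (File.Exists(CachePath))
                return File.ReadAllText(CachePath);        // a little disk IO, no DB traffic

            string data = LoadFromDatabase();               // only hit when the file is gone
            File.WriteAllText(CachePath, data);
            return data;
        }

        // Deleting the file (manually or from a script) forces a regeneration
        // on the next request.
        public static void Invalidate()
        {
            if (File.Exists(CachePath)) File.Delete(CachePath);
        }

        private static string LoadFromDatabase() { /* query the DB and serialize */ return "{}"; }
    }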
If it's static, why bother caching at all? Let IIS worry about it.
When you say that you have no data, how are you even using ASP.NET or C#? What functionality do they provide you over plain HTML? Also, if you do plan on caching, it's probably best to cache to a file and then, when a request is made, stream out the file. The OS will take care of keeping the file in memory so that you won't have to read it off the disk all the time.
You may want to build in a cache updating mechanism if you want to do this, just to make sure you can clear the cache if you need to do a code update. Other than that, there aren't any problems that I can think of.
If it is static, you would probably be better off generating the pages once and then serving up the resulting static HTML files directly.
