Long-term Static Page Caching - c#

I maintain several client sites that have no dynamic data whatsoever; everything is static ASP.NET with C#.
Are there any pitfalls to caching the entire page for extreme periods of time, like a week?
Kibbee, we use a couple of controls (ad rotator, some of the AJAX extensions) on the sites. They could probably be rewritten as plain HTML, but for convenience's sake I just stuck with what we use for every other site.

The only significant pitfall to long cache times occurs when you want to update that data. To be safe, you have to assume that it will take up to a week for the new version to become available. Intermediate hosts such as ISP-level proxy servers often do cache aggressively, so this delay will happen.
If there are large files to be cached, I'd look at ensuring your content engine supports If-Modified-Since.
For smaller files (page content, CSS, images, etc.), where reducing the number of round-trips is the key, having a long expiry time (a year?) and changing the URL when the content changes is the best approach. This lets you control when user agents will fetch the new content.
Yahoo! have published a two-part article on reducing HTTP requests and browser cache usage. I won't repeat it all here, but these are good reads which will guide you on what to do.
My feeling is to pick a time period high enough to cover most users' single sessions but low enough to not cause too much inconvenience should you wish to update the content. Be sure to support If-Modified-Since if you have a Last-Modified for all your content.
Finally, if your content is cacheable at all and you need to push new content out now, you can always use a new URL. This final cacheable content URL can sit behind a fixed HTTP 302 redirect URL should you wish to publish a permanent link to the latest version.
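Here is a minimal sketch of how the long-expiry plus If-Modified-Since idea can look in ASP.NET. The handler name and path are made up for illustration; this is not code from the original sites:

using System;
using System.Globalization;
using System.Web;

// Hypothetical handler for long-lived static content: far-future expiry,
// plus a 304 Not Modified answer to conditional requests.
public class StaticContentHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        string path = context.Server.MapPath("~/content/site.css");   // illustrative path
        DateTime lastModified = System.IO.File.GetLastWriteTimeUtc(path);
        // HTTP dates have one-second resolution, so truncate before comparing.
        lastModified = lastModified.AddTicks(-(lastModified.Ticks % TimeSpan.TicksPerSecond));

        DateTime since;
        string header = context.Request.Headers["If-Modified-Since"];
        if (header != null &&
            DateTime.TryParse(header, CultureInfo.InvariantCulture,
                              DateTimeStyles.AdjustToUniversal, out since) &&
            since >= lastModified)
        {
            context.Response.StatusCode = 304;   // nothing changed, send no body
            return;
        }

        // Cache publicly for a year; change the URL when the content changes.
        context.Response.Cache.SetCacheability(HttpCacheability.Public);
        context.Response.Cache.SetExpires(DateTime.UtcNow.AddYears(1));
        context.Response.Cache.SetMaxAge(TimeSpan.FromDays(365));
        context.Response.Cache.SetLastModified(lastModified);
        context.Response.ContentType = "text/css";
        context.Response.TransmitFile(path);
    }
}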

We have a similar issue on a project I am working on. There is data that is pretty much static, but it is open to change.
What I ended up doing is saving the data to a local file and then monitoring it for changes. The DB server is then never hit unless we remove the file, in which case it will scoot off to the DB and regenerate the data file.
So what we basically have is a little bit of disk IO while loading/saving, no traffic to the DB server unless necessary, and we are still in control of it (we can either delete the file manually or script it, etc.).
I should also add that you could then tie this in with the actual web server caching model if you wanted to reduce the disk IO (we didn't really need to in our case).
This could be totally the wrong way to go about it, but it seems to work quite nicely for us :)
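A rough sketch of that pattern (the class and method names here are invented; the real project code isn't shown in the answer):

using System.IO;

// Hypothetical file-backed cache: serve from a local data file and only hit
// the database when the file has been removed (manually or by a script).
public static class DataFileCache
{
    private static readonly object Sync = new object();

    public static string GetData(string cacheFilePath)
    {
        if (File.Exists(cacheFilePath))
            return File.ReadAllText(cacheFilePath);   // cheap disk IO, no DB traffic

        lock (Sync)                                   // only one thread regenerates the file
        {
            if (File.Exists(cacheFilePath))
                return File.ReadAllText(cacheFilePath);

            string data = LoadFromDatabase();         // the expensive trip to the DB server
            File.WriteAllText(cacheFilePath, data);
            return data;
        }
    }

    private static string LoadFromDatabase()
    {
        // Placeholder for the real query against the DB server.
        return "<data />";
    }
}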

If it's static, why bother caching at all? Let IIS worry about it.

When you say that you have no data, how are you even using ASP.NET or C#? What functionality does that provide you over plain HTML? Also, if you do plan on caching, it's probably best to cache to a file, and then when a request is made, stream out the file. The OS will take care of keeping the file in memory so that you won't have to read it off the disk all the time.
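For the cache-to-file idea, the streaming part can be as small as this (a sketch only; the handler name and path are placeholders):

using System.Web;

// Hypothetical handler that streams a pre-rendered page from disk;
// the OS file cache keeps hot files in memory between requests.
public class CachedPageHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        string cachedPage = context.Server.MapPath("~/App_Data/cache/home.html"); // placeholder path
        context.Response.ContentType = "text/html";
        context.Response.TransmitFile(cachedPage); // handed to IIS, not buffered in managed code
    }
}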

You may want to build in a cache updating mechanism if you want to do this, just to make sure you can clear the cache if you need to do a code update. Other than that, there aren't any problems that I can think of.

If it is static, you would probably be better off generating the pages once and then serving up the resulting static HTML files directly.


Best way to create disc cache for web service

I have created a web service that delivers images. It will always be one-way communication; the images will never be changed on the side that gets them from the service.
It has multiple sources, and some can be far away, on bad connections.
I have created a memory cache for it, but I would like to also have a disc cache, to store images for longer periods.
I am a bit unsure on the best approach to do this.
First of all, all of my sources are web servers, so I don't really know how to check the last-modified date of my images (as an example), which I would like to use to see whether a file has changed.
Second, how do I best store my local cache? Just drop the files in a folder and compare dates with the original source?
Or, perhaps store all the timestamps in a txt file, with all the images, to avoid checking files.
OR, maybe store them in a local SQL express DB?
The images, in general, are not very large. Most are around 200 KB. Every now and then, however, there will be one of 7+ MB.
The big problem is that some of the locations where the service will be hosted are on really bad connections, and they will need to use the same image many times.
There are no real performance requirements, I just want to make it as responsive as possible, for the locations that have a horrible connection, to our central servers.
I can't install any "real" cache systems. It has to be something I can handle in my code.
Why don't you install a proxy server on your server, and access all the remote web-servers through that? The proxy server will take care of caching for you.
EDIT: Since you can't install anything and don't have a database available, I'm afraid you're stuck with implementing the disk cache yourself.
The good news is - it's relatively easy. You need to pick a folder and place your image files there. And you need a unique mapping between your image identification and a file name. If your image IDs are numbers, the mapping is very simple...
When you receive a request for an image, first check for it on the disk. If it's there, you have it already. If not, download it from the remote server, store it there, then serve it from there.
You'll need to take concurrent requests into account. Make sure writing the files to disk is a relatively brief process (you can write them once you finish downloading them). When you write the file to disk, make sure nobody can open it for reading; that way you avoid sending incomplete files.
Now you just need to handle the case where the file isn't in your cache and two requests for it are received at once. If performance isn't a real issue, just download it twice.
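A bare-bones version of that disk cache could look like the following. The image ID, URL and folder handling are assumptions made for the sake of the example, and the WebClient download stands in for however the service already fetches images from the remote servers:

using System.IO;
using System.Net;

// Hypothetical disk cache: look on disk first, otherwise download from the
// remote server, write to a temp file, then move it into place so readers
// never see a half-written image.
public class ImageDiskCache
{
    private readonly string _cacheFolder;

    public ImageDiskCache(string cacheFolder)
    {
        _cacheFolder = cacheFolder;
        Directory.CreateDirectory(cacheFolder);
    }

    public byte[] GetImage(string imageId, string remoteUrl)
    {
        string path = Path.Combine(_cacheFolder, imageId + ".img");
        if (File.Exists(path))
            return File.ReadAllBytes(path);          // cache hit, no network traffic

        byte[] data;
        using (var client = new WebClient())
            data = client.DownloadData(remoteUrl);   // slow trip to the central server

        // Write to a temp file and rename it, so concurrent readers never get a partial file.
        string temp = Path.Combine(_cacheFolder, Path.GetRandomFileName());
        File.WriteAllBytes(temp, data);
        try { File.Move(temp, path); }
        catch (IOException) { File.Delete(temp); }   // another request stored it first; that's fine

        return data;
    }
}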

ASP.NET caching w/file dependency: static var vs. AspNet cache vs. memcached

TL;DR: Which is likely faster: accessing static local variable, accessing variable stored in HttpRuntime.Cache, or accessing variable stored in memcached?
At work, we get about 200,000 page views/day. On our homepage, we display a promotion. This promotion is different for different users, based on their country of origin and language.
All the different promotions are defined in an XML file on each web server. We have 12 web servers all serving the same site with the same XML file. There are about 50 different promotion combinations based on country/language. We imagine we'll never have more than 200 or so (if ever) promotions (combinations) total.
The XML file may be changed at any time, outside the release cycle. When it's changed, the new definitions of the promotions should immediately change on the live site. Implementing the functionality for this requirement is the responsibility of another developer and me.
Originally, I wrote the code so that the contents of the XML file were parsed and then stored in a static member of a class. A FileSystemWatcher monitored changes to the file, and whenever the file was changed, the XML would be reloaded/reparsed and the static member would be updated with the new contents. Seemed like a solid, simple solution to keeping the in-memory dictionary of promotions current with the XML file. (Each server does this independently with its local copy of the XML file; all the XML files are the same and change at the same time.)
The other developer I was working with holds a Sr. position and decided that this was no good. Instead, we should store all the promotions in each server's HttpContext.Current.Cache with a CacheDependency file dependency that automatically monitored file changes, expunging the cached promotions when the file changed. While I liked that we no longer had to use a FileSystemWatcher, I worried a little that grabbing the promotions from the volatile cache instead of a static class member would be less performant.
(Care to comment on this concern? I already gave up trying to advocate not switching to HttpRuntime.Cache.)
Later, after we began using HttpRuntime.Cache, we adopted memcached with Enyim as our .NET interface for other business problems (e.g. search results). When we did that, this Sr. Developer decided we should be using memcached instead of the HttpRuntime (HttpContext) Cache for storing promotions. Higher-ups said "yeah, sounds good", and gave him a dedicated server with memcached just for these promotions. Now he's currently implementing the changes to use memcached instead.
I'm skeptical that this is a good decision. Instead of staying in-process and grabbing this promotion data from the HttpRuntime.Cache, we're now opening a socket to a network memcached server and transmitting its value to our web server.
This has to be less performant, right? Even if the cache is memcached. (I haven't had the chance to compile any performance metrics yet.)
On top of that, he's going to have to engineer his own file dependency solution over memcached since it doesn't provide such a facility.
Wouldn't my original design be best? Does this strike you as overengineering? Is HttpRuntime.Cache caching or memcached caching even necessary?
Not knowing exactly how much data you are talking about (assuming it's not a lot), I tend to somewhat agree with you; raw-speed wise, a static member should be the 'fastest', then Cache. That doesn't necessarily mean it's the best option, of course. Scalability is not always about speed. In fact, the things we do for scalability often negatively (marginally) affect the speed of an application.
More specifically; I do tend to start with the Cache object myself, unless a bit of 'static' data is pretty darn small and is pretty much guaranteed to be needed constantly (in which case I go for static members. Don't forget thread synch too, of course!)
With a modest amount of data that won't change often at all, and can easily be modified when you need to, by altering the files as you note, the Cache object is probably a good solution. memcached may be overkill, and overly complex... but it should work, too.
I think the major possible 'negative' to the memcached solution is the single-point-of-failure issue; Using the local server's Cache keeps each server isolated.
It sounds like there may not really be any choice in your case, politically speaking. But I think your reasoning isn't necessarily all that bad, given what you've shared here.
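For reference, the Cache-plus-file-dependency approach described in the question boils down to something like this. The cache key, file path and parsing method are illustrative, not the actual code:

using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

// Hypothetical loader: the parsed promotions live in HttpRuntime.Cache and are
// evicted automatically when the XML file changes, thanks to CacheDependency.
public static class PromotionStore
{
    private const string CacheKey = "Promotions"; // illustrative key

    public static Dictionary<string, string> GetPromotions(string xmlPath)
    {
        var promos = HttpRuntime.Cache[CacheKey] as Dictionary<string, string>;
        if (promos == null)
        {
            promos = ParsePromotionsXml(xmlPath);    // stands in for the real parsing code
            HttpRuntime.Cache.Insert(
                CacheKey,
                promos,
                new CacheDependency(xmlPath),        // invalidate when the file changes
                Cache.NoAbsoluteExpiration,
                Cache.NoSlidingExpiration);
        }
        return promos;
    }

    private static Dictionary<string, string> ParsePromotionsXml(string xmlPath)
    {
        // Placeholder for the country/language -> promotion parsing.
        return new Dictionary<string, string>();
    }
}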
Very much agree with Andrew here. Few additions/deviations:
For a small amount of rarely changing data, static fields would offer the best performance. When your caching happens outside the UI layer, it avoids taking a dependency on the System.Web assembly (of course, you can achieve this by other means as well). However, in general, the ASP.NET Cache would also be a good bet (especially when the data is large; cached data can be expired if there is memory pressure, etc.).
From both a speed and scalability standpoint, output caching (including browser and down-level caching) would be the best option and you should evaluate it. Even if data is changing frequently, output caching for 30-60 seconds can give a significant performance boost for a very large number of requests. If needed, you can do partial caching (user controls) and/or substitutions. Of course, this needs to be done in combination with data caching.
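As a rough illustration of the output caching suggestion, a short duration can be set declaratively with the OutputCache page directive, or from code-behind roughly like this (the page name and the 60-second figure are only examples):

using System;
using System.Web;
using System.Web.UI;

public partial class Home : Page   // hypothetical page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Roughly the code-behind equivalent of
        // <%@ OutputCache Duration="60" VaryByParam="none" %>:
        // cache the rendered output for 60 seconds, on the server and downstream.
        Response.Cache.SetCacheability(HttpCacheability.Public);
        Response.Cache.SetExpires(DateTime.UtcNow.AddSeconds(60));
        Response.Cache.SetMaxAge(TimeSpan.FromSeconds(60));
        Response.Cache.SetValidUntilExpires(true);
        Response.Cache.VaryByParams.IgnoreParams = true;
    }
}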

Caching asp.net pages

I am building an app; I tried YSlow and got Grade F on most of my practices. I have loads of JavaScript that I am working on reducing. I want to be able to cache some of these files because the pages get called many times.
I have one master page and I want to cache the scripts and CSS files.
How do I achieve this?
Are there any recommended best practices?
Are there any other performance improvements that I can make?
Have you re-read RFC 2616 yet this year? If not, do. Trying to build websites without a strong familiarity with HTTP is like trying to seduce someone when you're extremely drunk; just because lots of other people do it doesn't mean you'll have good performance.
If a resource can be safely reused within a given time period (e.g. safe for the next hour/day/month), say so. Use the max-age component of the Cache-Control header as well as Expires (max-age is better than Expires, but doing both costs nothing).
If you know the time something last changed, say so in a Last-Modified header (see note below).
If you don't know when something last changed, but can add the ability to know, do so (e.g. timestamp database rows on UPDATE).
If you can keep a record of every time something changed, do so, and build an e-tag from it. While e-tags should not be based on times, an exception is if you know the content can't change at a finer resolution (time to the nearest 0.5 second is fine if you can't have more than one change every 0.5 seconds, etc.).
If you receive a request with an If-Modified-Since date matching the last change time, or an If-None-Match matching the e-tag, send a 304 instead of the whole page (see the sketch after this list).
Use Gzip or Deflate compression (deflate is slightly better when client says it can handle both) but do note that you must change the e-tag. Sending correct Vary header for this breaks IE caching, so Vary on User-Agent instead (imperfect solution for an imperfect world). If you roll your own compression in .NET note that flushing the compression stream causes bugs, write a wrapper that only flushes the output on Flush() prior to the final flush on Close().
Don't defeat the caching done for you. Turning off e-tags on static files gives you a better YSlow rating and worse performance (except on web-farms, when the more complicated solution recommended by YSlow should be used). Ignore what YSlow says about turning off e-tags (maybe they've fixed that bug now and don't say it any more) unless you are on a web-farm where different server types can deal with the same request (e.g. IIS and Apache dealing with the same URI; Yahoo! are, which is why this worked for them; most people aren't).
Favour public over private unless inappropriate.
Avoid doing anything that depends on sessions. If you can turn off sessions, so much the better.
Avoid sending large amounts of viewstate. If you can do something without viewstate, so much the better.
Go into IIS and look at the HTTP Headers section. Set appropriate values for static files. Note that this can be done on a per-site, per-directory and per-file basis.
If you have a truly massive file (.js, .css) then give it a version number and put that version in the URI used to access it (blah.js?version=1.1.2). Then you can set a really long expiry date (1 year) and/or a hard-coded e-tag and not worry about cache staleness, as you will change the version number next time, and to the rest of the web it's a new resource rather than an updated one.
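Here is a sketch of the e-tag/304 point referenced above. The helper name is invented, and it assumes you can compute a stable e-tag for the resource:

using System.Web;

// Hypothetical conditional-GET helper: compare the client's If-None-Match
// header against the current e-tag and short-circuit with 304 when they match.
public static class ConditionalGet
{
    // Returns true if a 304 was sent and the caller should skip rendering.
    public static bool TrySend304(HttpContext context, string currentETag)
    {
        string quoted = "\"" + currentETag + "\"";
        context.Response.Cache.SetCacheability(HttpCacheability.Public);
        context.Response.Cache.SetETag(quoted);            // validator for this response

        string ifNoneMatch = context.Request.Headers["If-None-Match"];
        if (ifNoneMatch == quoted)
        {
            context.Response.StatusCode = 304;             // Not Modified, send no body
            return true;
        }
        return false;
    }
}

In a page or handler you would call this early in the request and return immediately when it reports a match.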
Edit:
I said "see note below" and didn't add the note.
The last modified time of any resource is the most recent of:
Anything (script, code-behind) used to create the entity sent.
Anything used as part of it.
Anything that was used as part of it, that has now been deleted.
Of these, number 3 can be the trickiest to work out, since it has, after all, been deleted. One solution is to keep track of changes to the resource itself and update this on deletion of anything used to create it; the other is to have a "soft delete" where you keep the item, but mark it as deleted and don't use it in any other way. Just what the best way to track this is depends on the application.
You should just create separate .js and .css files and the browser does the caching for you. It is also a good idea to use a JS minifier that removes all the whitespace from the JS files.
If you have a huge ViewState, like > 100 KB, try to reduce it as well. If the ViewState is still huge, you can store the ViewState on the server as a file...
http://aspalliance.com/472
You might also use the caching on the page if the page is not too dynamic...
http://msdn.microsoft.com/en-us/library/06bh14hk.aspx
You can also reference common JS and CSS libraries from trusted online hosts. For example, if you add jQuery as <script src="http://code.jquery.com/jquery-latest.js"></script>, the jQuery file has probably already been cached by the client's browser, because another web site referenced this address before yours did, even if it was never cached because of your web site.
This approach has its pros and cons, but it is an option.
Also, I don't know whether YSlow's score changes with this approach.

asp.net c# speed up of classes

I work on a big project at a company. We collect data which we get via API methods of the CMS.
ex.
DataSet users = CMS.UserHelper.GetLoggedUser(); // returns dataset with users
Now on some pages we need many different kinds of data, not just users, but also nodes of the CMS tree or specific data from a subtree.
So we thought of writing our own "helper class" from which we can later get different data easily.
ex:
MyHelperClass.GetUsers();
MyHelperClass.Objects.GetSingleObject( ID );
Now the problem is that our "Helper Class" is really big, and now we'd like to collect different data from the "Helper Class" and write it into a typed dataset. Later we can give a repeater that typed dataset, which contains data from different tables (which even comes from the methods I wrote before via the API).
The problem is: it is so slow now, even when just loading the page! Does it load or initialize the whole class?
By the way CMS is Kentico if anyone works with it.
I'm tired. I tried the whole night, but it's so slow. Please take a look at the architecture.
Maybe you will find some crimes that aren't allowed :S
I hope we can get it to work faster. Thank you.
Class diagram: http://img705.imageshack.us/img705/3087/classj.jpg
Bottlenecks usually come in a few forms:
Slow or flaky network.
Heavy reading/writing to disk, as disk IO is 1000s of times slower than reading or writing to memory.
CPU throttle caused by long-running or inefficiently implemented algorithm.
Lots of things could affect this, including your database queries and indexes, the number of people accessing your site, lack of memory on your web server, lots of reflection in your code, just plain slow hardware etc. No one here can tell you why your site is slow, you need to profile it.
For what it's worth, you asked a question about your API architecture -- from a code point of view, it looks fine. There's nothing wrong with copying fields from one class to another, and the performance penalty incurred by wrapper class casting from object to Guid or bool is likely to be so tiny that it's negligible.
Since you asked about performance, it's not very clear why you're connecting class architecture to performance. There are really, really tiny micro-optimizations you could apply to your classes which may or may not affect performance -- but the four or five nanoseconds you'll gain with those micro-optimizations have already been lost simply by reading this answer. Network latency and DB queries will absolutely dwarf the performance subtleties of your API.
In a comment, you stated "so there is no problem with static classes or a basic mistake of me". Performance-wise, no. From a web-app point of view, probably. In particular, static fields are global and initialized once per AppDomain, not per session -- the variables mCurrentCultureCode and mcurrentSiteName sound session-specific, not global to the AppDomain. I'd double-check those to see whether your site renders correctly when users with different culture settings access the site at the same time.
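If those two values really are per-user, a sketch of the kind of change meant here is below. The property names mirror the ones mentioned above, and session state is just one possible home for them:

using System.Web;

// Hypothetical fix: keep per-user values in session state (or HttpContext.Items)
// instead of static fields, which are shared by every request in the AppDomain.
public static class RequestSettings
{
    public static string CurrentCultureCode
    {
        get { return HttpContext.Current.Session["CurrentCultureCode"] as string; }
        set { HttpContext.Current.Session["CurrentCultureCode"] = value; }
    }

    public static string CurrentSiteName
    {
        get { return HttpContext.Current.Session["CurrentSiteName"] as string; }
        set { HttpContext.Current.Session["CurrentSiteName"] = value; }
    }
}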
Are you already using Caching and Session state?
The basic idea being to defer as much of the data loading to these storage mediums as possible and not do it on individual page loads. Caching especially can be useful if you only need to get the data once and want to share it between users and over time.
If you are already doing these things, or can't directly implement them, try deferring as much of this data gathering as possible, opting to short-circuit it and not do the loading up front. If the data is only occasionally used, this can also save you a lot of time in page loads.
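A generic helper along these lines is one way to push the loading into the cache rather than into each page load (the names and the timeout are made up; Func<T> needs .NET 3.5):

using System;
using System.Web;
using System.Web.Caching;

// Hypothetical "get or load" helper: pages ask the cache first and only pay
// the loading cost when the item is missing or has expired.
public static class CacheHelper
{
    public static T GetOrLoad<T>(string key, TimeSpan lifetime, Func<T> load) where T : class
    {
        var cached = HttpRuntime.Cache[key] as T;
        if (cached != null)
            return cached;

        T value = load();                              // the expensive call (CMS API, DB, ...)
        HttpRuntime.Cache.Insert(key, value, null,
            DateTime.UtcNow.Add(lifetime),             // absolute expiration
            Cache.NoSlidingExpiration);
        return value;
    }
}

// Usage in a page, for example:
// DataSet users = CacheHelper.GetOrLoad<DataSet>("AllUsers",
//     TimeSpan.FromMinutes(10), () => MyHelperClass.GetUsers());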
I suggest you try to profile your application and see where the bottlenecks are:
Slow load from the DB?
Slow network traffic?
Slow rendering?
Too much traffic for the client?
Profiling should be part of almost every senior programmer's skill set. It's part of the general toolbox. Learn it, and you'll have the answers yourself.
Cheers!
First things first... Enable tracing for your application, try to optimize response size and caching, and work with some application and DB profilers... I'm afraid that by just looking at the code, no one will be able to help you much.

Hints for a high-traffic web service, c# asp.net sql2000

I'm developing a web service whose methods will be called from a "dynamic banner" that will show a sort of queue of messages read from a sql server table.
The banner will be under heavy pressure on the home pages of high-traffic sites; every time the banner is loaded, it will call my web service in order to obtain the new queue of messages.
Now: I don't want all this traffic to drive queries to the database every time the banner is loaded, so I'm thinking of using the ASP.NET cache (i.e. HttpRuntime.Cache[cacheKey]) to limit database accesses; I will try to have a cache refresh every minute or so.
Obviously I'll try to keep the messages as small as possible, to limit traffic.
But maybe there are other ways to deal with such a scenario; for example I could write the last version of the queue on the file system, and have the web service access that file; or something mixing the two approaches...
The solution is c# web service, asp.net 3.5, sql server 2000.
Any hint? Other approaches?
Thanks
Andrea
It depends on a lot of things:
If there is little change in the data (think of a backend with a "publish" button, or daily batches), then I would definitely use static files (updated via push from the backend). We used this solution on a couple of large sites and it worked really well.
If the data is small enough, memory caching (i.e. Http Cache) is viable, but beware of locking issues and also beware that Http Cache will not work that well under heavy memory load, because items can be expired early if the framework needs memory. I have been bitten by it before! With the above caveats, Http Cache works quite well.
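For the one-minute refresh the question describes, the usual pattern is an absolute expiration plus a lock, so that when the entry expires only one request per server goes back to SQL Server. All names here are invented for illustration:

using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

// Hypothetical banner-message cache: at most one DB query per minute per
// server, with a lock so concurrent requests don't all hit the database
// when the entry expires.
public static class BannerMessageCache
{
    private const string Key = "BannerMessages";      // illustrative cache key
    private static readonly object Sync = new object();

    public static List<string> GetMessages()
    {
        var messages = HttpRuntime.Cache[Key] as List<string>;
        if (messages != null)
            return messages;

        lock (Sync)
        {
            messages = HttpRuntime.Cache[Key] as List<string>;
            if (messages != null)
                return messages;

            messages = LoadMessagesFromDatabase();     // the only place SQL Server is touched
            HttpRuntime.Cache.Insert(Key, messages, null,
                DateTime.UtcNow.AddMinutes(1),         // refresh roughly every minute
                Cache.NoSlidingExpiration);
            return messages;
        }
    }

    private static List<string> LoadMessagesFromDatabase()
    {
        // Placeholder for the real SELECT against the messages table.
        return new List<string>();
    }
}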
I think caching is a reasonable approach and you can take it a step further and add a SQL Dependency to it.
ASP.NET Caching: SQL Cache Dependency With SQL Server 2000
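If you do try the SQL dependency route, note that a polling-based SqlCacheDependency is the only kind SQL Server 2000 supports; it needs the database and table enabled with aspnet_regsql and a matching <sqlCacheDependency> entry in web.config. A hedged sketch, with made-up names:

using System;
using System.Data;
using System.Web;
using System.Web.Caching;

// Hypothetical use of a polling-based SqlCacheDependency. Assumes the database
// and the Messages table have been enabled with aspnet_regsql and that
// web.config contains a <sqlCacheDependency> entry named "MessagesDb".
public static class MessageCache
{
    public static DataTable GetMessages()
    {
        var table = HttpRuntime.Cache["Messages"] as DataTable;
        if (table == null)
        {
            table = LoadMessagesFromDatabase();       // the real query is not shown here
            HttpRuntime.Cache.Insert(
                "Messages",
                table,
                new SqlCacheDependency("MessagesDb", "Messages"),  // databaseEntryName, tableName
                Cache.NoAbsoluteExpiration,
                Cache.NoSlidingExpiration);
        }
        return table;
    }

    private static DataTable LoadMessagesFromDatabase()
    {
        return new DataTable("Messages");
    }
}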
If you go the file route, keep this in mind.
http://petesbloggerama.blogspot.com/2008/02/aspnet-writing-files-vs-application.html
Writing a file is a better solution IMHO - it's served by IIS kernel code, without the huge ASP.NET overhead, and you can copy the file to CDNs later.
AFAIK dependency caching is not very efficient with SQL Server 2000.
Also, one way to get around the memory limitation mentioned by Skliwz is that if you are using this service outside of the normal application, you can isolate it in its own app pool. I have seen this done before, and it helps as well.
Thanks all. As the data is small in size but the underlying tables will change, I think I'll go the HttpCache way: I actually need a way to reduce DB access even while the data is changing (that's the reason for not using a direct SQL dependency as suggested by #Bloodhound).
I'll run some stress tests before going public, I think.
Thanks again all.
Of course you could (should) also use the caching features in the SixPack library.
Forward (normal) cache, based on HttpCache, which works by putting attributes on your class. Simplest to use, but in some cases you have to wait for the content to actually be fetched from the database.
Pre-fetch cache, from scratch, which, after the first call, will start refreshing the cache behind the scenes; in some cases you are guaranteed to have content without waiting.
More info on the SixPack library homepage. Note that the code (especially the forward cache) is load tested.
Here's an example of simple caching:
[Cached]
public class MyTime : ContextBoundObject
{
    [CachedMethod(1)]
    public DateTime Get()
    {
        Console.WriteLine("Get invoked.");
        return DateTime.Now;
    }
}
