Hints for a high-traffic web service (C#, ASP.NET, SQL Server 2000)

I'm developing a web service whose methods will be called from a "dynamic banner" that shows a sort of queue of messages read from a SQL Server table.
The banner will feature heavily on the home pages of high-traffic sites; every time the banner loads, it will call my web service to obtain the latest queue of messages.
Now, I don't want all this traffic to drive queries to the database every time the banner loads, so I'm thinking of using the ASP.NET cache (i.e. HttpRuntime.Cache[cacheKey]) to limit database access; I will try to refresh the cache every minute or so.
Obviously I'll try to keep the messages as small as possible, to limit traffic.
But maybe there are other ways to deal with such a scenario; for example, I could write the latest version of the queue to the file system and have the web service read that file, or use something that mixes the two approaches...
The solution is a C# web service on ASP.NET 3.5 with SQL Server 2000.
Any hints? Other approaches?
Thanks
Andrea

It depends on a lot of things:
If there is little change in the data (think backend with a "publish" button or daily batches), then I would definitely use static files (updated via push from the backend). We used this solution on a couple of large sites and it worked really well.
If the data is small enough, memory caching (i.e. Http Cache) is viable, but beware of locking issues, and beware that Http Cache does not work that well under heavy memory load, because items can be evicted early if the framework needs memory. I have been bitten by it before! With the above caveats, Http Cache works quite well.
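As a rough illustration of the "refresh every minute" idea and of the locking caveat, here is a minimal sketch that uses only base class library types instead of HttpRuntime.Cache, so that it stays self-contained. The class name TimedCache and the injectable clock are assumptions made for this example; the load delegate stands in for the database query.

```csharp
using System;

// Hypothetical helper: caches one value and reloads it at most once per lifetime.
public sealed class TimedCache<T>
{
    private readonly object _sync = new object();
    private readonly Func<T> _load;          // stands in for the database query
    private readonly TimeSpan _lifetime;
    private readonly Func<DateTime> _clock;  // injectable so expiry can be tested
    private T _value;
    private DateTime _expiresUtc = DateTime.MinValue;

    public TimedCache(Func<T> load, TimeSpan lifetime, Func<DateTime> clock = null)
    {
        _load = load;
        _lifetime = lifetime;
        _clock = clock ?? (() => DateTime.UtcNow);
    }

    public T Get()
    {
        // One lock around check-and-reload: concurrent callers wait for the
        // single reload instead of each querying the database themselves.
        lock (_sync)
        {
            DateTime now = _clock();
            if (now >= _expiresUtc)
            {
                _value = _load();
                _expiresUtc = now + _lifetime;
            }
            return _value;
        }
    }
}
```

The trade-off in this sketch is that every caller blocks while a reload is in progress. That is usually fine when the query is fast, but it is exactly the kind of locking behaviour worth stress testing.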

I think caching is a reasonable approach and you can take it a step further and add a SQL Dependency to it.
ASP.NET Caching: SQL Cache Dependency With SQL Server 2000

If you go the file route, keep this in mind.
http://petesbloggerama.blogspot.com/2008/02/aspnet-writing-files-vs-application.html

Writing a file is a better solution IMHO: it's served by IIS kernel code, without the huge ASP.NET overhead, and you can copy the file to CDNs later.
AFAIK dependency caching is not very efficient with SQL Server 2000.
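If you go the file route, one thing to watch is that readers should not pick up a half-written file. A minimal sketch, assuming the file is written by a single publisher process; the class name QueueFilePublisher and the ".tmp" suffix are made up for this example:

```csharp
using System;
using System.IO;

public static class QueueFilePublisher
{
    // Writes the content to a temporary file next to the target, then swaps it
    // into place so readers should not observe a partially written file.
    public static void Publish(string targetPath, string content)
    {
        if (targetPath == null) throw new ArgumentNullException(nameof(targetPath));

        string tempPath = targetPath + ".tmp";
        File.WriteAllText(tempPath, content);

        if (File.Exists(targetPath))
        {
            // Replaces the existing file in one step; no backup copy is kept.
            File.Replace(tempPath, targetPath, null);
        }
        else
        {
            File.Move(tempPath, targetPath);
        }
    }
}
```

The temporary file lives in the same directory as the target so that the final swap stays on one volume.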

Also, one way to get around the memory limitation mentioned by Skliwz: if you are using this service outside of the normal application, you can isolate it in its own app pool. I have seen this done before, and it helps as well.

Thanks all. As the data are small in size but the underlying tables will change, I think I'll go the HttpCache way: what I actually need is a way to reduce DB access even while the data are changing (which is why I'm not using a direct SQL dependency as suggested by @Bloodhound).
I'll run some stress tests before going public, I think.
Thanks again all.

Of course you could (should) also use the caching features in the SixPack library.
Forward (normal) cache, based on HttpCache, which works by putting attributes on your class. Simplest to use, but in some cases you have to wait for the content to actually be fetched from the database.
Pre-fetch cache, written from scratch, which after the first call starts refreshing the cache behind the scenes, so that in some cases you are guaranteed to have content without waiting.
More info on the SixPack library homepage. Note that the code (especially the forward cache) is load tested.
Here's an example of simple caching:
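using System;
// [Cached] and [CachedMethod] are SixPack attributes; the library's own
// namespace must be imported as well.

[Cached]
public class MyTime : ContextBoundObject
{
    [CachedMethod(1)]
    public DateTime Get()
    {
        // Printed only when the method body actually runs.
        Console.WriteLine("Get invoked.");
        return DateTime.Now;
    }
}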


Asp.Net core distributed caching

I am currently using MemoryCache _cache = new MemoryCache(new MemoryCacheOptions()); to cache some data from the database that does not change often, but does change.
On create/update/delete of that data I refresh the cache.
This works fine, but the problem is that in production we will have several nodes, so when, for instance, the method for creating a record is called, the cache will be refreshed only on that node and not on the others, which will then have stale data.
My question is: can I somehow fix this using MemoryCache, or do I need to do something else, and if so, what are the possible solutions?
I think what you are looking for is distributed caching.
Using the IDistributedCache interface you can use either Redis or SQL Server, and it supplies basic Get/Set/Remove methods. Changes made on one node will be available to other nodes.
Using Redis is a great way of sharing session-type data between servers in a load-balanced environment; SQL Server does not seem to be a great fit, given that you seem to be caching to avoid DB calls.
It might also be worth considering whether you are actually complicating things by caching in the first place. When you have a single application you see the benefit, as keeping the records in application memory saves a request over the network, but in a load-balanced scenario you have to compare retrieving those records from a distributed cache with retrieving them from the database.
If the data is just an in-memory copy of a relatively small database table, then there is probably not a lot to choose between the two performance-wise. If the data is based on a complicated, expensive query, then the cache is the way to go.
If you are making hundreds of requests a minute for the data, then any network request may be too much, but you can ask what the consequences are of the data being a little stale. For example, if you update a record, and the new record is not available immediately on every server, does your application break? Or does the change just occur in a more phased way? In that case you could keep your in-process memory cache and just use a shorter time to live.
If you really need every change to propagate to every node straight away then you could consider using a library like Cache Manager in conjunction with Redis which can combine an in memory cache and synchronisation with a remote cache.
Somewhat dated question, but maybe still useful: I agree with what ste-fu said, well explained.
I'll only add that, on top of CacheManager, you may want to take a look at FusionCache ⚡🦥, which I recently released.
On top of supporting an optional distributed 2nd layer transparently managed for you, it also has some other nice features, like an optimization that prevents multiple concurrent factories for the same cache key from being executed (less load on the source database), a fail-safe mechanism, and advanced timeouts with background factory completion.
If you give it a chance, please let me know what you think.
/shameless-plug
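To make the "only one factory per key" idea mentioned above concrete, here is a minimal sketch using only base class library types. It is not FusionCache's API or implementation; the class name SingleFlightCache is made up for this example, and there is no expiry or distributed layer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: ensures the factory for a given key runs at most once,
// even when many threads ask for that key at the same time.
public sealed class SingleFlightCache<T>
{
    private readonly ConcurrentDictionary<string, Lazy<T>> _entries =
        new ConcurrentDictionary<string, Lazy<T>>();

    public T GetOrAdd(string key, Func<T> factory)
    {
        // GetOrAdd may construct several Lazy wrappers under contention, but only
        // one is stored, and ExecutionAndPublication runs its factory only once.
        Lazy<T> lazy = _entries.GetOrAdd(
            key,
            _ => new Lazy<T>(factory, LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value;
    }

    public void Remove(string key)
    {
        Lazy<T> ignored;
        _entries.TryRemove(key, out ignored);
    }
}
```

Note that with this mode a factory exception is cached by the Lazy instance, so a real implementation would probably want to remove the entry on failure.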

Is It Bad Practice To Use Static Members In ASP.NET Website?

I understand that a static member will be shared by all users of an ASP.NET website; but in this particular case - that's exactly what I want.
It's a private-use webpage I threw together to facilitate web-based chatting between two users. I wanted to avoid persisting data to a database or a datafile, and thought I could store the last X messages in a static concurrent queue. This seems to work great on my development machine.
I'm very inexperienced with ASP.NET, but none of the examples I've found use this approach. Is this bad practice? Are there 'gotchas' I should be aware of? The alternative, as far as I can see, is to use a database. But I felt it would take more effort and, my guess is, more resources (I figure my 'buffer' of messages will take about 40kb of memory and save quite a few trips to the database).
Assuming that you make sure that the entire thing is thread-safe, that will work.
However, IIS can recycle your AppDomain at any time, so your queue may get blown away when you don't expect it.
Even if IIS wouldn't flush and restart your AppDomain every now and then, using static variables for this purpose sounds like a smelly hack to me.
The HttpApplicationState class provides access to an application-wide cache you can use to store information.
ASP.NET Application State Overview
This is perfectly fine as long as your requirements don't change and you are OK with randomly losing all messages on the server side.
I would slightly refactor the code to provide a "message storage" interface to simplify testing (with a potential benefit in the future if you decide to make it more complicated/persisted/multi-user).
Pros of the static storage approach (or HttpApplicationState):
no issues with server side storage of the messages - less privacy concerns. Nothing is stored forever so you can say whatever you want.
extremely simple implementation.
perfect for IM / phone conversation.
unlikely to have performance problems in single server case
Cons:
messages can be lost. Can be mitigated by storing history on the client (i.e. retrieving message with AJAX queries on the same web page)
requires more care if the data is sensitive and more users are involved, or if the application is shared with other code, as static data is visible to everyone. That said, this is not much different from any other storage.
Can't be directly migrated to multiple servers/web garden scenario. Really unlikely issue for 2 person chat server.
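The "message storage" interface suggested above could look roughly like the following. This is only a sketch: the names IMessageStore and InMemoryMessageStore are made up, and the buffer is held in memory, so it is still lost on an AppDomain recycle.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical abstraction so the storage can later be swapped or faked in tests.
public interface IMessageStore
{
    void Add(string message);
    IList<string> GetRecent();
}

// In-memory store that keeps only the most recent messages.
public sealed class InMemoryMessageStore : IMessageStore
{
    private readonly ConcurrentQueue<string> _messages = new ConcurrentQueue<string>();
    private readonly int _capacity;

    public InMemoryMessageStore(int capacity)
    {
        _capacity = capacity;
    }

    public void Add(string message)
    {
        _messages.Enqueue(message);
        // Trim the oldest entries. Under contention the buffer can briefly hold
        // slightly more or fewer items than the capacity.
        string dropped;
        while (_messages.Count > _capacity && _messages.TryDequeue(out dropped))
        {
        }
    }

    public IList<string> GetRecent()
    {
        // ToArray returns a snapshot that is safe to enumerate.
        return _messages.ToArray();
    }
}
```

A single static instance of the store would give the behaviour described in the question, while page code depends only on the interface.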
Sure, one gotcha I've seen in the past has been the use of static variables with Web Gardens.
See this SO question:
Web Garden and Static Objects difficult to understand
Note a key point from the discussion:
Static objects are not shared in web gardens/web farms.

What is the most cost-effective way to break up a centralised database?

Following on from this question...
What to do when you’ve really screwed up the design of a distributed system?
... the client has reluctantly asked me to quote for option 3 (the expensive one), so they can compare prices to a company in India.
So, they want me to quote (hmm). In order for me to get this as accurate as possible, I will need to decide how I'm actually going to do it. Here are 3 scenarios...
Scenarios
Split the database
My original idea (perhaps the most tricky) will yield the best speed on both the website and the desktop application. However, it may require some synchronising between the two databases, as the two "systems" are so heavily connected. I've learnt that, if not done properly and not tested thoroughly, synchronisation can be hell on earth.
Implement caching on the smallest system
To side-step the sync option (which I'm not fond of), I figured it may be more productive (and cheaper) to move the entire central database and web service to their office (i.e. in-house), and have the website (still on the hosted server) download data from the central office and store it in a small database (acting as a cache)...
Set up a new server in the customer's office (in-house).
Move the central database and web service to the new in-house server.
Keep the web site on the hosted server, but alter the web service URL so that it points to the office server.
Implement a simple cache system for images and most frequently accessed data (such as product information).
... the down-side is that when the end-user in the office updates something, their customers will effectively be downloading the data from a 60KB/s upload connection (albeit once, as it will be cached).
Also, not all data can be cached, for example when a customer updates their order. Also, connection redundancy becomes a huge factor here; what if the office connection is offline? Nothing to do but show an error message to the customers, which is nasty, but a necessary evil.
Mystery option number 3
Suggestions welcome!
SQL replication
I had considered MSSQL replication. But I have no experience with it, so I'm worried about how conflicts are handled, etc. Is this an option? Considering there are physical files involved, and so on. Also, I believe we'd need to upgrade from SQL express to SQL non-free, and buy two licenses.
Technical
Components
ASP.Net website
ASP.net web service
.Net desktop application
MSSQL 2008 express database
Connections
Office connection: 8 mbit down and 1 mbit up contended line (50:1)
Hosted virtual server: Windows 2008 with 10 megabit line
Having just read for the first time your original question related to this I'd say that you may have laid the foundation for resolving the problem simply because you are communicating with the database by a web service.
This web service may well be the saving grace as it allows you to split the communications without affecting the client.
A good while back I was involved in designing just such a system.
The first thing we identified was the data that rarely changes, and we immediately locked all of this out of consideration for distribution. A manual administration process through the web server was the only way to change this data.
The second thing we identified was the data that should be owned locally. By this I mean data that only one person or location at a time would need to update, but that may need to be viewed at other locations. We fixed all of the keys on the related tables to ensure that duplication could never occur and that no auto-incrementing fields were used.
The third item was the tables that were truly shared, and although we worried a lot about these during stages 1 & 2, in our case this part was straightforward.
When I'm talking about a server here I mean a DB Server with a set of web services that communicate between themselves.
As designed our architecture had 1 designated 'master' server. This was the definitive for resolving conflicts.
The rest of the servers were, in the first instance, a large cache of anything covered by item 1. In fact it wasn't a large cache but a database duplication, but you get the idea.
The second function of each non-master server was to coordinate changes with the master. This involved a very simplistic process of passing most of the work through transparently to the master server.
We spent a lot of time designing and optimising all of the above - to finally discover that the single best performance improvement came from simply compressing the web service requests to reduce bandwidth (but it was over a single channel ISDN, which probably made the most difference).
The fact is that if you do have a web service then this will give you greater flexibility about how you implement this.
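For reference, compressing a payload before it goes over the wire can be as small as the following sketch. It uses GZipStream from the base class library; the class name PayloadCompression is made up, and how the bytes are attached to the web service request is left out.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class PayloadCompression
{
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the buffer so that
            // the final compressed block and footer are flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Repetitive XML or SOAP payloads tend to compress well, which is consistent with the bandwidth savings described above, but the actual ratio depends on the data.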
I'd probably start by investigating the feasibility of implementing one of the SQL Server replication methods.
Usual disclaimers apply:
Splitting the database will not help a lot, but it will add a lot of nightmares. IMO, you should first try to optimize the database: update some indexes or maybe add several more, optimize some queries, and so on. For database performance tuning I recommend reading some articles from simple-talk.com.
Also, in order to save bandwidth, you can add bulk processing to your Windows client and add zipping (archiving) to your web service.
And you should probably upgrade to MS SQL 2008 Express; it's also free.
It's hard to recommend a good solution for your problem using the information I have. It's not clear where the bottleneck is. I strongly recommend that you profile your application to find the exact location of the bottleneck (e.g. is it in the database, or in a fully used-up channel, and so on) and add a description of it to the question.
EDIT 01/03:
When the bottleneck is the upload connection, you can only do the following:
1. Add archiving (compression) of messages to the service and client
2. Implement bulk operations and use them
3. Try to reduce the number of operations per use case for the most frequent cases
4. Add a local database for the Windows clients, perform all operations against it, and synchronize the local DB with the main one on a timer.
SQL replication will not help you a lot in this case. The fastest and cheapest solution is to increase the upload bandwidth, because all other ways (except the first one) will take a lot of time.
If you choose to rewrite the service to support bulk operations, I recommend having a look at the Agatha Project.
Actually, hearing how many users they have on that one connection, it may be time to up the bandwidth at the office (not at all my normal response). If you factor out the CRM system, what else is a top user of the bandwidth? It may be that they have reached the point of needing more bandwidth, period.
But I am still curious to see how much of the information you are passing is actually getting used. Make sure you are transferring efficiently. Is there any chance you could add some quick, easy measures to see how much people are actually consuming when looking at the data?

Strategies for Cache Access During Refresh?

I’m looking for some strategies for accessing cached data that resides in an internal company web service. More precisely, for preventing access to the cached data while the cache is being refreshed.
We have a .Net 3.5 C# web service running on a web farm that maintains a cache of a half-dozen or so datasets. This data consists of configuration-related items that are referenced by the ‘real’ business logic domain that is also running in this web service, as well as being returned for any client uses. We are probably talking about a total of a dozen or so tables with a few thousand records in them.
We implemented a caching mechanism using the MS Enterprise Library 4.1. No huge reason for using this over the ASP.Net cache except that we were already using Enterprise Library for some other things and we liked the cache expiration handling. This is the first time that we have implemented some caching here so maybe I’m missing something fundamental…
This configuration data doesn’t get changed too often – probably a couple of times a day. When this configuration data does change we update the cache on the particular server the update request went to with the new data (the update process goes through the web service). For those other servers in the web farm (currently a total of 3 servers), we have the cache expiration set to 15 minutes upon which the data is re-loaded from the single database that all servers in the farm hit. For our particular purposes, this delay between servers is acceptable (although I guess not ideal).
During this refresh process, other requests could come in that require access to the data. Since the request could come during an expiration/refresh process, there is no data currently in the cache, which obviously causes issues.
What are some strategies to resolve this? If this were running in a single-domain, WinForms type of application, we could hack something up that would prevent access during the refresh by the use of class variables/loops, threading/mutex, or some other singleton-like structure. But I’m leery of implementing something like that on a web farm. Should I be? Is a distributed server caching mechanism the way to go instead of each server having its own cache? I would like to avoid doing that for now if I could and come up with some coding to get around this problem. Am I missing something?
Thanks for any input.
UPDATE: I was going to use the Lock keyword functionality around the expiration action that subsequently refreshes the data, but I was worried about doing this on a web server. I think that would have worked although it seems to me that there still would be a possibility (although a lesser one) that we could have grabbed data from the empty cache between the time it expired and the time the lock was entered (the expiration action occurs on another thread I think). So what we did was if there was no data in the cache during a regular request for data we assume that it is in the process of being refreshed and just grab the data from the source instead. I think this will work since we can assume that the cache should be filled at all times since the initial cache filling process will occur when the singleton class that holds the cache is created when a web service request is first made. So if the cache is empty it truly means that it is currently being filled, which normally only takes a few seconds so any requests for data from the cache during that time will be the only ones that aren't hitting the cache.
If anyone with experience would like to shed any more light on this, it would be appreciated.
It sounds to me like you are already serving out stale data. So, if that is allowed, why don't you populate a new copy of the cache when you discover it's old, and only switch to using it once it's completely populated?
It really depends on the updating logic. Where is it that you decide to update the cache? Can you propagate the update to all the servers in the farm? Then you should lock while updating. If your update process is initiated by a user action, can you let the other servers know that they should expire their cache?
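The "populate a new copy, then switch" idea can be sketched as follows. This uses plain base class library types rather than Enterprise Library, and the class name SwappableCache and the load delegate are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical holder for read-mostly configuration data.
public sealed class SwappableCache<TValue>
{
    private Dictionary<string, TValue> _current = new Dictionary<string, TValue>();

    // Readers always see a fully populated snapshot, either the old or the new one.
    public IReadOnlyDictionary<string, TValue> Snapshot
    {
        get { return Volatile.Read(ref _current); }
    }

    public void Refresh(Func<IEnumerable<KeyValuePair<string, TValue>>> load)
    {
        // Build the replacement off to the side; readers keep using the old copy.
        var fresh = new Dictionary<string, TValue>();
        foreach (var pair in load())
        {
            fresh[pair.Key] = pair.Value;
        }
        // Publish with a single atomic reference swap.
        Interlocked.Exchange(ref _current, fresh);
    }
}
```

Because the old dictionary is never modified after publication, requests that arrive during a refresh simply read the previous snapshot, and if the load throws, the old data stays in place.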

Long-term Static Page Caching

I maintain several client sites that have no dynamic data whatsoever; everything is static ASP.NET with C#.
Are there any pitfalls to caching the entire page for extreme periods of time, like a week?
Kibbee, we use a couple of controls on the sites (ad rotator, some of the AJAX extensions). They could probably be written completely in HTML, but for convenience's sake I just stuck with what we use for every other site.
The only significant pitfall of long cache times occurs when you want to update the data. To be safe, you have to assume that it will take up to a week for the new version to become available. Intermediate hosts such as ISP-level proxy servers often do cache aggressively, so this delay will happen.
If there are large files to be cached, I'd look at ensuring your content engine supports If-Modified-Since.
For smaller files (page content, CSS, images, etc.), where reducing the number of round trips is the key, having a long expiry time (a year?) and changing the URL when the content changes is best. This lets you control when user agents will fetch the new content.
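One common way to "change the URL when the content changes" is to derive part of the URL from a hash of the content. A minimal sketch, where the class name AssetUrls, the "v" query parameter, and the 8-character hash length are all arbitrary choices for this example:

```csharp
using System.Security.Cryptography;
using System.Text;

public static class AssetUrls
{
    // Appends a short content hash so the URL changes whenever the content does.
    public static string Versioned(string path, byte[] content)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(content);
            var hex = new StringBuilder();
            for (int i = 0; i < 4; i++)
            {
                hex.Append(hash[i].ToString("x2"));
            }
            return path + "?v=" + hex.ToString();
        }
    }
}
```

Pages would then link to the versioned URL, so unchanged files keep their long-lived cache entries while changed files are fetched under a new URL.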
Yahoo! have published a two part article on reducing HTTP requests and browser cache usage. I won't repeat it all here, but these are good reads which will guide you on what to do.
My feeling is to pick a time period high enough to cover most users' single sessions but low enough not to cause too much inconvenience should you wish to update the content. Be sure to support If-Modified-Since if you have a Last-Modified for all your content.
Finally, if your content is cacheable at all and you need to push new content out now, you can always use a new URL. This final cacheable content URL can sit behind a fixed HTTP 302 redirect URL should you wish to publish a permanent link to the latest version.
We have a similar issue on a project I am working on. There is data that is pretty much static, but is open to change.
What I ended up doing is saving the data to a local file and then monitoring it for changes. The DB server is then never hit unless we remove the file, in which case it will scoot off to the DB and regenerate the data file.
So basically we have a little bit of disk IO while loading/saving and no traffic to the DB server unless necessary, and we are still in control of it (we can delete the file manually, script it, etc.).
I should also add that you could then tie this up with the actual web server caching model if you wanted to reduce the disk IO (we didn't really need to in our case).
This could be totally the wrong way to go about it, but it seems to work quite nicely for us :)
If it's static, why bother caching at all? Let IIS worry about it.
When you say that you have no data, how are you even using ASP.NET or C#? What functionality does that provide you over plain HTML? Also, if you do plan on caching, it's probably best to cache to a file and then, when a request is made, stream out the file. The OS will take care of keeping the file in memory so that you won't have to read it off the disk all the time.
You may want to build in a cache updating mechanism if you want to do this, just to make sure you can clear the cache if you need to do a code update. Other than that, there aren't any problems that I can think of.
If it is static you would probably be better off generating the pages once and then serve up the resulting static HTML file directly.
