I understand that each page refresh, especially in 'AjaxLand', causes my back-end/code-behind class to be called from scratch... This is a problem because my class (which is a member object in System.Web.UI.Page) contains A LOT of data that it sources from a database. So now every page refresh in AjaxLand is causing me to make large backend DB calls, rather than just reuse a class object from memory. Any fix for this? Is this where session variables come into play? Are session variables the only option I have to retain an object in memory that is linked to a single-user and a single-session instance?
You need ASP.Net Caching.
Specifically Data Caching.
If your data is user-specific, then Session would be the way to go. Be careful if you have a web farm or web garden; in that case you'll need a session state server or database for your sessions.
If your data is application-level, then the Application Data Cache could be the way to go. Be careful if you have limited RAM and your data is huge; the cache can empty itself at an inopportune moment.
Either way, you'll need to test how your application performs with your changes. You may even find going back to the database to be the least bad option.
In addition, you could have a look at Lazy Loading some of the data, to make it less heavy.
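The lazy-loading idea above can be sketched with the built-in Lazy&lt;T&gt; type: the expensive data is only fetched the first time it is actually accessed, so a page that never touches it never pays the DB cost. The class and member names below are illustrative, and the loader is a stand-in for a real database call.

```csharp
using System;

// Hypothetical page-backing class: the heavy data is only loaded
// from the database the first time it is actually accessed.
public class HeavyPageData
{
    private readonly Lazy<string[]> _entries;

    public HeavyPageData()
    {
        // The factory delegate runs at most once, on first access.
        _entries = new Lazy<string[]>(LoadEntriesFromDatabase);
    }

    public bool IsLoaded => _entries.IsValueCreated;

    public string[] Entries => _entries.Value;

    private static string[] LoadEntriesFromDatabase()
    {
        // Stand-in for a real DB call.
        return new[] { "entry1", "entry2" };
    }
}

public static class Program
{
    public static void Main()
    {
        var data = new HeavyPageData();
        Console.WriteLine(data.IsLoaded);        // nothing fetched yet
        Console.WriteLine(data.Entries.Length);  // triggers the load
        Console.WriteLine(data.IsLoaded);        // now loaded
    }
}
```

By default Lazy&lt;T&gt; is also thread-safe, so two concurrent first accesses still run the loader only once.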
Take a look at this MS article on various caching mechanisms for ASP.NET. There is a section named "Cache arbitrary objects in server memory" that may interest you.
Since you mention Ajax, I think you might want to consider the following points:
Assume this large data set is static, not transient. On the first Ajax call, your app queries the database, retrieves lots of data, and returns it to the client (i.e. the browser / JavaScript running in the browser); the client now has all of that in memory already. Subsequently, there's no need to go back to the server for the same data the client already has in memory. What you need to do is use JavaScript to rebuild the DOM or whatever. Everything can be done on the client from this point on.
Now assume the data is not static but transient. Caching on the server by putting it in the session won't be the solution you want anyway: every time your client sends a request, if the server just returns what's in the cache (session), that data is already stale and no different from the data the client already has in memory.
The point is: if the data is static, save round trips to the server once you already have it in memory. If the data is transient, I'm afraid there's no cheap solution except re-querying or re-retrieving the data somehow and sending everything back to the client.
Related
I am currently using MemoryCache _cache = new MemoryCache(new MemoryCacheOptions()); for caching some data from the database that does not change often, but it does change.
And on create/update/delete of that data I do the refresh of the cache.
This works fine, but the problem is that in production we will have a few nodes, so when the method for creating a record is called, for instance, the cache will be refreshed only on that node, not on the other nodes, and they will have stale data.
My question is: can I somehow fix this using MemoryCache, or do I need to do something else? And if so, what are the possible solutions?
I think what you are looking for is Distributed Caching.
Using the IDistributedCache interface you can use either Redis or Sql Server and it supplies basic Get/Set/Remove methods. Changes made on one node will be available to other nodes.
Using Redis is a great way of sharing Session type data between servers in a load balanced environment, Sql Server does not seem to be a great fit given that you seem to be caching to avoid db calls.
It might also be worth considering whether you are actually complicating things by caching in the first place. When you have a single application you see the benefit, as keeping records in application memory saves a request over the network, but in a load-balanced scenario you have to compare retrieving those records from a distributed cache vs retrieving them from the database.
If the data is just an in memory copy of a relatively small database table, then there is probably not a lot to choose performance wise between the two. If the data is based on a complicated expensive query then the cache is the way to go.
If you are making hundreds of requests a minute for the data, then any network request may be too much, but you can consider what are the consequences of the data being a little stale? For example, if you update a record, and the new record is not available immediately on every server, does your application break? Or does the change just occur in a more phased way? In that case you could keep your in process memory cache, just use a shorter Time To Live.
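A short-TTL in-process cache like the one described above can be sketched in a few lines. This is an illustrative, simplified sketch (the type and member names are made up, not from any library), and the lookup is not fully atomic under contention, which is usually acceptable for a cache:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Minimal sketch of an in-process cache with a short Time To Live:
// slightly stale data is tolerated in exchange for fewer DB hits.
public class TtlCache<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, (TValue Value, DateTime Expiry)> _map = new();
    private readonly TimeSpan _ttl;

    public TtlCache(TimeSpan ttl) => _ttl = ttl;

    public TValue GetOrAdd(TKey key, Func<TKey, TValue> factory)
    {
        // Serve from cache while the entry is still fresh.
        if (_map.TryGetValue(key, out var entry) && entry.Expiry > DateTime.UtcNow)
            return entry.Value;

        // Expired or missing: reload and stamp a new expiry.
        var value = factory(key);
        _map[key] = (value, DateTime.UtcNow + _ttl);
        return value;
    }
}

public static class Program
{
    public static void Main()
    {
        int loads = 0;
        var cache = new TtlCache<string, int>(TimeSpan.FromMilliseconds(50));

        cache.GetOrAdd("a", _ => ++loads);  // miss: loads the value
        cache.GetOrAdd("a", _ => ++loads);  // hit within TTL: no reload
        Thread.Sleep(100);                  // let the entry expire
        cache.GetOrAdd("a", _ => ++loads);  // expired: reloads
        Console.WriteLine(loads);
    }
}
```

Tuning the TTL is then just the trade-off described above: a shorter TTL means fresher data on every node at the cost of more database round trips.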
If you really need every change to propagate to every node straight away then you could consider using a library like Cache Manager in conjunction with Redis which can combine an in memory cache and synchronisation with a remote cache.
Somewhat dated question, but maybe still useful: I agree with what ste-fu said, well explained.
I'll only add that, on top of CacheManager, you may want to take a look at FusionCache ⚡🦥, which I recently released.
On top of supporting an optional distributed 2nd layer that is transparently managed for you, it also has some other nice features, like an optimization that prevents multiple concurrent factory calls for the same cache key from being executed (less load on the source database), a fail-safe mechanism, and advanced timeouts with background factory completion.
If you will give it a chance please let me know what you think.
/shameless-plug
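The "only one factory per key" optimization mentioned above (often called cache stampede protection) can be sketched in plain C# with no library involved: concurrent requests for the same missing key share a single Lazy factory, so the expensive call runs once. Illustrative only; real implementations handle expiry, errors, and eviction too.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Sketch of cache stampede protection: the dictionary stores Lazy wrappers,
// and Lazy<T>'s default thread-safety guarantees the factory runs only once
// no matter how many threads race on a cache miss.
public static class SingleFlightCache
{
    private static readonly ConcurrentDictionary<string, Lazy<int>> Map = new();

    public static int GetOrAdd(string key, Func<int> factory) =>
        Map.GetOrAdd(key, _ => new Lazy<int>(factory)).Value;
}

public static class Program
{
    public static void Main()
    {
        int calls = 0;
        // 20 concurrent requests for the same missing key...
        Parallel.For(0, 20, _ =>
            SingleFlightCache.GetOrAdd("report", () =>
            {
                Interlocked.Increment(ref calls);
                return 42; // stand-in for an expensive DB query
            }));
        // ...but the factory ran exactly once.
        Console.WriteLine(calls);
    }
}
```

The trick is that GetOrAdd may create several Lazy wrappers under a race, but only the one that wins is stored and has its Value evaluated by everyone, so the underlying factory still executes a single time.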
I am designing an online time tracking software to be used internally. I am fairly new to C# and .NET, though I have extensive PHP experience.
I am using Forms Authentication, and once the user logs in using that, I create a Timesheet object (my own custom class).
As part of this class, I have a constructor that checks the SQL DB for information (recent entries by this user, user preferences, etc.)
Should I be storing this information in a session? And then checking the session object in the constructor first? That seems the obvious approach, but most examples I've looked at don't make much use of sessions. Is there something I don't know that others do (specifically related to .NET sessions of course)?
EDIT:
I forgot to mention two things:
1. My SQL DB is on another server (though I believe they are both on the same network, so not much of an issue).
2. There are certain constants that the user will not be able to change (only the admin can modify them), such as project tasks. These are used on every page but loaded the first time from the DB. Should I be storing these in a session? If not, where else? The only other way I can think of is a local flat file that updates each time the table of projects is updated, but that seems like a hack. Am I trying too hard to minimize calls to the DB?
There is a good overview on ASP.NET Session here: ASP.NET Session State.
If you don't have thousands of clients, but need "some state" stored server-side, this is very easy to use and works well. It can also be stored in the database in multi server scenarios, without changing a line in your code, just by configuration.
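For reference, that "by configuration" switch to database-backed session state is, to the best of my knowledge, just a sessionState element in web.config along these lines (the connection string and timeout are placeholders):

```xml
<!-- web.config: move session state out of process to SQL Server,
     with no code changes. Values below are illustrative. -->
<system.web>
  <sessionState mode="SQLServer"
                sqlConnectionString="Data Source=dbserver;Integrated Security=SSPI;"
                cookieless="false"
                timeout="20" />
</system.web>
```

The same element with mode="StateServer" points sessions at the ASP.NET state service instead, which is the other common multi-server option.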
My advice would be not to store "big" data, or full object hierarchies, in there, as storing in a session can be somewhat costly (for example, when the session is shared among servers in a web farm via a database). If you plan to have only one server, this is not really a problem, but you have to know that you won't easily be able to move to a multiple-server setup later.
The worst thing to do is follow the guys who just say "session is bad, whooooo!", don't use it, and eventually rewrite your own system. If you need it, use it :-)
I would shy away from session objects. And actually I would say look into .net MVC as well.
The reason I don't use the session is because I feel it can be a crutch for some developers.
I would save all of the information that you would have put into a session into a db. This will allow for better metrics tracking, support for Azure (off topic but worth mentioning) and is cleaner imo.
ASP developers know session state as a great feature, but one that is somewhat limited. These limitations include:
ASP session state exists in the process that hosts ASP; thus the actions that affect the process also affect session state. When the process is recycled or fails, session state is lost.
Server farm limitations. As users move from server to server in a Web server farm, their session state does not follow them. ASP session state is machine specific. Each ASP server provides its own session state, and unless the user returns to the same server, the session state is inaccessible.
http://msdn.microsoft.com/en-us/library/ms972429.aspx
One of the main problems with Session is, that by default, it is stored in memory. If you have many concurrent users that store data in the session this could easily lead to performance problems.
Another thing is that application recycle will empty your in memory session which could lead to errors.
Of course you can move your session to SQL Server or a StateServer, but then you will lose some performance.
Look into the HttpContext.User (IPrincipal) property. This is where user information is stored in the request.
Most people avoid session state simply because people like to avoid state in general. If you can find an algorithm or process which works all the time regardless of the previous state of an object, that process tends to be more foolproof against future maintenance and more easily testable.
I would say for this particular case, store your values in the database and read them from there any time you need that information. Once you have that working, take a look at the performance of the site. If it's performing fine then leave it alone (as this is the simplest case to program). If performance is an issue, look at using the IIS Cache (instead of session) or implementing a system like CQRS.
Session State Disadvantage
Session-state variables stay in memory until they are either removed or replaced, and therefore can degrade server performance. Session-state variables that contain blocks of information, such as large datasets, can adversely affect Web-server performance as server load increases. Think about what will happen if a significant number of users are online simultaneously.
NOTE: I haven't mentioned the advantages because they are straightforward: simple implementation, session-specific events, data persistence, cookieless support, etc.
The core problem with sessions is scalability. If you have a small application, with a small number of users, that will only ever be on one server, then it may be a good route for you to save small amounts of data - maybe just the user id - to allow quick access to the preferences etc.
If you MAY want multiple web servers, or the application MAY grow, then don't use session. And only use it for small pieces of information.
I'm attempting to create Data Access Layer for my web application. Currently, all datatables are stored in the session. When I am finished the DAL will populate and return datatables. Is it a good idea to store the returned datatables in the session? A distributed/shared cache? Or just ping the database each time? Note: generally the number of rows in the datatable will be small < 2000.
Additional info:
Almost none of the data is shared. The parameters that are sent to the SQL queries are chosen by the user. The parameter values available to the user are based on who the user is. In most cases it is impossible for two users to run the same sql queries. However, the same user can run the same query more than once.
More info:
Number of concurrent users ~50,000
Important info:
In 99% of the cases no two users will have the same data/queries, however, the same user may run the same query/get the same data multiple times.
Thanks
Storing the data in session is not a good idea because:
Every user gets a separate copy of the same data - an enormous waste of server memory.
IIS will recycle a session if you fill it with too much data.
I recommend storing the data tables in Cache, and also populating each table only when first requested rather than all at once. That way, if IIS starts reclaiming space in the cache, your code won't be affected.
Very simple example of fetching on demand:
T GetCached<T>(string cacheKey, Func<T> getDirect) {
    // Cache exposes an indexer in C#; .Item(...) is the VB syntax.
    object value = HttpContext.Current.Cache[cacheKey];
    if (value == null) {
        value = getDirect();
        HttpContext.Current.Cache.Insert(cacheKey, value);
    }
    return (T) value;
}
EDIT: - Question Update
Cache vs local Session - Local session state is all-or-nothing. If it gets too full, IIS will recycle everything in it. By contrast, cache items are dropped individually when memory gets too low, so it's much less of a problem.
Cache vs Session state server - I don't have any data to back this up, so please say so if I've got this wrong, but I would have thought that caching the data independently in memory in each physical server AppDomain would scale better than storing it in a shared session state service.
The first thing I would say is: cache is not mandatory everywhere. You should use it wisely, especially on bottlenecks related to data access.
I don't think it's a good idea to store 1000 different datatables with 2000 records anywhere. If queries are so dynamic that repeating the same query within a short period is the exception, then caching doesn't seem a good option.
And in relation to a distributed cache option, I suggest you check http://memcached.org, a distributed cache used by many big projects around the world.
I know Velocity is near, but as far as I know it needs Windows Server 2008 and it's still very, very new. Normally Microsoft products are good from version 2.0 :-)
Store lookups/dictionaries - and items that your app requires very frequently - in the Application or Cache object; query the database for data that depends upon the user role.
--EDIT--
This is in response to your comment.
Usually in any data-oriented system, the queries run around the fact tables (or tables that are inevitable to query); assuming you do have a set of inevitable tables, you can use Cache.Insert():
Load the inevitable tables on app startup;
Load most queried tables in Cache upon table request-basis;
Query database for least queried tables.
If you do not have any performance issues then let SQL handle everything.
Storing that amount of data in the Session is a very bad idea. Each user will get their own version!
If this is shared data (same for all users), consider moving it to the Application object.
I’m looking for some strategies regarding accessing some cached data that resides in a internal company web service. Actually, preventing access of the cached data while the cache is being refreshed.
We have a .Net 3.5 C# web service running on a web farm that maintains a cache of a half-dozen or so datasets. This data consists of configuration-related items that are referenced by the ‘real’ business logic domain also running in this web service, as well as being returned for any client uses. We're probably talking a total of a dozen or so tables with a few thousand records in them.
We implemented a caching mechanism using the MS Enterprise Library 4.1. No huge reason for using this over the ASP.Net cache except that we were already using Enterprise Library for some other things and we liked the cache expiration handling. This is the first time that we have implemented some caching here so maybe I’m missing something fundamental…
This configuration data doesn’t get changed too often – probably a couple of times a day. When this configuration data does change we update the cache on the particular server the update request went to with the new data (the update process goes through the web service). For those other servers in the web farm (currently a total of 3 servers), we have the cache expiration set to 15 minutes upon which the data is re-loaded from the single database that all servers in the farm hit. For our particular purposes, this delay between servers is acceptable (although I guess not ideal).
During this refresh process, other requests could come in that require access to the data. Since the request could come during an expiration/refresh process, there is no data currently in the cache, which obviously causes issues.
What are some strategies to resolve this? If this were a single-domain WinForm type of application, we could hack something up to prevent access during the refresh by the use of class variables/loops, threading/mutex, or some other singleton-like structure. But I'm leery of implementing something like that on a web farm. Should I be? Is a distributed server caching mechanism the way to go instead of each server having its own cache? I would like to avoid that for now if I can and come up with some coding to get around this problem. Am I missing something?
Thanks for any input.
UPDATE: I was going to use the lock keyword around the expiration action that refreshes the data, but I was worried about doing this on a web server. I think it would have worked, although it seems to me there would still be a (smaller) possibility of grabbing data from the empty cache between the time it expired and the time the lock was entered (the expiration action occurs on another thread, I think).

So what we did instead: if there is no data in the cache during a regular request, we assume it is in the process of being refreshed and just grab the data from the source directly. I think this will work, since we can assume the cache should be filled at all times: the initial cache fill occurs when the singleton class that holds the cache is created, on the first web service request. So if the cache is empty, it truly means it is currently being filled, which normally takes only a few seconds, so only the requests arriving in that window will miss the cache.
If anyone with experience would like to shed any more light on this, it would be appreciated.
It sounds to me like you are already serving stale data. So, if that is allowed, why don't you populate a new copy of the cache when you discover it's old, and only switch to using it once it's completely populated?
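The "build a new copy, then switch" idea can be sketched as a snapshot swap: readers always see a fully populated snapshot, and the refresh publishes the new one in a single atomic reference assignment, so there is never a moment when the cache is empty. This is an illustrative sketch with made-up names and a stand-in for the DB reload:

```csharp
using System;

// Readers never see a partially built cache: the refresh builds a NEW
// collection off to the side, then swaps the reference atomically.
public class SnapshotCache
{
    // volatile ensures readers promptly see the newly published reference.
    private volatile string[] _snapshot = Array.Empty<string>();

    public string[] Current => _snapshot;  // lock-free reads

    public void Refresh()
    {
        // Build the replacement in private (stand-in for a DB reload)...
        var fresh = new[] { "configA", "configB" };
        // ...then publish it; reference assignment is atomic in .NET.
        _snapshot = fresh;
    }
}

public static class Program
{
    public static void Main()
    {
        var cache = new SnapshotCache();
        Console.WriteLine(cache.Current.Length);  // before first refresh
        cache.Refresh();
        Console.WriteLine(cache.Current.Length);  // after refresh
    }
}
```

Until the first Refresh completes, readers see the old (here: empty) snapshot rather than an error, which matches the "keep serving stale data while rebuilding" strategy.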
It really depends on the updating logic. Where is it that you decide to update the cache? Can you propagate the update to all the servers in the farm? Then you should lock while updating. If your update process is initiated by a user action, can you let the other servers know that they should expire their cache?
I'm developing a web service whose methods will be called from a "dynamic banner" that will show a sort of queue of messages read from a sql server table.
The banner will be under heavy pressure on the home pages of high-traffic sites; every time the banner is loaded, it will call my web service in order to obtain the new queue of messages.
Now: I don't want that all this traffic drives queries to the database every time the banner is loaded, so I'm thinking to use the asp.net cache (i.e. HttpRuntime.Cache[cacheKey]) to limit database accesses; I will try to have a cache refresh every minute or so.
Obviously I'll try to keep the messages as small as possible, to limit traffic.
But maybe there are other ways to deal with such a scenario; for example I could write the last version of the queue on the file system, and have the web service access that file; or something mixing the two approaches...
The solution is a C# web service, ASP.NET 3.5, SQL Server 2000.
Any hint? Other approaches?
Thanks
Andrea
It depends on a lot of things:
If there is little change in the data (think backend with a "publish" button or daily batches), then I would definitely use static files, updated via push from the backend. We used this solution on a couple of large sites and it worked really well.
If the data is small enough, memory caching (i.e. the Http Cache) is viable, but beware of locking issues, and also beware that the Http Cache will not work that well under heavy memory load, because items can be expired early if the framework needs memory. I have been bitten by this before! With the above caveats, the Http Cache works quite well.
I think caching is a reasonable approach and you can take it a step further and add a SQL Dependency to it.
ASP.NET Caching: SQL Cache Dependency With SQL Server 2000
If you go the file route, keep this in mind.
http://petesbloggerama.blogspot.com/2008/02/aspnet-writing-files-vs-application.html
Writing a file is a better solution IMHO - it's served by IIS kernel code, without the huge ASP.NET overhead, and you can copy the file to CDNs later.
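If you do go the file route, one detail worth getting right is that readers must never see a half-written file. A common trick is to write to a temporary file and then move it into place, which is effectively atomic on the same volume. A minimal sketch, with illustrative paths and content:

```csharp
using System;
using System.IO;

// Publish the banner queue as a static file without ever exposing a
// partially written file: write to a temp file, then move it into place.
public static class FilePublisher
{
    public static void Publish(string path, string content)
    {
        string tmp = path + ".tmp";
        File.WriteAllText(tmp, content);
        // A move on the same volume replaces the target in one step,
        // so concurrent readers see either the old file or the new one.
        File.Move(tmp, path, overwrite: true);
    }
}

public static class Program
{
    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "banner-queue.json");
        FilePublisher.Publish(path, "[\"msg1\",\"msg2\"]");
        Console.WriteLine(File.ReadAllText(path));
    }
}
```

The web service (or IIS directly) then only ever reads the published file; the writer is the single place that touches it.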
AFAIK dependency caching is not very efficient with SQL Server 2000.
Also, one way to get around the memory limitation mentioned by Skliwz: if you are using this service outside of the normal application, you can isolate it in its own app pool. I have seen this done before and it helps as well.
Thanks all. As the data is small in size but the underlying tables will change, I think I'll go the HttpCache way: I actually need a way to reduce db access even while the data is changing (which is the reason for not using a direct SQL dependency as suggested by #Bloodhound).
I'll make some stress test before going public, I think.
Thanks again all.
Of course you could (should) also use the caching features in the SixPack library.
Forward (normal) cache, based on HttpCache, which works by putting attributes on your class. Simplest to use, but in some cases you have to wait for the content to actually be fetched from the database.
Pre-fetch cache, built from scratch, which after the first call will start refreshing the cache behind the scenes; in some cases you are guaranteed to have content without waiting.
More info on the SixPack library homepage. Note that the code (especially the forward cache) is load tested.
Here's an example of simple caching:
[Cached]
public class MyTime : ContextBoundObject
{
    [CachedMethod(1)]
    public DateTime Get()
    {
        Console.WriteLine("Get invoked.");
        return DateTime.Now;
    }
}