DAL, Session, Cache architecture - C#

I'm attempting to create a Data Access Layer for my web application. Currently, all DataTables are stored in the session. When I'm finished, the DAL will populate and return DataTables. Is it a good idea to store the returned DataTables in the session? In a distributed/shared cache? Or should I just hit the database each time? Note: generally the number of rows in a DataTable will be small (< 2000).
Additional info:
Almost none of the data is shared. The parameters that are sent to the SQL queries are chosen by the user, and the parameter values available to the user are based on who the user is. In most cases it is impossible for two users to run the same SQL queries. However, the same user can run the same query more than once.
More info:
Number of concurrent users ~50,000
Important info:
In 99% of cases no two users will have the same data/queries; however, the same user may run the same query and get the same data multiple times.
Thanks

Storing the data in session is not a good idea because:
Every user gets a separate copy of the same data - an enormous waste of server memory.
IIS will recycle a session if you fill it with too much data.
I recommend storing the data tables in Cache, and also populating each table only when first requested rather than all at once. That way, if IIS starts reclaiming space in the cache, your code won't be affected.
Very simple example of fetching on demand:
T GetCached<T>(string cacheKey, Func<T> getDirect)
{
    // Look in the ASP.NET cache first; fall back to the data source on a miss.
    object value = HttpContext.Current.Cache[cacheKey];
    if (value == null)
    {
        value = getDirect();
        HttpContext.Current.Cache.Insert(cacheKey, value);
    }
    return (T)value;
}
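For example, a hypothetical call for a per-user DataTable (the key format and LoadOrdersForUser are placeholders for your own DAL call):

// Caches the user's table on first request; later calls hit the cache.
DataTable orders = GetCached("Orders_" + userId, () => LoadOrdersForUser(userId));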
EDIT - in response to the question update:
Cache vs local Session - Local session state is all-or-nothing. If it gets too full, IIS will recycle everything in it. By contrast, cache items are dropped individually when memory gets too low, so it's much less of a problem.
Cache vs Session state server - I don't have any data to back this up, so please say so if I've got this wrong, but I would have thought that caching the data independently in memory in each physical server AppDomain would scale better than storing it in a shared session state service.

The first thing I would say is: caching is not mandatory everywhere. You should use it wisely, especially at bottlenecks related to data access.
I don't think it's a good idea to store 1000 different DataTables of 2000 records each anywhere. If queries are so dynamic that repeating the same query within a short period of time is the exception, then caching doesn't seem like a good option.
As for a distributed cache option, I suggest you check http://memcached.org, a distributed cache used by many big projects around the world.
I know Velocity is close, but as far as I know it requires Windows Server 2008 and it's still very new. Microsoft products are normally good from version 2.0 onwards :-)
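A minimal sketch of read-through caching against memcached from C#, assuming the Enyim Memcached client (the key format and LoadOrders are placeholders for your own DAL call):

using Enyim.Caching;
using Enyim.Caching.Memcached;

// Requires a memcached server and the Enyim client configured in app.config.
var client = new MemcachedClient();

// Try the distributed cache first, then fall back to the database and populate it.
var orders = client.Get<DataTable>("orders_" + userId);
if (orders == null)
{
    orders = LoadOrders(userId);   // placeholder for your own database call
    client.Store(StoreMode.Set, "orders_" + userId, orders);
}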

Store lookups/dictionaries - and other items that your app requires very frequently - in the Application or Cache object; query the database for data that depends on the user's role.
--EDIT--
This is in response to your comment.
Usually in any data-oriented system, the queries revolve around the fact tables (or other tables that are unavoidable to query). Assuming you do have such a set of unavoidable tables, you can use Cache.Insert() (see the sketch below):
Load the unavoidable tables on app startup;
Load the most-queried tables into the Cache on a per-table, on-request basis;
Query the database for the least-queried tables.
If you do not have any performance issues, then just let SQL Server handle everything.
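A minimal sketch of the startup loading described above, where LoadTable stands in for your actual data access call:

using System;
using System.Web;
using System.Web.Caching;

// In Global.asax: load the unavoidable tables once at application startup and
// pin them in the cache so ASP.NET does not evict them under memory pressure.
protected void Application_Start(object sender, EventArgs e)
{
    HttpRuntime.Cache.Insert(
        "FactTable",
        LoadTable("FactTable"),          // placeholder for your own DAL call
        null,                            // no cache dependency
        Cache.NoAbsoluteExpiration,
        Cache.NoSlidingExpiration,
        CacheItemPriority.NotRemovable,
        null);
}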

Storing that amount of data in the Session is a very bad idea. Each user will get their own version!
If this is shared data (same for all users), consider moving it to the Application object.

Related

How to manage frequent data access in a .NET application?

I have three tables in my SQL database, say Specials, Businesses, and Comments. In my master page I have a prompt area where I need to display alternating data from these three tables, based on certain conditions, on each page refresh (these tables have more than 1000 records). In that case, what would be the best option for retrieving data from these tables?
I know that hitting the database on every page refresh is not a good idea. Is there another good way to do this, like caching or some other technique, to manage it effectively? Right now it takes too much time to load the page after each refresh.
Please give your suggestions.
At present my plan is to create a stored procedure for data retrieval and to keep the returned value in Session, so that we can access the data from the session rather than going to the DB on each page refresh. But I don't know whether there is a more effective way to accomplish the same thing.
Accessing data each time from database is not a good idea
That's not always true; it depends on how frequently the data changes. If you choose to cache the data, you will have to invalidate the cache every time the data changes. I am assuming you do not want to display a static count or something that, once displayed, will not change. If that's not the case, you can simply store the value in a cookie and display it from there.
Now it takes too much time to load the page after each page refresh.
Do you know what takes too much time? Is it client-side code or server-side code (use Glimpse to find out)? If it is server side, is it the code that hits the DB and the query execution time, or is it server-side in-memory manipulation?
Generally, the first step in improving performance is to measure it precisely; to solve issues like this you need to know where the problem actually is.
Based on your first statement, if I were you, I would display each count in a separate div that is refreshed asynchronously. You could update the data periodically using a timer or, even better, push it from the server (using SignalR). The update happens transparently, so no page reload is required.
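For the server-push option, a minimal sketch assuming the classic ASP.NET SignalR 2 package (the hub name, the client method name updatePromptArea, and the counts are made up for illustration):

using Microsoft.AspNet.SignalR;

// Empty hub - the page connects to it to receive pushes.
public class PromptAreaHub : Hub
{
}

public static class PromptAreaNotifier
{
    // Call this wherever the Specials/Businesses/Comments data changes.
    public static void Push(int specials, int businesses, int comments)
    {
        var context = GlobalHost.ConnectionManager.GetHubContext<PromptAreaHub>();
        // "updatePromptArea" is whatever handler the page registers on the client side.
        context.Clients.All.updatePromptArea(specials, businesses, comments);
    }
}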
Hope this helps.
I agree that 1000 records doesn't seem like a lot, but if you really aren't concerned about a slight delay in seeing fresh data, you may try using the HttpContext.Cache object. It's very much like a dictionary with string keys and object values, with the addition that you can set expirations, etc.
Excuse typos, on mobile so no compile check:
var tableA = (DataTable)HttpContext.Current.Cache.Get("TableA");
if (tableA == null)
{
    // If it's null there was no copy in the cache, so build the object
    // (DataTable, list, array - however you store your data) with your database call.
    tableA = LoadTableAFromDatabase();   // placeholder for your own DAL call
    // Add the item to the cache with a sliding expiration of 1 minute.
    HttpContext.Current.Cache.Insert("TableA", tableA, null,
        Cache.NoAbsoluteExpiration, TimeSpan.FromMinutes(1));
}
Now, no matter how many requests come through, you only hit the database once a minute, or once per however long you think is reasonable given your needs. You can also trigger removal of the item from the cache if some particular condition occurs.
One suggestion is to think of your database as a mere repository to persist state. Your application tier could cache collections of your business objects, persist them when they change, and immediately return state to your presentation tier (the web page).
This assumes all updates to the data are coming from your page. If the database is being populated from different places, you'll need to either tie everything into a common application tier, or poll the database to update your cache.

How to get real-time updates of data to the main warehouse

All,
Need some info.
We have stores at multiple locations and use a client-server app installed at each store for sales activity.
Sales data is stored in a database which is set up in every store.
At the end of the day, a batch job pulls data from all of the store locations and updates the main warehouse database.
We want a real-time implementation, so that whenever there is a transaction at any store, the data is updated immediately in the main warehouse repository.
Any clue as to how we can achieve real-time updates of data to the main warehouse?
Thanks in advance...
One approach to this is called replication. There are several ways to do it in SQL Server. You're probably looking for transaction replication or merge replication.
Here's a place to start in the SQL Server 2012 documentation.
And here's a fairly recent overview that might be helpful.
You should make sure you understand what "real time" means, and how real-time you really need to be. If you are not pre-aggregating data before storing it in the warehouse, then you should be able to set up replication between the database servers (if they can talk to each other). If you are loading an aggregate, then it gets tricky, because you have to merge the measures (facts) into the warehouse's existing measures, which is tough. If you don't need true real time, just a slow trickle, then consider simply running your current process on a schedule in SQL Agent.
First off - why not run the batch multiple times a day? It would not really be "real-time" but might yield good-enough real-world results.
One option would be to implement master-master replication as provided by the SQL engine in use, though this probably means some steps need to be taken to guard against duplicate IDs, auto-increment mismatches, etc. For example, we have a master-master system set up so that one side produces entries with odd IDs, the other with even.
Another approach could be that all reads are performed against local databases, and all writes are performed against a single remote master, with data replicated back in a master-slave setup. This would provide the best data consistency, but a slow network would make any writes slow. We have this kind of setup implemented on top of the master-master replication, as most interactions are reads.
One real-world use case I have actually come across for a similar stores/warehouse setup was based on Firebird SQL. Every single table had triggers that recorded every action on the local databases in so-called log tables, and a replication application ran at all times, regularly checking these log tables, pushing the data to a remote database and pulling in new data from the remote (which had its own log tables). As a downside it was a horror to maintain, since triggers needed to be updated whenever something changed in the database schema, and the replication application would fail or hang at times. Data consistency was maintained well, though, with ID conflicts resolved by using negative IDs for the local databases and positive IDs for the master/remote. But in the end it did not really provide true "real-time".
In the end, there is no one-size-fits-all answer, and books could probably be written on the topic. Research and Google are your friends.

Store a database table in memory in a C# website application?

I have noticed that our web application queries a particular table an enormous amount of times. The table is relatively small, with only about a hundred rows that are used.
I'm wondering if there is a way to keep this table in memory in the web application, refreshed once every 15 minutes or so, so the system doesn't have to make so many queries to get the same information over and over again. It would be shared across many different users.
The table is the Client table, so users login from many different clients. The data is pretty static, probably getting updated perhaps once a day.
Update: SQL Profiler shows the query is run quite a bit, so that's what concerns me. The website is not notably slow; I just thought this could help make it even faster.
If the table is small and frequently queried, there is an outstanding chance that the data and any indices are entirely in SQL Server's memory, the query plan is cached, and the query will be extremely fast.
Measure the actual performance impact before making any changes.
If you see there is a performance impact, there are many caching strategies that you can use to reduce trips to the database. More information about access patterns to the table and the need for information consistency would be needed to recommend a specific caching strategy.
You state
to get the same information over and over again
but also state
once every 15 minutes
If the information really is the same over and over, you can load it once into the ASP.Net cache at application start. If it might change every so often, but it is OK for the data to be a little out-of-date for a given user, you can use a time-based cache expiration policy. If the data changes only every so often but must be up-to-date immediately after it changes, you can consider a SQL Dependency for cache expiration.
For more information on ASP.Net caching see
http://msdn.microsoft.com/en-us/library/xsbfdd8c(v=vs.100).aspx
and specifically
http://msdn.microsoft.com/en-us/library/6hbbsfk6(v=vs.100).aspx
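For the SQL dependency option, a minimal sketch - this assumes SQL cache notifications have already been enabled with aspnet_regsql and that a <sqlCacheDependency> database entry named "MyDb" exists in web.config; those names and LoadClientTable are placeholders:

using System.Data;
using System.Web;
using System.Web.Caching;

// Cache the Client table until SQL Server signals that the underlying table changed.
DataTable clients = LoadClientTable();                          // placeholder query
SqlCacheDependency dependency = new SqlCacheDependency("MyDb", "Client");
HttpRuntime.Cache.Insert("ClientTable", clients, dependency);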
My suggestion would be to create a WCF Windows service - using REST you could easily cache the results read from a SqlDataReader (or other DataReader) and implement a TTL so it re-queries at an interval.
Well, there are a few solutions:
If you want to load data into memory every 15 minutes, you should use one of the .NET caching libraries, for example the System.Runtime.Caching / ASP.NET Cache APIs, where you can set expiration policies and more (see the sketch after this list).
You could try to optimize your query with nonclustered indexes.
You could use AppFabric Caching, or something similar.
And lastly, try adding more memory to the SQL Server machine.
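A minimal sketch of the first option using System.Runtime.Caching's MemoryCache with a 15-minute absolute expiration (the key and LoadClientTable are placeholders):

using System;
using System.Data;
using System.Runtime.Caching;

DataTable clients = MemoryCache.Default.Get("ClientTable") as DataTable;
if (clients == null)
{
    clients = LoadClientTable();   // placeholder for your own database call
    MemoryCache.Default.Set("ClientTable", clients,
        new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(15) });
}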

Caching some tables of a Database in RAM

I am building a database on SQL Server.
This DB is going to be really huge.
However, there are few tables which need to be queried very frequently and are quite small.
Is there a way to cache these tables in RAM for faster querying?
Any ideas/links to make the database insertions/query faster will be highly appreciated.
Also, do I get any performance boost if I migrate from SQL Express to SQL Server Enterprise ?
Thanks in advance.
SQL server will do an outstanding job of keeping small tables that are frequently accessed in RAM.
However, a small frequently accessed table does sound like a good candidate for caching at the application layer to avoid ever hitting the database.
If your database really is "huge", you will hit the 1GB RAM limit of SQL Express (and/or the 10GB per DB storage limitation) and will want an edition that does not have that constraint.
http://msdn.microsoft.com/en-us/library/cc645993(v=SQL.110).aspx
You can read the data from the table once and store it in a DataTable variable.
You should also create suitable indexes to make the query faster.
If you are working with C#, then you can try data caching.
You just need to follow 3 steps:
Fetch your result into a list;
Cache the list of data;
Whenever you need the cached result, cast your cache object back to the list type.
Following is the example code:
// "type" and the LINQ query are placeholders for your own element type and query.
List<type> result = yourLinqQuery.ToList();
Cache["resultSet"] = result;                              // store the list in the cache
List<type> cachedList = (List<type>)Cache["resultSet"];   // cast back when reading
Now you may run LINQ queries over cachedList, which actually uses the cached object.
Note: For caching any object you may use a more precise approach like the following, which gives better control over caching.
// Use the application's existing Cache instance rather than constructing a new one.
Cache cacheObject = HttpRuntime.Cache;
cacheObject.Insert("Key", value, dependency, absoluteExpiration, slidingExpiration,
    CacheItemPriority.Default, onRemoveCallback);
The more a page is used by queries, the better its chances of staying in memory, but this works at the page level rather than the table level. Every time a page is referenced its usage count is increased, and a background process (the lazy writer) periodically decreases the count for all pages. When a new page needs to be brought into memory, SQL Server evicts the page with the lowest count (writing it to disk first if it is dirty). So if your table's pages are accessed frequently, there is a high chance that their counts will stay high and they will remain in memory longer. But if you run some big query that reads more data from various tables than fits in memory, even those pages might be thrown out of the cache. If you do not have those kinds of queries, the chances are good that the pages will stay in memory.
Also, this assumes the same pages are accessed a number of times. If different processes read different pages from the same table, you might not have a very high use count for all of your pages, and some of them could be evicted.
Read below blog for more details on how buffers etc works.
http://sqlblog.com/blogs/elisabeth_redei/archive/2009/03/01/bufferpool-performance-counters.aspx
Depending on how often these small tables change, Query Notifications might be a good option. Essentially, you subscribe your application to changes in a data set in the DB. A canonical example is a list of vendors: it doesn't change much over time, but you want the application to know when it does change.
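A minimal sketch of query notifications via ADO.NET's SqlDependency (the connection string, the query, and ReloadVendorCache are placeholders; the database also needs Service Broker enabled):

using System.Data;
using System.Data.SqlClient;

static class VendorCacheSubscriber
{
    static readonly string connectionString = "...";   // placeholder

    public static void Start()
    {
        // Call once per application lifetime, e.g. in Application_Start.
        SqlDependency.Start(connectionString);
        Subscribe();
    }

    static void Subscribe()
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("SELECT VendorId, Name FROM dbo.Vendor", conn))
        {
            var dependency = new SqlDependency(cmd);
            dependency.OnChange += (sender, e) =>
            {
                // The data changed: refresh the cached copy and re-subscribe,
                // because each notification fires only once.
                ReloadVendorCache();
                Subscribe();
            };

            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                // Read (and cache) the current contents here.
            }
        }
    }

    static void ReloadVendorCache() { /* placeholder: repopulate your cache */ }
}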

Caching architecture advice for a specific scenario

SETUP:
We have a .NET application that is distributed over 6 local servers, each with a local database (Oracle), plus 1 main server and 1 load-balancing machine. Requests come to the load balancer, which redirects them to one of the 6 local servers. At certain time intervals, data is gathered on the main server and redistributed to the 6 local servers so that decisions can be made with the complete data.
Each local server has a cache component that caches the incoming requests based on different parameters (location, incoming parameters, etc.). With each request, a local server decides whether to go to the database (Oracle) or get the response from the cache. However, in both cases the local server has to go to the database to do 1 insert and 1 update per request.
PROBLEM:
On a peak day each local server receives 2000 requests per second and the system starts slowing down (CPU: 90%). I am trying to increase capacity before adding another local server to the mix. After running some benchmarks, the bottleneck, as always, seems to be the unavoidable 1 insert and 1 update per request to the database.
TRIED METHODS
To decrease the frequency, I created a Windows service that sits between the DB and the .NET application. It contains a pipe server, receives each insert and update from the main .NET application, and saves them in a Hashtable. The service then, at certain time intervals, goes to the database once to do batch inserts and updates. The point was to go to the database less frequently. Although this had a positive effect, it didn't reduce the system load as much as I expected; most of the CPU load comes from oracle.exe as requests per second increase.
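A minimal sketch of that kind of write-behind buffering (the PendingWrite type and FlushToDatabase method are placeholders for the real insert/update batching):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

class WriteBehindBuffer
{
    private readonly ConcurrentQueue<PendingWrite> _pending = new ConcurrentQueue<PendingWrite>();
    private readonly Timer _flushTimer;

    public WriteBehindBuffer(TimeSpan flushInterval)
    {
        // Flush the accumulated writes to the database on a fixed interval.
        _flushTimer = new Timer(_ => Flush(), null, flushInterval, flushInterval);
    }

    // Called once per request instead of hitting the database directly.
    public void Add(PendingWrite write)
    {
        _pending.Enqueue(write);
    }

    private void Flush()
    {
        var batch = new List<PendingWrite>();
        PendingWrite item;
        while (_pending.TryDequeue(out item))
        {
            batch.Add(item);
        }
        if (batch.Count > 0)
        {
            FlushToDatabase(batch);   // placeholder: one batched INSERT/UPDATE round trip
        }
    }

    private void FlushToDatabase(List<PendingWrite> batch) { /* placeholder */ }
}

class PendingWrite { /* whatever data the insert/update needs */ }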
I am trying to avoid going to the database as much as I can, and the only way to avoid the DB, other than the solution mentioned above, seems to be increasing the cache hit ratio. My cache hit ratio is currently around 81%. Because each local machine has its own cache, I am actually missing lots of cacheable requests: when two similar requests are redirected to different servers, the second request cannot benefit from the cached result of the first one.
I don't have a lot of experience in system architecture so I would appreciate any help to this problem. Any suggestions on different caching architectures or setup, or any tools are welcome.
Thank you in advance, hopefully I made my question clear.
To me this looks like an application for a TimesTen solution. In that case you can eliminate the local databases and return to just one. Where you now have the local Oracle databases, you can implement a cache grid. Most likely this is going to be an AWT (Async Write-Through) cache. See Oracle In-Memory Database Cache Concepts.
It's not a cheap option, but it could be worth investigating.
You can keep concentrating on the business logic and have no worries about speed. Of course, this only works well if the application code is already tuned and the SQL is performant and scalable. The SQL has to be prepared (using bind variables) to get the best performance.
Your application connects to the cache and no longer to the database. You create the cache tables in the cache group for which you want caching. All tables in a SQL statement should be cached; otherwise, the complete statement is passed through to the Oracle database. In the grid, a cache fusion mechanism is in place, so you have no worries about where the data in your grid is located.
In the current release, support for .NET is included.
The data is consistent and asynchronously updated to the Oracle database. If the data that is needed is in the cache and you take the Oracle database down, the app can keep running; as soon as the database is back up, synchronization picks up again. Very powerful.
2000 requests per second per server, each doing an insert and an update, comes to roughly 24,000 database operations per second across the 6 servers. That's a HUGE load for a DB.
Try to optimize, scale up, or cluster the database.
Maybe a NoSQL DB (Redis/Raven/Mongo) as middleware would suit you: the local servers would read/write a sharded NoSQL DB, and the aggregated data would be synchronized with Oracle during off-peak times.
I know the question is old now, but I wanted let everyone know how we solved our issue.
After trying many optimizations, it turned out that all we needed was solid-state drives for the 6 local machines. CPU usage dropped to 30% immediately after we installed them. This is the first time I've seen a hardware upgrade contribute this much to performance.
If you have a high-load setup, before making any software or architecture changes, try upgrading to SSDs.
Thanks everyone for your answers.
