Large Application - How to handle data access - C#

I have a large enterprise web application that is starting to be heavily used. Recently I've noticed that we are making many database calls for things like user permissions, access, general bits of profile information.
From what I can see on Azure we are looking at an average of 50,000 db queries per hour.
We are using Linq to query via the DevExpress XPO ORM. Now some of these are joins, but the majority are simple 1 table queries.
Is constantly hitting the database the best way to be accessing this kind of information? Are there ways for us to offload the database work as some of this information will never change?
Thanks in advance.

Let's start by putting this into perspective. With 3,600 seconds in an hour, you have fewer than 20 operations per second. That is pathetically low by any measure.
That said, there is nothing wrong with, for example, caching user permissions for, say, 30 seconds or a minute.
Generally, try to cache not in your code but IN FRONT of it - the ASP.NET output cache and donut caching are concepts that are mostly ignored but still the most efficient.
http://www.dotnettricks.com/learn/mvc/donut-caching-and-donut-hole-caching-with-aspnet-mvc-4
has more information. Then ignore all the large numbers and run a profiler - see what your real heavy hitters are (likely the permission checks, as those run on every page). Put those into a subsystem and cache them. Given that you can preload them into the user identity object in the ASP.NET pipeline, your page code should not hit the database at all - the cache stays isolated in a filter in ASP.NET.
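A minimal sketch of such a filter, assuming a classic ASP.NET MVC application (PermissionCacheAttribute, LoadPermissionsFromDb, and the 60-second TTL are illustrative names and numbers, not anything from the original setup):

    using System;
    using System.Runtime.Caching;
    using System.Web.Mvc;

    // Hypothetical filter: caches each user's permission set for 60 seconds
    // so page code never queries the database for it directly.
    public class PermissionCacheAttribute : ActionFilterAttribute
    {
        private static readonly MemoryCache Cache = MemoryCache.Default;

        public override void OnActionExecuting(ActionExecutingContext filterContext)
        {
            var userName = filterContext.HttpContext.User?.Identity?.Name;
            if (string.IsNullOrEmpty(userName))
                return;

            var key = "perms:" + userName;
            var permissions = Cache.Get(key) as string[];
            if (permissions == null)
            {
                permissions = LoadPermissionsFromDb(userName);                     // the single DB hit
                Cache.Set(key, permissions, DateTimeOffset.UtcNow.AddSeconds(60));
            }

            // Expose the cached permissions to the rest of the request.
            filterContext.HttpContext.Items["Permissions"] = permissions;
        }

        private static string[] LoadPermissionsFromDb(string userName)
        {
            // Placeholder: replace with your XPO/LINQ permission query.
            return new string[0];
        }
    }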
Measure. Make sure your SQL is smart - EF and LINQ produce extremely inefficient SQL when people are too lazy to check it. Avoid instantiating complete objects just to throw them away; ask only for the fields you need. Make sure your indices are efficient. Come back when you start having a real, measured problem.
But the old rule is: cache early. LINQ optimization comes quite far behind that.
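For the "ask only for the fields you need" point, a small illustrative comparison (the User entity and its properties are placeholders; query stands in for any IQueryable<User> source, e.g. an XPO XPQuery<User>):

    using System.Linq;

    // Hypothetical entity; property names are placeholders.
    class User
    {
        public int Id { get; set; }
        public string DisplayName { get; set; }
        public bool IsActive { get; set; }
    }

    static class ProjectionExample
    {
        static void Run(IQueryable<User> query)   // e.g. an XPO XPQuery<User>
        {
            // Materializes full objects: SELECTs every column of every matching row.
            var full = query.Where(u => u.IsActive).ToList();

            // Projects only what the page needs: SELECT Id, DisplayName ...
            var slim = query.Where(u => u.IsActive)
                            .Select(u => new { u.Id, u.DisplayName })
                            .ToList();
        }
    }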

For user-specific information like profile and access rights, instead of fetching it from the database on every request, it is better to fetch it once at login and keep it in session. This should reduce your transactions with the database.
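A minimal sketch of that idea for classic ASP.NET Web Forms (UserProfile, the Login page, and LoadProfileFromDb are placeholder names):

    using System.Web.UI;

    public class UserProfile
    {
        public string DisplayName { get; set; }
        public string[] Roles { get; set; }
    }

    public partial class Login : Page
    {
        // Called once, after credentials have been verified.
        protected void OnLoginSucceeded(string userName)
        {
            UserProfile profile = LoadProfileFromDb(userName);   // single DB round trip
            Session["UserProfile"] = profile;                    // reused on every later request
        }

        private UserProfile LoadProfileFromDb(string userName)
        {
            // Placeholder: replace with the real profile query.
            return new UserProfile { DisplayName = userName, Roles = new string[0] };
        }
    }

Any later page can then read (UserProfile)Session["UserProfile"] instead of querying the database.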

Related

ASP.NET Core distributed caching

I am currently using MemoryCache _cache = new MemoryCache(new MemoryCacheOptions()); for caching some data from the database that does not change very often, but it does change.
On create/update/delete of that data I refresh the cache.
This works fine, but the problem is that in production we will have a few nodes, so when the method for creating a record is called, for instance, the cache will be refreshed only on that node, not on the other nodes, and they will have stale data.
My question is: can I somehow fix this using MemoryCache, or do I need to do something else, and if so, what are the possible solutions?
I think what you are looking for is distributed caching.
Using the IDistributedCache interface you can back the cache with either Redis or SQL Server, and it supplies basic Get/Set/Remove methods. Changes made on one node will be visible to the other nodes.
Using Redis is a great way of sharing session-type data between servers in a load-balanced environment; SQL Server does not seem to be a great fit, given that you are caching precisely to avoid DB calls.
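A minimal sketch of the Redis-backed option in ASP.NET Core (it assumes the Microsoft.Extensions.Caching.StackExchangeRedis package; the connection string, key format, expiry, and the ClientLookup/LoadFromDatabaseAsync names are illustrative):

    using System;
    using System.Threading.Tasks;
    using Microsoft.Extensions.Caching.Distributed;
    using Microsoft.Extensions.DependencyInjection;

    public static class CacheSetup
    {
        // Call from Startup.ConfigureServices: registers a Redis-backed IDistributedCache.
        public static void Register(IServiceCollection services)
        {
            services.AddStackExchangeRedisCache(options =>
            {
                options.Configuration = "localhost:6379";   // placeholder connection string
                options.InstanceName = "MyApp:";
            });
        }
    }

    public class ClientLookup
    {
        private readonly IDistributedCache _cache;

        public ClientLookup(IDistributedCache cache) => _cache = cache;

        public async Task<string> GetClientNameAsync(int id)
        {
            var key = $"client:{id}";
            var cached = await _cache.GetStringAsync(key);
            if (cached != null)
                return cached;                               // served from Redis, identical on every node

            var name = await LoadFromDatabaseAsync(id);      // placeholder for the real query
            await _cache.SetStringAsync(key, name, new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5)
            });
            return name;
        }

        private Task<string> LoadFromDatabaseAsync(int id) => Task.FromResult("placeholder");
    }

Because every node talks to the same Redis instance, a create/update/delete handler can simply call _cache.RemoveAsync(key) and all nodes see the change.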
It might also be worth considering whether you are actually complicating things by caching in the first place. When you have a single application you see the benefit, as keeping items in application memory saves a request over the network, but in a load-balanced scenario you have to compare retrieving those records from a distributed cache vs retrieving them from the database.
If the data is just an in-memory copy of a relatively small database table, then there is probably not a lot to choose, performance-wise, between the two. If the data is based on a complicated, expensive query, then the cache is the way to go.
If you are making hundreds of requests a minute for the data, then any network request may be too much, but consider the consequences of the data being a little stale. For example, if you update a record and the new value is not immediately available on every server, does your application break, or does the change just show up in a more phased way? In that case you could keep your in-process memory cache and just use a shorter time to live.
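For example, a minimal short-TTL sketch (ShortTtlCache, GetOrLoad, and the 30-second window are illustrative, not an existing API in your app):

    using System;
    using Microsoft.Extensions.Caching.Memory;

    // Keeps the existing per-node MemoryCache, but with a short TTL so stale
    // data on other nodes self-corrects within a bounded window.
    public class ShortTtlCache
    {
        private readonly IMemoryCache _cache = new MemoryCache(new MemoryCacheOptions());

        public T GetOrLoad<T>(string key, Func<T> load)
        {
            return _cache.GetOrCreate(key, entry =>
            {
                entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromSeconds(30);
                return load();   // e.g. the database query
            });
        }
    }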
If you really need every change to propagate to every node straight away, then you could consider a library like CacheManager in conjunction with Redis, which can combine an in-memory cache with synchronisation against a remote cache.
Somewhat dated question, but maybe still useful: I agree with what ste-fu said, well explained.
I'll only add that, on top of CacheManager, you may want to take a look at FusionCache ⚡🦥, which I recently released.
On top of optionally supporting a transparently managed distributed 2nd layer, it also has some other nice features, like an optimization that prevents multiple concurrent factory executions for the same cache key (less load on the source database), a fail-safe mechanism, and advanced timeouts with background factory completion.
If you will give it a chance please let me know what you think.
/shameless-plug

SQL Server High Frequency Inserts

I have a system where data is inserted through a stored procedure that's called via a WCF service.
In the system we currently have 12,000+ actively logged-in users who call the WCF service every 30 seconds (effectively a minimum of 200 requests per second).
On the SQL Server side, CPU usage shoots to 100%, and when I examined it, more than 90% of the time was spent in DB writes. This affects overall server performance.
I need suggestions to resolve this issue so that we have fewer DB write operations and more CPU remains free.
I am open to integrating another DB server, Entity Framework, or any other ORM combination if needed. I need a solution to handle this issue.
Other information that might be helpful:
Table has no indexes defined
Database has growth factor set to 200MB.
SQL Server Version is 2012.
Simple solution: batch the writes. Do not call into SQL Server for every insert.
Make a service that collects the rows and submits them more coarsely. The main problem is that transaction handling is a little heavy cost-wise - in cases like this it makes sense to batch.
Do not call the SP for every row; load the rows into a temp or staging table and then process them in bulk (or use a table-valued parameter to hand the SP multiple rows at once).
This gets rid of a lot of issues, including a ton of commits (you are basically asking for around 200 transactions per second, which is quite heavy and not needed here).
How you do that is up to you - but for something this heavy I would stay away from an ORM (Entity Framework is hilariously bad at batching - it would turn into tons of SP calls) and use handcrafted SQL, at least for this part. I love ORMs, but it is always nice to have a high-performance, hand-crafted approach when needed.
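A rough sketch of that batching idea (table, column, and procedure names are placeholders, and the use of SqlBulkCopy into a staging table plus one set-based SP call is just one way to do it):

    using System;
    using System.Collections.Concurrent;
    using System.Data;
    using System.Data.SqlClient;

    // The service buffers incoming rows and flushes them in bulk instead of
    // calling the stored procedure once per row.
    public class ReadingBuffer
    {
        private readonly ConcurrentQueue<Tuple<int, DateTime, decimal>> _pending =
            new ConcurrentQueue<Tuple<int, DateTime, decimal>>();
        private readonly string _connectionString;

        public ReadingBuffer(string connectionString)
        {
            _connectionString = connectionString;
        }

        // Called by the WCF operation instead of executing the SP directly.
        public void Add(int userId, DateTime stamp, decimal value)
        {
            _pending.Enqueue(Tuple.Create(userId, stamp, value));
        }

        // Call from a timer, e.g. every second or two: one bulk insert plus one
        // set-based SP call replaces hundreds of individual transactions.
        public void Flush()
        {
            var table = new DataTable();
            table.Columns.Add("UserId", typeof(int));
            table.Columns.Add("Stamp", typeof(DateTime));
            table.Columns.Add("Value", typeof(decimal));

            Tuple<int, DateTime, decimal> row;
            while (_pending.TryDequeue(out row))
                table.Rows.Add(row.Item1, row.Item2, row.Item3);
            if (table.Rows.Count == 0)
                return;

            using (var connection = new SqlConnection(_connectionString))
            {
                connection.Open();

                // Bulk-load the whole batch into a staging table...
                using (var bulk = new SqlBulkCopy(connection) { DestinationTableName = "dbo.StagingReadings" })
                {
                    bulk.WriteToServer(table);
                }

                // ...then let one stored procedure process the batch set-based.
                using (var command = new SqlCommand("dbo.ProcessStagedReadings", connection))
                {
                    command.CommandType = CommandType.StoredProcedure;
                    command.ExecuteNonQuery();
                }
            }
        }
    }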

Store a database table in memory in a C# website application?

I have noticed that our web application queries a particular table an enormous amount of times. The table is relatively small, with only about a hundred rows that are used.
I'm wondering if there is a way to hold this table in memory in the web application, refreshing it every 15 minutes or so, so that the system doesn't have to make so many queries for the same information over and over again. It would be shared across many different users.
The table is the Client table, so users log in from many different clients. The data is pretty static, probably getting updated perhaps once a day.
Update: SQL Profiler shows the query is run quite a bit, and that's what concerns me. The website is not notably slow; I just thought this could help make it even faster.
If the table is small and frequently queried, there is an outstanding chance that the data and any indices are entirely in SQL Server's memory, the query plan is cached, and the query will be extremely fast.
Measure the actual performance impact before making any changes.
If you see there is a performance impact, there are many caching strategies that you can use to reduce trips to the database. More information about access patterns to the table and the need for information consistency would be needed to recommend a specific caching strategy.
You state
to get the same information over and over again
but also state
once every 15 minutes
If the information really is the same over and over, you can load it once into the ASP.Net cache at application start. If it might change every so often, but it is OK for the data to be a little out-of-date for a given user, you can use a time-based cache expiration policy. If the data changes only every so often but must be up-to-date immediately after it changes, you can consider a SQL Dependency for cache expiration.
For more information on ASP.Net caching see
http://msdn.microsoft.com/en-us/library/xsbfdd8c(v=vs.100).aspx
and specifically
http://msdn.microsoft.com/en-us/library/6hbbsfk6(v=vs.100).aspx
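A minimal sketch of the load-once / time-based-expiry approach for classic ASP.NET (the Client class, LoadClientsFromDb, and the 15-minute window are placeholders; a SqlCacheDependency could be passed instead of null if you need immediate invalidation):

    using System;
    using System.Collections.Generic;
    using System.Web;
    using System.Web.Caching;

    public class Client
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static class ClientCache
    {
        private const string Key = "AllClients";

        public static IList<Client> GetClients()
        {
            var clients = HttpRuntime.Cache[Key] as IList<Client>;
            if (clients == null)
            {
                clients = LoadClientsFromDb();              // the one real query
                HttpRuntime.Cache.Insert(
                    Key,
                    clients,
                    null,                                   // or a SqlCacheDependency
                    DateTime.UtcNow.AddMinutes(15),         // absolute expiration
                    Cache.NoSlidingExpiration);
            }
            return clients;
        }

        private static IList<Client> LoadClientsFromDb()
        {
            // Placeholder: replace with the real Client query.
            return new List<Client>();
        }
    }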
My suggestion would be to create a WCF Windows service - using REST you could easily cache the data read from a SqlDataReader (or other DataReader) and implement a TTL so it re-queries at an interval.
Well, there are a few solutions.
If you want to load the data into memory every 15 minutes, you should use one of the .NET caching libraries, for example System.Runtime.Caching, where you can set expiration policies and more.
You could try optimizing your query with nonclustered indexes.
You could use AppFabric caching, or something similar.
And last, try adding more memory to the SQL Server machine.

ASP.NET Session - Use or not use and best practices for an e-commerce app

I have used ASP.NET in mostly intranet scenarios and pretty familiar with it but for something such as shopping cart or similar session data there are various possibilities. To name a few:
1) State-Server session
2) SQL Server session
3) Custom database session
4) Cookie
What have you used, what are your success stories or lessons learnt, and what would you recommend? This obviously makes a difference in a large-scale public website, so please comment on your experiences.
I have not mentioned in-proc since in a large-scale app this has no place.
Many thanks
Ali
The biggest lesson I learned was one I already knew in theory, but got to see in practice.
Removing all use of sessions entirely from an application (which does not necessarily mean from all of the site) is something we all know should bring a big improvement to scalability.
What I learnt was just how much of an improvement it could be. By removing the use of sessions, and adding some code to handle what they had handled before (which at each individual point was a performance loss, since each of those points was now doing more work than before), the overall gain was massive: actions that took many seconds, or even a couple of minutes, became sub-second; CPU usage became a fraction of what it had been; and the number of machines and the amount of RAM went from clearly not enough to cope to a rather over-indulgent amount of hardware.
If sessions cannot be removed entirely (people don't like the way browsers handle HTTP authentication, alas), moving most of their use into a few well-defined spots, ideally in a separate application on the server, can have a bigger effect than which session-storage method is used.
In-proc certainly can have a place in a large-scale application; it just requires sticky sessions at the load balancing level. In fact, the reduced maintenance cost and infrastructure overhead by using in-proc sessions can be considerable. Any enterprise-grade content switch you'd be using in front of your farm would certainly offer such functionality, and it's hard to argue for the cash and manpower of purchasing/configuring/integrating state servers versus just flipping a switch. I am using this in quite large scaled ASP.NET systems with no issues to speak of. RAM is far too cheap to ignore this as an option.
In-proc session (at least when using IIS 6) can be recycled at any time and is therefore not very reliable, because sessions end when the server decides, not when the session actually times out. Sessions will also expire when you deploy a new version of the web site, which is not true of server-based session providers. This can give your users a bad experience if you deploy in the middle of their session.
Using SQL Server is the best option because it is possible to have sessions that never expire. However, the cost of the server, disk space, its maintenance, and performance all have to be considered. I used one on my e-commerce app for several years until we changed providers to one with very little database space. It was a shame that it had to go.
We have been using the state service for about 3 years now and haven't had any issues. That said, we now have the session timeout set at an hour, and in e-commerce that is probably costing us some business vs the never-expire model.
When I worked for a large company, we used a clustered SQL Server in another application that was more critical to keep online. We had redundancy on every part of the system, including the network cards. Keep in mind that adding a state server or service adds a potential single point of failure for the application unless you go the clustered route, which is more expensive to maintain.
There was also an issue when we first switched to the SQL based approach where binary objects couldn't be serialized into session state. I only had a few and modified the code so it wouldn't need the binary serialization so I could get the site online. However, when I went back to fix the serialization issue a few weeks later, it suddenly didn't exist anymore. I am guessing it was fixed in a Windows Update.
If you are concerned about security, state server is a no-no. State server performs absolutely no access checks, anybody who is granted access to the tcp port state server uses can access or modify any session state.
In-proc is unreliable (and you mentioned that), so that's not worth considering.
Cookies aren't really a session-state replacement, since you can't store much data in them.
I vote for some kind of database-based storage (if session state is needed at all); it has the best potential to scale.

Real time data storage and access with .net

Does anyone have any experience with receiving and updating a large volume of data, storing it, sorting it, and visualizing it very quickly?
Preferably, I'm looking for a .NET solution, but that may not be practical.
Now for the details...
I will receive roughly 1000 updates per second - some are updates to existing records, some are new rows. It can also be very bursty, sometimes reaching 5000 updates and new rows.
By the end of the day, I could have 4 to 5 million rows of data.
I have to both store them and also show the user updates in the UI. The UI allows the user to apply a number of filters to the data to just show what they want. I need to update all the records plus show the user these updates.
I have a visual update rate of 1 fps.
Anyone have any guidance or direction on this problem? I can't imagine I'm the first one to have to deal with something like this...
At first thought, some sort of in-memory database seems right, but will it be fast enough for querying for updates near the end of the day, once the data set has grown large? Or does that all depend on smart indexing and queries?
Thanks in advance.
It's a very interesting and also challenging problem.
I would approach this with a pipeline design, with processors implementing sorting, filtering, aggregation, etc. The pipeline needs an async (thread-safe) input buffer that is processed in a timely manner (under a second, per your 1 fps requirement). If you can't keep up, you need to queue the data somewhere, on disk or in memory, depending on the nature of your problem.
Consequently, the UI needs to be implemented in a pull style rather than push; you only want to update it once every second.
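A minimal sketch of that buffer-and-drain idea (UpdatePipeline, the delegate names, and the one-second timer interval are illustrative):

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading;

    // Producers enqueue updates as fast as they arrive; a 1 Hz timer drains the
    // buffer, applies the batch, and only then asks the UI to refresh (pull, not push).
    public class UpdatePipeline<T>
    {
        private readonly ConcurrentQueue<T> _incoming = new ConcurrentQueue<T>();
        private readonly Action<IReadOnlyList<T>> _applyBatch;   // sort/filter/store the batch
        private readonly Action _refreshUi;                      // should marshal to the UI thread
        private Timer _timer;

        public UpdatePipeline(Action<IReadOnlyList<T>> applyBatch, Action refreshUi)
        {
            _applyBatch = applyBatch;
            _refreshUi = refreshUi;
        }

        // Called ~1000-5000 times per second by the feed handler.
        public void Enqueue(T update)
        {
            _incoming.Enqueue(update);
        }

        public void Start()
        {
            _timer = new Timer(_ => Drain(), null, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1));
        }

        private void Drain()
        {
            var batch = new List<T>();
            T item;
            while (_incoming.TryDequeue(out item))
                batch.Add(item);
            if (batch.Count == 0)
                return;

            _applyBatch(batch);   // one pass of sorting/filtering/aggregation per tick
            _refreshUi();         // a single UI update per second (the 1 fps requirement)
        }
    }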
For datastore you have several options. Using a database is not a bad idea, since you need the data persisted (and I guess also queryable) anyway. If you are using an ORM, you may find NHibernate in combination with its superior second level cache a decent choice.
Many of the considerations might also be similar to those Ayende made when designing NHProf, a realtime profiler for NHibernate. He has written a series of posts about them on his blog.
Maybe Oracle is a more appropriate RDBMS solution for you. The problem with your question is that at these "critical" levels there are too many variables and conditions you need to deal with - not only software, but the hardware you can get (it costs :)), connection speed, your expected typical user setup, and more and more and more...
Good Luck.
