I have the following situation:
.net 3.5 WinForm client app accessing SQL Server 2008
Some queries that return a relatively large amount of data are used quite often by one form
Users are using local SQL Express and restarting their machines at least daily
Other users are working remotely over slow network connections
The problem is that after a restart, the first time users open this form the queries are extremely slow and take roughly 15s to execute even on a fast machine. Afterwards the same queries take only about 3s. Of course this comes from the fact that no data is cached yet and must first be loaded from disk.
My question:
Would it be possible to force the loading of the required data in advance into SQL Server cache?
Note
My first idea was to execute the queries in a background worker when the application starts, so that by the time the user opens the form the data is already cached and the queries run fast. However, I don't want to pull the query results over to the client, since some users work remotely or otherwise have slow networks.
So I thought of just executing the queries from a stored procedure and putting the results into temporary tables, so that nothing would be returned.
It turned out that some of the result sets have dynamic columns, so I couldn't create the corresponding temp tables up front, and thus this isn't a solution.
Do you happen to have any other idea?
Are you sure this is the execution plan being created, or is it server memory caching that's going on? Maybe the first query loads quite a bit of data, but subsequent queries can use the already-cached data, and so run much quicker. I've never seen an execution plan take more than a second to generate, so I'd suspect the plan itself isn't the cause.
Have you tried running the index tuning wizard on your query? If it is the plan that's causing problems, maybe some statistics or an additional index will help you out, and the optimizer is pretty good at recommending things.
I'm not sure how you are executing your queries, but you could do:
SqlCommand command = /* your command */;
// SchemaOnly compiles the plan and returns metadata, but no rows
command.ExecuteReader(CommandBehavior.SchemaOnly).Dispose();
Executing your command with the schema-only behavior adds SET FMTONLY ON to the query, which makes SQL Server return metadata about the result set (requiring the plan to be generated) without actually executing the command.
To narrow down the source of the problem you can always use the SQL Server Objects in perfmon to get a general idea of how the local instance of SQL Server Express is performing.
In this case you would most likely see a lower Buffer Cache Hit Ratio on the first request and a higher number on subsequent requests.
Also, you may want to check out http://msdn.microsoft.com/en-us/library/ms191129.aspx. It describes how you can set a sproc to run automatically when the SQL Server service starts up.
If you retrieve the data you need with that sproc, the data should still be cached the first time an end user pulls it up via your form.
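For reference, marking a warm-up procedure as a startup procedure is done with sp_procoption; the procedure has to live in the master database and take no parameters. dbo.WarmQueryCache is a hypothetical name:

-- run in master; startup procedures must live there and take no parameters
EXEC sp_procoption @ProcName = N'dbo.WarmQueryCache',
                   @OptionName = 'startup',
                   @OptionValue = 'on';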
In the end I still used the approach I tried first: executing the queries from a stored procedure and putting the results into temporary tables so that nothing is returned to the client. This 'caching' stored procedure is executed in the background whenever the application starts.
It just took some time to write the temporary tables, as the result sets are dynamic.
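For anyone attempting the same trick, here is a minimal sketch of what such a 'caching' procedure can look like (the procedure name, tables, and filters are placeholders). SELECT ... INTO is convenient here because SQL Server infers the temp table's schema from the result set:

CREATE PROCEDURE dbo.WarmQueryCache
AS
BEGIN
    SET NOCOUNT ON;  -- nothing is sent back to the client

    -- run the expensive queries; SELECT ... INTO infers the temp table schema
    SELECT * INTO #warm1 FROM dbo.BigTable1;  -- placeholder for the form's first query
    SELECT * INTO #warm2 FROM dbo.BigTable2;  -- placeholder for the second query

    -- the temp tables are dropped when the proc ends; the pages stay in the buffer pool
END

The client then fires this once at startup with ExecuteNonQuery on a background thread; no result set ever crosses the network.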
Thanks to all of you for your really fast help on the issue!
Related
I'm working on a web application project in Visual Studio 2003 that executes queries against an Oracle database.
What I want to do is set the query timeout value for all queries in this application to a very large value, or to infinity, so that when a query fails it is not immediately re-executed; if the same query keeps being re-executed over and over, the DB gets overloaded!
We noticed that around 60 sessions end up open and executing this query at the same time when it fails to execute!
Now, my questions:
Since I'm new to Oracle, where do I find the query timeout setting, and how do I change it to a very large value? And what if I have many queries in my application? Do I need to change the timeout value for each single query?
What is the difference between the HTTP/SOAP timeout value and the query timeout value?
Thank you guys!
I looked for the CommandTimeout property, which should be on the OracleCommand class, but I can't find it.
Are there other ways to use this property to set the timeout?
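For what it's worth, here is a sketch of setting a per-command timeout, assuming the ODP.NET provider (Oracle.DataAccess.Client), whose OracleCommand exposes a CommandTimeout property in seconds; the Microsoft System.Data.OracleClient provider has a property of the same name but reportedly does not honor it:

using Oracle.DataAccess.Client;  // ODP.NET; assumed to be available in the project

using (OracleConnection conn = new OracleConnection(connectionString))
using (OracleCommand cmd = conn.CreateCommand())
{
    cmd.CommandText = "SELECT * FROM some_table";  // placeholder query
    cmd.CommandTimeout = 600;  // seconds; 0 is usually documented as "no limit"
    conn.Open();
    using (OracleDataReader reader = cmd.ExecuteReader())
    {
        // ... consume the results ...
    }
}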
In an Oracle database there is no default query timeout value, but you can achieve something similar using the Resource Manager, which can cancel a query when it estimates that the query will take more than X amount of time. It would be good to use this to protect the database server from rogue queries that no one will be waiting for anyway. This is roughly the opposite of setting the timeout to unlimited, and in the end it makes sure that the correctly working parts of the app are not hindered by the bad parts.
See Managing Resources with Oracle Database Resource Manager if you really want to protect your users from badly performing applications.
Most likely, an earlier timeout comes at the web tier.
I have a simple stored procedure in T-SQL that is instant when run from SQL Server Management Studio, and has a simple execution plan. It's used in a C# web front-end, where it is usually quick, but occasionally seems to get itself into a state where it sits there and times-out. It then does this consistently from any web-server. The only way to fix it that I’ve found is to drop and recreate it. It only happens with a single stored procedure, out of a couple of hundred similar procedures that are used in the application.
I’m looking for an answer that’s better than making a service to test it every n minutes and dropping and recreating on timeout.
As pointed out by other responses, the reasons could be many, varying from execution plan, to the actual SP code, to anything else. However, in my past experience, I faced a similar problem due to 'parameter sniffing'. Google for it and take a look, it might help. Basically, you should use local variables in your SP instead of the parameters passed in.
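A minimal sketch of the local-variable workaround (procedure, table, and column names are made up). Copying the parameter into a local variable hides the sniffed value from the optimizer, so the plan is built for the column's average selectivity rather than for the first caller's value:

CREATE PROCEDURE dbo.GetOrders
    @CustomerId INT
AS
BEGIN
    DECLARE @LocalCustomerId INT;
    SET @LocalCustomerId = @CustomerId;  -- the optimizer cannot sniff this value

    SELECT OrderId, OrderDate, Total
    FROM dbo.Orders
    WHERE CustomerId = @LocalCustomerId;
END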
Not sure if my situation is too uncommon to be useful to others (it involved the use of table variables inside the stored proc), but here is the story anyway.
I was working on an issue where a stored proc would take 10 seconds in most cases, but 3-4 minutes every now and then. After a little digging around, I found a pattern in the issue:
This being a stored proc that takes in a start date and an end date, if I ran it for a year's worth of data (which is what people normally do), it ran in 10 sec. However, when the query plan cache was cleared out and someone ran it for a single day (an uncommon use case), all further calls for a 1-year range would take 3-4 minutes, until I did a DBCC FREEPROCCACHE.
The following two things were what fixed the problem:
My first suspect was parameter sniffing. I fixed it immediately using the local-variable approach. This, however, improved performance only by a small percentage (<10%).
In a clutching-at-straws approach, I changed the table variables that the original developer had used in this stored proc to temp tables. This was what finally fixed the issue. Now that I know this was the problem, I am doing some reading online and have come across a few links such as
http://www.sqlbadpractices.com/using-table-variable-for-large-table-vs-temporary-table/
which seem to correspond with the issue I am seeing.
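The change itself was essentially the following (the schema is invented for illustration). The point is that temp tables get statistics and can trigger recompiles as row counts change, while a table variable is assumed by the optimizer to hold very few rows:

-- before: table variable; the optimizer assumes roughly one row
DECLARE @Working TABLE (Id INT PRIMARY KEY, Amount DECIMAL(18, 2));

-- after: temp table; statistics are created, so plans adapt to the real row count
CREATE TABLE #Working (Id INT PRIMARY KEY, Amount DECIMAL(18, 2));
-- ... same insert/select logic as before, just against #Working ...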
Happy coding!!
It's hard to say for sure without seeing SP code.
Some suggestions.
SQL Server by default reuses the execution plan for a stored procedure. The plan is generated upon the first execution. That may cause a problem: for example, the first time you provide input with very high selectivity, SQL Server generates the plan with that in mind. The next time you pass low-selectivity input, the SP reuses the old plan, causing very slow execution.
Having different execution paths in SP causes the same problem.
Try creating this procedure WITH RECOMPILE option to prevent caching.
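For example (hypothetical procedure and table names); the option can also be applied to a single call if you don't want to pay the compile cost on every execution:

CREATE PROCEDURE dbo.MyProc
    @Param INT
WITH RECOMPILE  -- never cache a plan for this procedure
AS
    SELECT Col1, Col2 FROM dbo.SomeTable WHERE Col1 = @Param;

-- or, for one call only:
EXEC dbo.MyProc @Param = 1 WITH RECOMPILE;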
Hope that helps.
Run SQL Profiler and execute the procedure from the web site until it happens again. When it pauses / times out, check to see what is happening on the SQL Server itself.
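While it is hung, a couple of quick checks on the server will show whether blocking is involved (the second query assumes SQL Server 2005 or later):

EXEC sp_who2;  -- look at the BlkBy column for blocking chains

-- blocked requests and what they are waiting on
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;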
There are lots of possibilities here depending on what the s'proc actually does. For example, if it is inserting records then you may have issues where the database server needs to expand the database and/or log file size to accept new data. If that's happening on the log file and you have slow drives or are nearing the limit of your drive space, then it could time out.
If it's a select, then those tables might be locked for a period of time due to other inserts happening... Or it might be reusing a bad execution plan.
The drop / recreate dance may only be delaying the execution to the point where the SQL Server can catch up, or it might be causing a recompile.
My original thought was that it was an index, but on further reflection I don't think that dropping and recreating the stored proc would help with that.
It is most probably your cached execution plan that is causing this.
Try using DBCC FREEPROCCACHE to clear your plan cache the next time this happens. Read more here: http://msdn.microsoft.com/en-us/library/ms174283.aspx
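That is a one-liner; if you would rather not flush every plan on the server, sp_recompile marks just one object so only its plan is rebuilt (the procedure name is a placeholder):

DBCC FREEPROCCACHE;               -- flushes the entire plan cache

EXEC sp_recompile N'dbo.MyProc';  -- narrower: invalidates only this procedure's plan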
Even this is a reactive step; it does not really solve the issue.
I suggest you execute the procedure in SSMS and check out the actual Execution Plan and figure out what is causing the delay. (in the Menu, go to [View] and then [Include Actual Execution Plan])
Let me just suggest that this might be unrelated to the procedure itself, but to the actual operation you are trying to do on the database.
I'm no MS SQL expert, but I wouldn't be surprised if it behaves similarly to Oracle when two concurrent transactions try to delete the same row: the transaction that reaches the deletion first locks the row, and the second transaction is then blocked until the first one either commits or rolls back. If that is attempted from your procedure, it might appear "stuck" (until the "locking" transaction is finished).
Do you have any long-running transactions that might lock rows that your procedure is accessing?
On a WPF application already in production, users have a window where they choose a client. It shows a list with all the clients and a TextBox where they can search for a client.
As the client base has grown, this has become exceptionally slow: around 1 minute for an operation that happens around 100 times each day.
Currently SQL Server Management Studio says the query select id, name, birth_date from client takes 41 seconds to execute (around 130,000 rows).
Are there any suggestions on how to improve this time? Indexes, ORMs, or direct SQL queries in code?
Currently I'm using .NET Framework 3.5 and LinqToSql.
If your query is actually SELECT id, name, birth_date from client (i.e., no WHERE clause), there is very little that you'll be able to do to speed it up short of new hardware. SQL Server will have to do a table scan to get all of the data. Even a covering index means that it will have to scan an index just as big as the table.
What you need to ask yourself is: is a list of 130,000 clients really useful for your users? Is anybody really going to scroll to the 75,613th entry in a list to find the user that they want? The answer is probably not. I would go with the search option only. At least then you can add indexes that make sense for those queries.
If you absolutely do need the entire list, try loading it lazily in chunks. Start with the first 500 records and then add more records as the user moves the scroll bar. That way the initial load time is reduced and the user will only load the data that is necessary.
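With LinqToSql, which the question mentions, chunked loading is a Skip/Take pair. The Clients table and the page size below are placeholders, and Skip requires an OrderBy so that the paging is deterministic:

// load one page of 500 clients; 'db' is the LinqToSql DataContext
var page = db.Clients
             .OrderBy(c => c.Name)      // Skip requires a stable ordering
             .Skip(pageIndex * 500)
             .Take(500)
             .ToList();                 // only these 500 rows cross the wire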
Why do you need the list of all the clients? Couldn't you just have the search TextBox that you describe and handle the search query on the server side? There you set a cap on the maximum number of returned rows for an individual client search (e.g. max 500 matches).
Alternatively, some efficiency gains may be achieved by caching the client data list on the server.
Indexing should not help, based on your query. You could use a view which caches the sorted query (assuming you're not ordering by the id?), but given SQL Server's built-in query cache for ad hoc queries, you're probably not going to see much gain there either. The ORM does add some overhead, but there are several tutorials out there for cutting its cost (e.g. http://www.sidarok.com/web/blog/content/2008/05/02/10-tips-to-improve-your-linq-to-sql-application-performance.html). The main points there that apply to you are to use compiled queries wherever possible, and to turn off optimistic concurrency for read-only data.
An even bigger performance gain could be realized by having your clients not hit the db directly. If you add a service layer in there (not necessarily a web service, but it could be) then the service class or application could put some smart caching in place, which would help by an order of magnitude for read-only queries like this.
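As a sketch of the compiled-query tip from the article above, assuming a LinqToSql DataContext named MyDataContext with a Clients table (both hypothetical):

using System;
using System.Data.Linq;
using System.Linq;

static class ClientQueries
{
    // compiled once; skips the expression-to-SQL translation on every call
    public static readonly Func<MyDataContext, string, IQueryable<Client>> SearchByName =
        CompiledQuery.Compile((MyDataContext db, string term) =>
            db.Clients.Where(c => c.Name.StartsWith(term)));
}

// usage: var matches = ClientQueries.SearchByName(db, "Smi").ToList();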
Go into SQL Server Management Studio and start a new query. In the Query menu, click "Include Client Statistics".
Run the query just as you would from code.
It will display the results and also a tab next to the results called "Client Statistics".
Click that and look at the "Wait time on server replies" value. This is in ms, and it's the time the server actually spent executing.
I just ran this query:
select firstname, lastname from leads
It took 3ms on the server to fetch 301,000 records.
The "Total Execution Time" was something like 483ms, which includes the time for SSMS to actually get the data and process it. My query took something like 2.5-3s to run in SSMS and the remaining time (2500ms or so) was actually for SSMS to paint the results etc.)
My guess is, the 41 seconds is probably not being spent on the SQL server, as 130,000 records really isn't that much. Your 41 seconds is probably largely being spent by everything after the SQL server returns the results.
If you find out SQL Server is taking a long time to execute, in the Query menu turn on "Include Actual Execution Plan" and rerun your query. A new tab appears called "Execution Plan"; this tab will show you what SQL Server is doing when you do a select on this table, as well as a percentage breakdown of where it spends all of its time. In my case it spent 100% of the time in a "Clustered Index Scan" of PK_Leads.
Edited to include more stats
In general:
Find out what takes so much time: executing the query, or retrieving the results.
If it's the query execution, the query plan will tell you which indexes are missing; just press the display-query-plan button in SSMS and you will get hints on which indexes you should create to increase performance.
If it's the retrieval of the values, there is not much you can do about it besides upgrading hardware (RAM, disk, network, etc.).
But:
In your case it looks like the query is a full table scan, which is never good for performance; check whether you really need to retrieve all this data at once.
Since there are no WHERE clauses whatsoever, it's unlikely that the query execution is the problem, meaning additional indexes will not help.
You will need to change the way the application accesses the data. Instead of loading all clients into memory and then searching them in memory, pass the search term on to the database query.
LinqToSql enables you to use different features for searching values; here is a blog describing most of them:
http://davidhayden.com/blog/dave/archive/2007/11/23/LINQToSQLLIKEOperatorGeneratingLIKESQLServer.aspx
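For instance (Clients and Name are hypothetical members of your DataContext), the string methods translate to LIKE on the server, so only the matching rows come back, and a Take caps the worst case:

// StartsWith becomes LIKE 'term%' (index-friendly); Contains becomes LIKE '%term%'
var matches = db.Clients
                .Where(c => c.Name.StartsWith(searchTerm))
                .Take(500)   // cap the result set, as suggested above
                .ToList();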
This is not a question about optimizing a SQL command. I'm wondering what ways there are to ensure that a SQL connection is kept open and ready to handle a command as efficiently as possible.
What I'm seeing right now is that I can execute a SQL command and that command will take ~1s; additional executions take ~300ms. This is after the command has previously been executed against the SQL Server (from another application instance), so the SQL cache should be fully populated for the executed query prior to this application's initial execution. As long as I continually re-execute the query I see times of about 300ms, but if I leave the application idle for 5-10 minutes and return, the next request will be back to ~1s (same as the initial request).
Is there a way, via the connection string or some property on the SqlConnection, to direct the framework to keep the connection hydrated and ready to handle queries efficiently?
Have you checked the execution plans for your procedures? Execution plans, I believe, are loaded into memory on the server and then get cleared after certain periods of time or depending on which tables etc. are accessed in the procedures. We've had cases where simplifying stored procedures (perhaps splitting them) reduces the amount of work the database server has to do in calculating the plans, and ultimately reduces the first-call time. You can issue commands to force stored procedures to recompile each time, for testing whether you are reducing the initial call time.
We've had cases where the complexity of a stored procedure made the database server continually recompile based on different parameters, which drastically slowed it down; splitting the SP, or simplifying large select statements into multiple update statements etc., helped a considerable amount.
Another idea is to intermittently call a simple getDate() or similar every so often so that the SQL Server stays awake (hope that makes sense), much the same as keeping an ASP.NET app in memory in IIS.
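A minimal keep-alive sketch along those lines, assuming an ordinary pooled connection string; the 4-minute interval is a guess and should be shorter than whatever idle threshold you are observing:

using System;
using System.Data.SqlClient;
using System.Threading;

// keep a reference to the timer so it is not garbage collected
Timer keepAlive = new Timer(delegate
{
    try
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("SELECT GETDATE();", conn))
        {
            conn.Open();
            cmd.ExecuteScalar();  // one trivial round trip keeps pool and server warm
        }
    }
    catch (SqlException) { /* best effort; ignore transient failures */ }
}, null, TimeSpan.Zero, TimeSpan.FromMinutes(4));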
The default minimum number of open connections in a .NET connection pool is zero.
You can adjust this value in your connection string to 1 or more:
"data source=dbserver;...Asynchronous Processing=true;Min Pool Size=1"
See more about these options in MSDN.
You keep it open by not closing it. :) But that's not advisable, since connection pooling will handle connection management for you. Do you have it enabled?
By default, connection pooling is enabled in ADO.NET; it is controlled through the connection string used by the application. More info in Using Connection Pooling with SQL Server.
If you use more than one database connection, it may be more efficient. Having one database connection means the best possible access speed is always limited to sequential execution, whereas having more than one connection gives an opportunity for concurrent access. I guess you're using .NET?
Also, if you're issuing the same SQL statement repeatedly, it's possible your database server is caching the result for a short period of time, making the return of the result set quicker.
I have an importer process which runs as a Windows service (in debug mode as an application). It processes various XML documents and CSVs and imports them into a SQL database. All was well until I had to process a large amount of data (120k rows) from another table (as I do with the XML documents).
I am now finding that the SQL Server's memory usage hits a point where it just hangs. My application never receives a timeout from the server, and everything just goes STOP.
I am still able to make calls to the database server separately, but that application thread is just stuck, with no obvious activity in SQL Activity Monitor and no activity in Profiler.
Any ideas on where to begin solving this problem would be greatly appreciated as we have been struggling with it for over a week now.
The basic architecture is C# 2.0 using NHibernate as an ORM. Data is pulled into the C# logic, processed, and then spat back into the same database, along with logs into other tables.
The only other problem which sometimes happens instead is that, for some reason, a cursor gets opened on this massive table, which I can only assume is being generated by ADO.NET. A statement like exec sp_cursorfetch 180153005,16,113602,100 is being called thousands of times, according to Profiler.
When are you COMMITting the data? Are there any locks or deadlocks (sp_who)? If 120,000 rows is considered large, how much RAM is SQL Server using? When the application hangs, is there anything about the point where it hangs (is it an INSERT, a lookup SELECT, or what?)?
It seems to me that the commit size is way too small. Usually in SSIS ETL tasks, I will use a batch size of 100,000 for narrow rows with sources over 1,000,000 in cardinality, but I never go below 10,000, even for very wide rows.
I would not use an ORM for large ETL, unless the transformations are extremely complex with a lot of business rules. Even still, with a large number of relatively simple business transforms, I would consider loading the data into simple staging tables and using T-SQL to do all the inserts, lookups etc.
Are you running this into SQL using BCP? If not, the transaction log may not be able to keep up with your input. On a test machine, try switching the recovery model to Simple, or use the BCP methods to get the data in (bulk loads are minimally logged, so they put far less pressure on the log).
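Switching the recovery model on the test machine is a one-liner (the database name is a placeholder); note that bulk-loaded data is minimally logged rather than completely unlogged:

ALTER DATABASE ImportTestDb SET RECOVERY SIMPLE;  -- test box only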
Adding on to StingyJack's answer ...
If you're unable to use straight BCP due to processing requirements, have you considered performing the import against a separate SQL Server (separate box), using your tool, then running BCP?
The key to making this work would be keeping the staging machine clean -- that is, no data except the current working set. This should keep the RAM usage down enough to make the imports work, as you're not hitting tables with -- I presume -- millions of records. The end result would be a single view or table in this second database that could be easily BCP'ed over to the real one when all the processing is complete.
The downside is, of course, having another box ... And a much more complicated architecture. And it's all dependent on your schema, and whether or not that sort of thing could be supported easily ...
I've had to do this with some extremely large and complex imports of my own, and it's worked well in the past. Expensive, but effective.
I found out that it was NHibernate creating the cursor on the large table. I have yet to understand why, but in the meantime I have replaced the data access for that large table with straightforward ADO.NET calls.
Since you are rewriting it anyway, you may not be aware that you can do bulk copies directly from .NET via the System.Data.SqlClient.SqlBulkCopy class. See this article for some interesting performance info.
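A minimal SqlBulkCopy sketch; the destination table name and the sourceTable variable are placeholders:

using System.Data;
using System.Data.SqlClient;

using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.ImportTarget";  // hypothetical target table
    bulk.BatchSize = 10000;          // rows per batch sent to the server
    bulk.BulkCopyTimeout = 0;        // 0 = no timeout for long-running loads
    bulk.WriteToServer(sourceTable); // a DataTable (or use an IDataReader overload)
}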