Update:
As suggested, I will ask two separate questions with more detail.
This is probably a two-part question and maybe a common issue, but I can't figure it out.
We have a .NET web service using Entity Framework with a code-first approach against SQL Server 2012 (I think). We have a few tables, among them User, License and Product.
We continually need to get data from the database regarding users, their licenses and products. For this data we execute a rather large stored procedure which accesses all the tables, does some processing and delivers the data, e.g. the user with this userid has these licenses with these roles in relation to these products.
However, the execution of this stored procedure seems to regress over time and it becomes slower during the day. To prevent this, we run an optimization of the indexes every morning.
If the optimization is not executed every morning, the stored procedure goes from 200 ms execution time to 2000 ms execution time.
If anyone has insight into what is going on, I would appreciate it. My knowledge of SQL and SQL Server is limited.
HOWEVER,
To avoid these issues with the stored procedure, we have decided to rethink our strategy. For now, we have created a new table containing the key values from the other tables, e.g. userid, license id, role, productid. However, this means we have to maintain this new table every time the other tables are altered.
So, the second part of my question is: is a new table containing the key values, which we can easily fetch, a valid approach, or should we do something else entirely?
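To make the maintenance burden of such a table concrete, here is a minimal sketch of one way to keep it in sync: triggers on the source table. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the UserLicenseLookup table, its columns and the trigger names are hypothetical, not taken from the actual schema. On SQL Server the same idea could be expressed with T-SQL triggers or an indexed view; which of those fits is not something this thread settles.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a source table and a lookup table.

    All table, column and trigger names here are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE License (
            LicenseId INTEGER PRIMARY KEY,
            UserId    INTEGER NOT NULL,
            ProductId INTEGER NOT NULL,
            Role      TEXT    NOT NULL
        );

        -- Flattened table holding only the key values the service reads.
        CREATE TABLE UserLicenseLookup (
            UserId    INTEGER NOT NULL,
            LicenseId INTEGER NOT NULL,
            ProductId INTEGER NOT NULL,
            Role      TEXT    NOT NULL,
            PRIMARY KEY (UserId, LicenseId)
        );

        -- Each trigger mirrors one kind of change into the lookup table.
        CREATE TRIGGER License_after_insert AFTER INSERT ON License
        BEGIN
            INSERT INTO UserLicenseLookup (UserId, LicenseId, ProductId, Role)
            VALUES (NEW.UserId, NEW.LicenseId, NEW.ProductId, NEW.Role);
        END;

        CREATE TRIGGER License_after_update AFTER UPDATE ON License
        BEGIN
            UPDATE UserLicenseLookup
            SET UserId = NEW.UserId,
                LicenseId = NEW.LicenseId,
                ProductId = NEW.ProductId,
                Role = NEW.Role
            WHERE LicenseId = OLD.LicenseId;
        END;

        CREATE TRIGGER License_after_delete AFTER DELETE ON License
        BEGIN
            DELETE FROM UserLicenseLookup WHERE LicenseId = OLD.LicenseId;
        END;
        """
    )
    return conn


def licenses_for_user(conn, user_id):
    """Read a user's licenses from the lookup table only."""
    return conn.execute(
        "SELECT LicenseId, ProductId, Role FROM UserLicenseLookup "
        "WHERE UserId = ? ORDER BY LicenseId",
        (user_id,),
    ).fetchall()
```

The read path touches a single narrow table, while every write to License pays for one extra statement. That is the trade-off the question describes.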
The index is probably getting fragmented due to inserts or updates during the day. You could consider columnstore indexes. Try using SSMS query optimisation tool. Also consider hinting the query optimiser with loop joins if applicable.
It's very hard to answer in a vacuum. You could have an index problem with heavy inserts (I am guessing a GUID as a key), or you could have blocking issues due to heavy load, or spills to tempdb due to low memory or old statistics, etc. It's all a guessing game. The best advice I can give you is to hire a professional, because one day you will have to; at least then it is not going to be as bad as people without a clue thinking that they are fixing stuff (because all devs are "smart").
At least get a consultation, because without looking at the issue it is close to impossible to give you an answer.
At the very least, you need to post an execution plan for your stored procedure.
Related
We have a requirement to pull a large amount of data from a SQL Server 2005 database for reporting purposes. Our stored procedure returns more than 15,000 rows.
When I call the procedure from the application (MVC 4.0), the request times out! (Maybe because of the data size.)
Is there any best practice for reading such a large amount of data from a SQL Server 2005 database in an MVC 4.0 application?
You're seeing a timeout because your SQL query takes a long time to finish. This is not due to the size of the result (15,000 records is not a huge amount of data), but because the query runs inefficiently.
Maybe you're missing a couple of indices, maybe the stored procedure is written the wrong way - it's impossible to know from here. Try optimizing your query or database (if you have a DBA available, they can help. If not, the Management Studio can have some tips for you).
If you can't optimize the query or the database, you're left with increasing the time out, as others suggested.
I faced the same problem, but I had to render more than 148,000 records. The solution I used was multithreading: have one method that fetches the data from the database, and call that method in a separate thread. In my case the data loaded in less than 5 seconds. Multithreading lets you work with a large amount of data without the application lagging.
The first question is why you are not using a dataset and data source view in the report (if it is reporting in SQL Server).
If it's not Reporting Services and you only want to use C# code, then try to write a helper function for it.
see here for the timeout option
http://forums.asp.net/t/1040377.aspx
and also here for optimising the code and SP
enter link description here
Here are a couple of tips you can use to optimize this:
Optimize the query – see if you can optimize your query in some way. Add indices to your tables, check WHERE clauses and such. I can't really give you any specific recommendations without seeing the query and knowing the schema. See what others have already suggested on this topic.
Limit the amount of data the stored procedure returns – my guess is that the MVC app doesn't really need all 15k rows at once, but far fewer. Check out this post: efficient way to implement paging. This will not speed up the query much, but it will make the app more efficient.
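As a rough illustration of the paging tip, here is a minimal sketch of keyset paging, where each request asks only for the rows after the last key it has already seen. It runs against an in-memory SQLite database via Python's sqlite3 module. The ReportRow table and the function names are made up for the example. SQLite's LIMIT is a stand-in: on SQL Server 2005 the same shape would more likely use TOP or ROW_NUMBER(), which this sketch does not show.

```python
import sqlite3


def make_demo_db(row_count):
    """Build a hypothetical ReportRow table with ids 1..row_count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ReportRow (Id INTEGER PRIMARY KEY, Payload TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO ReportRow (Id, Payload) VALUES (?, ?)",
        [(i, f"row {i}") for i in range(1, row_count + 1)],
    )
    return conn


def fetch_page(conn, after_id, page_size):
    """Return up to page_size rows whose Id is greater than after_id."""
    return conn.execute(
        "SELECT Id, Payload FROM ReportRow WHERE Id > ? ORDER BY Id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


def iter_pages(conn, page_size):
    """Yield pages one at a time until the table is exhausted."""
    after_id = 0  # assumes ids are positive, as in the demo data
    while True:
        page = fetch_page(conn, after_id, page_size)
        if not page:
            return
        yield page
        after_id = page[-1][0]  # the last Id seen becomes the next cursor
```

A web page would normally call fetch_page once per request and hand the last Id back to the client as the cursor, rather than looping over every page on the server.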
I am developing a REST like API in which an initial log in call will be made, a database row created, and a "log in session key"(GUID/uniqueidentifier) will be returned to the client. This key will then be used on all subsequent calls to the API as a security check, until log-out. On EVERY API call I plan to ask the database to look up the row using the key and, if the row's time stamp has not yet expired, allow the API to serve what it needs to.
For a simple select statement that may happen hundreds of times per "log in session," would pure SQL perform better than LinqToSQL in this scenario?
Pure SQL should perform better than LinqToSQL (if only because it doesn't need to construct and cache the query the first time). However, whether you should be concerned about it depends on the number of users, the server capabilities, and how fast those "hundreds of times" happen.
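For reference, the per-call check described in the question boils down to a single-row lookup on the key column. Below is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server; the LoginSession table, its columns, the 30-minute default lifetime and the function names are all assumptions made for illustration. Timestamps are passed in as plain integer seconds so the expiry logic is easy to follow.

```python
import sqlite3
import uuid


def open_store():
    """Create a hypothetical session table keyed by the session GUID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE LoginSession ("
        " SessionKey TEXT PRIMARY KEY,"
        " UserId INTEGER NOT NULL,"
        " ExpiresAt INTEGER NOT NULL)"
    )
    return conn


def log_in(conn, user_id, now, ttl_seconds=1800):
    """Insert a session row and return its key to hand to the client."""
    key = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO LoginSession (SessionKey, UserId, ExpiresAt) VALUES (?, ?, ?)",
        (key, user_id, now + ttl_seconds),
    )
    return key


def user_for_session(conn, session_key, now):
    """Return the user id for a live session, or None if unknown or expired."""
    row = conn.execute(
        "SELECT UserId FROM LoginSession WHERE SessionKey = ? AND ExpiresAt > ?",
        (session_key, now),
    ).fetchone()
    return row[0] if row else None
```

Because the key is the primary key, the lookup can be served from an index whichever data access layer issues it. The parameterized query also keeps the client-supplied key out of the SQL text.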
This is a long article (5 parts) from 2007 that illustrates that, with proper tweaking, the author was able to get Linq-to-sql to perform as well or nearly as well as straight ADO.NET sql calls, with less time spent on building the queries. Hope it helps!
http://blogs.msdn.com/b/ricom/archive/2007/06/22/dlinq-linq-to-sql-performance-part-1.aspx
Here is another batch of testing that actually illustrates Linq as the winner in insert scenarios and has a slight edge when reading XML files.
http://www.codeproject.com/Articles/26431/Performance-Comparisons-LINQ-to-SQL-ADO-C#_Toc193731671
I have a lot of data which needs to be paired based on a few simple criteria. There is a time window (both records have a DateTime column): if one record is very close in time (within 5 seconds) to another, it is a potential match, and the record which is closest in time is considered a complete match. There are other fields which help narrow this down as well.
I wrote a stored procedure which does this matching on the server before returning the full, matched dataset to a C# application. My question is, would it be better to pull in the 1 million (x2) rows and deal with them in C#, or is SQL Server better suited to perform this matching? If SQL Server is, then what is the fastest way of pairing data using datetime fields?
Right now I select all records from Table 1/Table 2 into temporary tables, iterate through each record in Table 1, look for a match in Table 2 and store the match (if one exists) in a temporary table, then I delete both records in their own temporary tables.
I had to rush this piece for a game I'm writing, so excuse the bad (very bad) procedure... It works, it's just horribly inefficient! The whole SP is available on pastebin: http://pastebin.com/qaieDsW7
I know the SP is written poorly, so saying "hey, dumbass... write it better" doesn't help! I'm looking for help in improving it, or help/advice on how I should do the whole thing differently! I have about 3/5 days to rewrite it, I can push that deadline back a bit, but I'd rather not if you guys can help me in time! :)
Thanks!
Ultimately, compiling your data on the database side is preferable 99% of the time, as it's designed for data crunching (through the use of indexes, relations, etc). A lot of your code can be consolidated by using joins to compile the data in exactly the format you need. In fact, you can bypass almost all your temp tables entirely and just fill a master Event temp table.
The general pattern is this:
    INSERT INTO #Events
    SELECT <all interested columns>
    FROM
        FireEvent
        LEFT OUTER JOIN HitEvent ON <all join conditions for HitEvent>
This way you match all fire events to zero or more HitEvents. After our discussion in chat, you can even limit it to zero or one hit event by wrapping it in a subquery and using a window function for ROW_NUMBER() OVER (PARTITION BY HitEvent.EventID ORDER BY ...) AS HitRank and add a WHERE HitRank = 1 to the outer query. This is ultimately what you ended up doing and got the results you were expecting (with a bit of work and learning in the process).
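The join-plus-ROW_NUMBER() pattern can be tried out in miniature. The sketch below uses Python's sqlite3 module (window functions need SQLite 3.25 or newer) rather than SQL Server. The column names FiredAt and HitAt, the integer-second timestamps and the choice to partition by the fire event are assumptions for illustration, not the schema from the pastebin. With this partitioning, one hit event can still be picked for more than one fire event.

```python
import sqlite3

# Pair each fire event with its closest hit event within 5 seconds, if any.
MATCH_SQL = """
SELECT FireID, HitID
FROM (
    SELECT f.EventID AS FireID,
           h.EventID AS HitID,
           ROW_NUMBER() OVER (
               PARTITION BY f.EventID
               ORDER BY ABS(h.HitAt - f.FiredAt), h.EventID
           ) AS HitRank
    FROM FireEvent AS f
    LEFT OUTER JOIN HitEvent AS h
        ON ABS(h.HitAt - f.FiredAt) <= 5
) AS ranked
WHERE HitRank = 1
ORDER BY FireID
"""


def match_events(fire_events, hit_events):
    """Return (fire_id, hit_id or None) pairs for lists of (id, seconds)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE FireEvent (EventID INTEGER PRIMARY KEY, FiredAt INTEGER)")
    conn.execute("CREATE TABLE HitEvent (EventID INTEGER PRIMARY KEY, HitAt INTEGER)")
    conn.executemany("INSERT INTO FireEvent VALUES (?, ?)", fire_events)
    conn.executemany("INSERT INTO HitEvent VALUES (?, ?)", hit_events)
    return conn.execute(MATCH_SQL).fetchall()
```

Fire events with no hit inside the window keep their row because of the LEFT OUTER JOIN and come back with None as the hit id.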
If the data is already in the database, that is where you should do the work. You absolutely should learn to display and read query plans using SQL Server Management Studio, and become able to notice and optimize away expensive computations like nested loops.
Your task probably does not require any use of temporary tables. Temporary tables tend to be efficient when they are relatively small and/or heavily reused, which is not your case.
I would advise you to try to optimize the stored procedure if it is not running fast enough, and not to rewrite it in C#. Why would you want to transfer millions of rows out of SQL Server anyway?
Unfortunately I don't have an SQL Server installation so I can't test your script, but I don't see any CREATE INDEX statements in there. If you didn't just skip them for brevity, then you should surely analyze your queries and see which indexes are needed.
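To show what checking whether an index is needed can look like, here is a small sketch using SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. SQL Server exposes the same kind of information through its execution plans, which this sketch does not reproduce. The HitEvent table, the HitAt column and the index name are assumptions chosen to echo the question.

```python
import sqlite3

# Range query of the kind a time-window match would issue.
RANGE_QUERY = "SELECT EventID, HitAt FROM HitEvent WHERE HitAt BETWEEN ? AND ?"


def plan_for(conn, sql, params):
    """Return the human-readable steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


def compare_plans():
    """Return the plan for RANGE_QUERY before and after adding an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE HitEvent (EventID INTEGER PRIMARY KEY, HitAt INTEGER)")
    before = plan_for(conn, RANGE_QUERY, (100, 105))
    conn.execute("CREATE INDEX IX_HitEvent_HitAt ON HitEvent (HitAt)")
    after = plan_for(conn, RANGE_QUERY, (100, 105))
    return before, after


if __name__ == "__main__":
    before, after = compare_plans()
    print("without index:", before)
    print("with index:   ", after)
```

Without the index the plan reports a scan of the whole table. With it, the plan reports a search that uses IX_HitEvent_HitAt. The exact wording of the plan text varies between SQLite versions.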
So the answer depends on several factors like resources available per client/server (Ram/CPU/Concurrent Users/Concurrent processes, etc.)
Here are some basic rules that will improve your performance regardless of what you use:
Loading a million rows into a C# program is not good practice, unless it is a standalone process with plenty of RAM.
Uniqueidentifiers will never outperform integers. Comparisons
Common table expressions are a good alternative for fast matching. How to use CTE
Finally, you have to consider output. If there is constant reading and writing that affects the user interface, then you should manage that in memory (C#); otherwise all CRUD operations should be kept inside the database.
I wonder if somebody could point me in the right direction. I've recently started playing with LinqToSQL and love the strongly typed data objects etc.
I'm just struggling to understand the impact on database performance etc. For example, say I was developing a simple user profile page. The page shows basic information about the user, some information on their recent activity, and a list of unread notifications.
If I was developing a stored procedure for this page, I could create a single SP which returns multiple datatables covering all of the required information - resulting in a single db call.
However, using LinqToSQL, this could result in many calls: one for user info, at least one for activity, at least one for notifications, and if I then want further info on notifications this may result in further calls.
Should I be worried about the number of db calls happening as a result of using this design pattern? I.e., are the multiple db handshakes etc. going to degrade my db?
I'd appreciate your thoughts on this!
Thanks
David
LINQ to SQL can consume multiple results from a stored proc if you need to go that route. Unfortunately, the designer has problems mapping them correctly, so you will probably need to create your mapping manually. See http://www.thinqlinq.com/Default/Using-LINQ-to-SQL-to-return-Multiple-Results.aspx.
You can configure LINQ to SQL to eagerly load the child records if you know that you're going to need them for every parent record. Use the DataLoadOptions and .LoadWith to configure it.
You can also project an object graph with multiple child collections in the Select clause of a LINQ query to reduce the number of DB hits that you make.
Ultimately, you need to check a number of options to determine which route is the best performance for your situation. It's not a one size fits all scenario.
Is it worse from a performance standpoint? Yes, it should be. Multiple round trips are usually worse than a single one.
The real question is, do you mind? Is your application going to receive enough visits to warrant the added complexity of a stored procedure? Or do you value the simplicity of future modifications over raw performance?
In any case, if you need the performance, you can create a stored procedure and map it on your context. This will give you one single call, but return the data as objects.
Here is an article explaining a bit about that option:
linq-to-sql-returning-multiple-result-sets
I have developed a network application that has been in use in my company for the last few years.
At the start it managed information about users, rights etc.
Over time it grew with other functionality. It grew to the point that I have tables with, let's say, 10-20 columns and even 20,000 - 40,000 records.
I keep hearing that Access is not good for multi-user environments.
Second thing is the fact that when I try to read some records from the table over the network, the whole table has to be pulled to the client.
It happens because there is no database engine on the server side and data filtering is done on the client side.
I would migrate this project to SQL Server, but unfortunately it cannot be done in this case.
I was wondering if there is a more reliable solution for me than using an Access database while still staying with a single-file database system.
We have quite a large system using dBase IV.
As far as I know it is fully multiuser database system.
Maybe it will be good to use it instead of Access?
What makes me not sure is the fact that dBase IV is much older than Access 2000.
I am not sure if it would be a good solution.
Maybe there are some other options?
If you're having problems with your Jet/ACE back end with the number of records you mentioned, it sounds like you have schema design problems or an inefficiently-structured application.
As I said in my comment to your original question, Jet does not retrieve full tables. This is a myth propagated by people who don't have a clue what they are talking about. If you have appropriate indexes, only the index pages will be requested from the file server (and then, only those pages needed to satisfy your criteria), and then the only data pages retrieved will be those that have the records that match the criteria in your request.
So, you should look at your indexing if you're seeing full table scans.
You don't mention your user population. If it's over 25 or so, you probably would benefit from upsizing your back end, especially if you're already comfortable with SQL Server.
But the problem you described for such tiny tables indicates a design error somewhere, either in your schema or in your application.
FWIW, I've had Access apps with Jet back ends with 100s of thousands of records in multiple tables, used by a dozen simultaneous users adding and updating records, and response time retrieving individual records and small data sets was nearly instantaneous (except for a few complex operations like checking newly entered records for duplication against existing data -- that's slower because it uses lots of LIKE comparisons and evaluation of expressions for comparison). What you're experiencing, while not an Access front end, is not commensurate with my long experience with Jet databases of all sizes.
You may wish to read this informative thread about Access: Is MS Access (JET) suitable for multiuser access?
For the record this answer is copied/edited from another question I answered.
Aristo,
You CAN use Access as your centralized data store.
It is simply NOT TRUE that Access will choke in multi-user scenarios--at least up to 15-20 users.
It IS true that you need a good backup strategy with the Access data file. But last I checked you need a good backup strategy with SQL Server, too. (With the very important caveat that SQL Server can do "hot" backups but not Access.)
So...you CAN use Access as your data store. Then if you can get beyond the company politics controlling your network, perhaps then you could begin moving toward upfitting your current application to use SQL Server.
I recently answered another question on how to split your database into two files. Here is the link.
Creating the Front End MDE
Splitting your database file into front end : back end is sort of a key to making it more performant. (Assume, as David Fenton mentioned, that you have a reasonably good design.)
If I may mention one last thing...it is ridiculous that your company won't give you other deployment options. Surely there is someone there with some power who you can get to "imagine life without your application." I am just wondering if you have more power than you might realize.
Seth
The problems you experience with an Access Database shared amongst your users will be the same with any file based database.
A read will pull a lot of data into memory and writes are guarded with some type of file lock. Under your environment it sounds like you are going to have to make the best of what you have.
"Second thing is the fact that when I try to read some records from the table over the network, the whole table has to be pulled to the client. "
Actually no. This is a common misstatement spread by folks who do not understand the nature of how Jet, the database engine inside Access, works. Pulling down all the records, or excessive number of records, happens because you don't have all the fields used in the selection criteria or sorting in the index. We've also found that indexing yes/no aka boolean fields can also make a huge difference in some queries.
What really happens is that Jet brings down the index pages and data pages which are required. While this is a lot more data than a client-server database engine would send over the network, it is not the entire table.
I also have clients with 600K and 800K records in various tables and performance is just fine.
We have an Access database application that is used pretty heavily. I have had 23 users on all at the same time before without any issues. As long as they don't access the same record then I don't have any problems.
I do have a couple of forms that are used and updated by several different departments. For instance I have a Quoting form that contains 13 different tabs and 10-20 fields on each tab. Users are typically in a single record for minutes editing and looking for information. To avoid any write conflicts I call the below function any time a field is changed. As long as it is not a new record being entered, then it updates.
    Function funSaveTheRecord()
        If ([chkNewRecord].Value = False And Me.Dirty) Then
            'To save the record, turn off the form's Dirty property
            Me.Dirty = False
        End If
    End Function
The way I have everything set up is as follows:
PDC.mdb <-- Front End, stored on the user's machine. Every user has their own copy. Links to tables found in PDC_be.mdb. Contains all forms, reports, queries, macros, and modules. I created a form that I can use to toggle the shift key bypass on/off. Only I have access to it.
PDC_be.mdb <-- Back End, stored on the server. Contains all data. The only form and VBA it contains is to toggle the shift key bypass on/off. Only I have access to it.
Secured.mdw <-- Security file, stored on the server.
Then I put a shortcut on the user's desktop that ties the security file to the front end and also provides their login credentials.
This database has been running without error or corruption for over 6 years.
Access is not a flat file database system! It's a relational database system.
You can't use SQL Server Express?
Otherwise, MySQL is a good database.
But if you can't install ANYTHING (you should get into those politics sooner rather than later -- or it WILL be later), just use your existing database system.
Basically, Access cannot handle more than 5 people connected at the same time, or it will corrupt on you.