Building dynamic SQL queries in .net 2?

I could be reinventing the wheel, but...
I need to let users build 'customer reports' from our database through a GUI.
They can't have access to SQL, just a list of tables (data groups) and the columns within those groups.
They also need to be able to create WHERE clauses (criteria).
I've looked around Google, but nothing turned up.
Any ideas?

My recommendation: Developer Express has an excellent end-user criteria builder that you can use.
There are other controls for building end-user criteria, such as
http://devtools.korzh.com/query-builder-net/
I hope that helps.
Both controls abstract the data access layer, so your end users can't send a query directly to the database. The controls only build the criteria; sending the query to the database is your job.

As a precursor to my answer, there are a number of expensive products out there, like Izenda (www.izenda.com), that will do this very elegantly...
If you want to roll your own (and to speak to your question), you can fake this pretty quickly, although it does not scale well beyond about 4 joins:
Create a join statement that encompasses all of the tables you want to use.
Create a dictionary of all the fields you want to expose to the user, such as: Dictionary<[pretty displayable name], [fully qualified SQL field name]>
Let the user build a select list of the fields they want to see and the conditions they want to add from the dictionary above, then use the dictionary values to concatenate the SQL string that returns their results.
(I'm skipping quite a bit of validation work, such as making sure the user doesn't mistype a condition, but the essential point is that for a small collection of tables you can create a static "from" statement and then safely tack on the concatenated "select" and "where" that the user builds.)
Note that I've worked on some systems that actually stored the relationships between the tables and compiled the most efficient "from" statement possible... that is not a huge stretch from this answer, it's just a bit more work.
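
The dictionary approach above could be sketched roughly as follows. Everything named here is hypothetical (the `ReportQueryBuilder` class, the `Customers`/`Orders` tables, the column names, the operator list), and it uses C# features newer than .NET 2.0, so treat it as an illustration rather than a drop-in implementation. The idea is that user input only ever selects keys from a whitelist, and condition values travel as parameters instead of being concatenated into the SQL text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: table, column and class names are illustrative only.
public static class ReportQueryBuilder
{
    // Static FROM clause covering every table the user may report on.
    private const string FromClause =
        "FROM dbo.Customers c INNER JOIN dbo.Orders o ON o.CustomerId = c.CustomerId";

    // Pretty display name -> fully qualified SQL column name.
    private static readonly Dictionary<string, string> Fields =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "Customer Name", "c.Name" },
            { "Order Date", "o.OrderDate" },
            { "Order Total", "o.Total" }
        };

    // Only these comparison operators may appear in the generated SQL.
    private static readonly HashSet<string> Operators =
        new HashSet<string> { "=", "<>", "<", ">", "<=", ">=", "LIKE" };

    public sealed class Condition
    {
        public string Field;     // key into Fields
        public string Operator;  // member of Operators
        public object Value;     // sent as a parameter, never concatenated
    }

    // Returns the SQL text and fills 'parameters' with name/value pairs.
    public static string Build(
        IEnumerable<string> selectedFields,
        IEnumerable<Condition> conditions,
        IDictionary<string, object> parameters)
    {
        // Unknown display names throw KeyNotFoundException instead of reaching SQL.
        List<string> columns = selectedFields.Select(name => Fields[name]).ToList();
        if (columns.Count == 0)
            throw new ArgumentException("Select at least one field.");

        var where = new List<string>();
        foreach (Condition condition in conditions)
        {
            if (!Operators.Contains(condition.Operator))
                throw new ArgumentException("Operator not allowed: " + condition.Operator);

            string parameterName = "@p" + parameters.Count;
            parameters.Add(parameterName, condition.Value);
            where.Add(Fields[condition.Field] + " " + condition.Operator + " " + parameterName);
        }

        string sql = "SELECT " + string.Join(", ", columns) + " " + FromClause;
        if (where.Count > 0)
            sql += " WHERE " + string.Join(" AND ", where);
        return sql;
    }
}
```

Selecting "Customer Name" with a single condition ("Order Total", ">", 100) would produce `SELECT c.Name FROM dbo.Customers c INNER JOIN dbo.Orders o ON o.CustomerId = c.CustomerId WHERE o.Total > @p0`, with `@p0` = 100 in the parameter dictionary, ready to be copied onto a command object.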

I strongly recommend going with an existing product like Crystal Reports, SQL Server Report Builder, or InfoMaker. It's all too easy to build something that seems to work but leaves you open to a SQL injection attack.
If you do go ahead, I recommend using a separate SQL connection for these reports. This connection should use an account that has only read privileges on the database.

Thanks for the answers! We ended up doing this ourselves through a collection of views!
For instance:
Sales View
Customer View
The views already take care of most of the joining between tables and return as much joined data as they can.
The user then selects which columns they would like to see from each view, and we do the join between the views at code level.
The resulting statement is very small as the views take most of the work out of it.

Related

Preventing SQL injection while letting the user enter their own condition at the end of the query

In my program I have a query that shows Contacts information to the user. The table normally shows all rows with no filter. I would like to add an option for the user to write their own condition at the end of the SELECT statement so they can view just the information they need. So if my main query is something like this:
SELECT * FROM CONTACTS
The user can write the condition
WHERE FirstName LIKE '%Michael%'
In a textbox.
However, I am aware that this is not very safe and is prone to SQL Injection. But how can I prevent the user from entering malicious commands such as
WHERE 1=1; DROP TABLE Contacts
In the text box? For now I am using a check against some keywords e.g. if the filter contains DELETE, DROP, UPDATE, etc the query will not run. But I don't think this is a very safe solution.

Allowing a user to write "their own condition at the end of a {SQL} query" can create security holes in your application and leave it open to things like a SQL injection attack.
If you still wish to proceed, here are a few things to consider:
Use regular expressions to limit input to its most basic form (e.g. a phone number only allows 0-9 and hyphens).
Implement your protection mechanism at the lowest level (i.e. the stored procedure).
For dynamic queries in stored procedures, never pass field names into the stored procedure.
Never run with more privileges than necessary.
Users that log into an application with their own login should normally only have EXEC permissions on stored procedures.
If you use dynamic SQL, it should be confined to reading operations so that users only need SELECT permissions.
A web site that logs into a database should not have any elevated privileges, preferably only EXEC and (maybe) SELECT permissions.
Never let the web site log in as administrator!
Always use parameterized statements.
Do a code review to check for the possibility of second-order attacks.
Ensure that error messages give nothing away about the internal architecture of the application or the database.
Again... be very, VERY careful when you implement this!
ADDITIONAL READING
Wikipedia: SQL injection attack
The Curse and Blessings of Dynamic SQL
SQL Injection Attacks and Some Tips on How to Prevent Them
Dynamic Search Conditions in T‑SQL
Top 10 tricks to exploit SQL server systems

You're right that this isn't safe. There is no safe way of doing exactly what you want. Arbitrary SQL filters means the user could equally use a WHERE id IN (SELECT x FROM OPENROWSET(...)) selection, for instance, still allowing DROP TABLE executions.
What you can do is provide your own filter syntax, and using your own parser for that syntax, translate it to SQL. You can make sure only to allow features that are safe to use from SQL. Some ORMs may provide such a feature out of the box, otherwise you'll have to create something yourself.

I'd consider this feature a no-go as it is designed: Not only is the user expected to have knowledge about how the database works, he also needs to know both the correct syntax and the data model.
Try to mask this by providing predefined conditions, such as "where the user name contains..." or "the last name is".
On the C# side, use parameterized queries so that user-provided input is passed to the database as data rather than as part of the SQL text.
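
As a rough sketch of predefined conditions combined with a parameterized query: the filter labels, the `ContactFilters` class, and the `FirstName`/`LastName` columns below are assumptions for illustration, and the `@value` parameter syntax assumes SQL Server. The user picks a filter label and types a value; only the value is taken from the user, and it is attached as a parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical sketch: filter labels and column names are illustrative only.
public static class ContactFilters
{
    private sealed class Filter
    {
        public readonly string Sql;
        public readonly bool WrapInWildcards;

        public Filter(string sql, bool wrapInWildcards)
        {
            Sql = sql;
            WrapInWildcards = wrapInWildcards;
        }
    }

    // Each predefined filter maps to a fixed SQL fragment with one parameter.
    private static readonly Dictionary<string, Filter> Predefined =
        new Dictionary<string, Filter>
        {
            { "First name contains", new Filter("FirstName LIKE @value", true) },
            { "Last name is", new Filter("LastName = @value", false) }
        };

    // Configures 'command' for the chosen filter. The user's text is only
    // ever attached as a parameter value, never concatenated into the SQL.
    public static void Apply(IDbCommand command, string filterName, string userText)
    {
        Filter filter;
        if (!Predefined.TryGetValue(filterName, out filter))
            throw new ArgumentException("Unknown filter: " + filterName);

        command.CommandText = "SELECT * FROM Contacts WHERE " + filter.Sql;

        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = "@value";
        parameter.DbType = DbType.String;
        parameter.Value = filter.WrapInWildcards ? "%" + userText + "%" : userText;
        command.Parameters.Add(parameter);
    }
}
```

A caller would create a command from its own connection, call `ContactFilters.Apply(command, "First name contains", textBox.Text)`, and then execute the reader. Wildcard characters typed by the user still act as LIKE wildcards, but they cannot terminate the statement or add a second one.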

This is impossible to answer. You want a search engine that will allow SQL conditions and this is the definition of SQL injection.
Here are a few workaround ideas:
Use a grid control: You can display the query results inside a grid and let the user filter the results with the grid itself. This is easy to implement (you don't reinvent the wheel) and it is user friendly. Moreover, some grids offer very powerful filtering options. The only drawback I can see is that you'll often retrieve many more results from the database than you actually need.
Create your own filter syntax: You can code your own filter syntax (hvd's solution). This is going to be a lot of work, and if you miss something you might still end up with a security hole.
Code a condition-building tool: You can provide a tool for building conditions. This is very user friendly, but in the end it may not be flexible enough.
Export to CSV: You can offer a tool that exports the query results to CSV for easy use in Excel or Calc. This would be very user friendly for users who are experienced with spreadsheet applications.

How to reduce database hits in linq to sql

I have a scenario and I want to know the best practices for reducing database hits. I have a dictionary table in my application where I put all the words/keywords for translation purposes, because my system is multilingual.
Keywords are placed all over the page. There can be 10 to 20 on one page, and for each word the translation is fetched from the database if the user is not viewing the English version of the website.
My application is on ASP.NET MVC 2 with C# and LINQ to SQL.

Caching is a good way to reduce database queries. There are 2 levels of cache you could use:
Cache objects (for example results of database queries)
Cache HTML output of entire controller actions or partials

Translations typically don't change very often and the amount of data is limited. Read all translated strings when the web app starts and put them in a globally accessible Dictionary. Whenever you need a translated string, look it up in the dictionary.
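
A minimal sketch of that startup cache, under some assumptions: the `TranslationCache` and `TranslationRow` names are made up here, and the rows are assumed to come from a single LINQ to SQL query over the dictionary table that the caller runs once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: class names and row shape are illustrative only.
public sealed class TranslationRow
{
    public string Language;
    public string Keyword;
    public string Text;
}

public static class TranslationCache
{
    // Replaced wholesale on load, so readers never see a half-filled dictionary.
    private static volatile Dictionary<string, string> _translations =
        new Dictionary<string, string>();

    // Call once at startup (for example from Application_Start) with every
    // row of the translation table, fetched in a single query.
    public static void Load(IEnumerable<TranslationRow> rows)
    {
        var fresh = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (TranslationRow row in rows)
            fresh[MakeKey(row.Language, row.Keyword)] = row.Text;

        _translations = fresh;
    }

    // Falls back to the keyword itself when no translation exists.
    public static string Translate(string language, string keyword)
    {
        string text;
        return _translations.TryGetValue(MakeKey(language, keyword), out text)
            ? text
            : keyword;
    }

    private static string MakeKey(string language, string keyword)
    {
        return language + "|" + keyword;
    }
}
```

With this in place, a page with 20 keywords does 20 in-memory lookups and no database queries. Calling `Load` again after the translations are edited refreshes the cache.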

LINQ to SQL uses deferred execution, so a query doesn't hit the database until you enumerate its results, and related entities are lazy loaded when you first access them. Make sure you don't enumerate queries or touch those properties before you really need to.
You could also try to combine several LINQ queries into one, and have a look at your loops to make sure there isn't a better way to cycle through your queries.
You should also be able to remove database access altogether by keeping the translations in XML files rather than in a database.

Before you can do things like caching and lazy loading, etc... it's best to figure out WHAT is going wrong.
Enter LinqToSql Profiler. Yes, it's a commercial product, but it's worth it. It also has a demo period.
It can show you the badly performing queries, which queries are doing N+1, and so on.

SQL query builder WPF

I need to give my customers an ability to select virtually any data from the database. They are regular users, don't know SQL, know nothing about tables, relations etc.
Is there some component/tool with simple GUI that I can customize for my database structure?
Right now I need it for WPF project, but I'm also interested in ASP.NET tool for future.
Thanks

You could provide your users with a flattened view of the data that denormalizes across relationships (since regular users don't 'get' relationships) and let them choose which columns they want to see and which values they want to filter on.

Try EasyQuery.NET.
It seems they have released a WPF version recently (I'm subscribed to their newsletter).

Method of simulating views on a SQL Server with read-only access?

I'm trying to do quite a lot of querying against a Microsoft SQL Server that I only have read-only access to. My queries need to work with the data in a very different structure from the DB architecture. Since I only have read access, I can't create views, so I'm looking for a solution.
What I'm currently doing is using complex queries to return the results as I need them, but this means 4-5 table joins with subqueries. It is ridiculously slow and resource intensive.
I can see two solutions, but would love to hear about anything I might have missed:
Use some sort of "proxy" that caches the data and creates views around it. This would need some method of determining the dirtiness of the data. (Is there something like this?)
Run my own SQL Server, mirror the data from the source SQL Server every X minutes, and then create views on my SQL Server.
Any other ideas, or recommendations on these ideas?
Thanks!

Here are some options for you:
Replication
Set up replication to move the data to your own SQL Server and create any views you need over there. An administrator has to set this up. If you need to see the data as it changes, use Transactional Replication. If not, you can do snapshots.
Read more here: http://technet.microsoft.com/en-us/library/ms151198.aspx
New DB on same instance
Get a new database MyDB on the same server as your ProductionDB with WRITE access for you. Create your views there.
Your view creation can look like this:
USE MyDB
GO
CREATE VIEW DBO.MyView
AS
SELECT Column1, Column2, Column3, Column4
FROM ProductionDB.dbo.TableName1 t1
INNER JOIN ProductionDB.dbo.TableName2 t2
ON t1.ColX = T2.ColX
GO
Same instance rather than a different instance on the same server: I would suggest creating MyDB on the same instance of SQL Server as ProductionDB rather than installing a new instance. Running multiple instances of SQL Server on a single machine is much more expensive in terms of resources than adding a new database to an existing instance.
Standard Reusable Views
Create a set of standardized views and ask the administrators to put them on the read only server for you and reuse those views in queries

You can also use a CTE, which can act like a view.
I would go for that if Raj More's #2 suggestion does not work for you...
WITH myusers (userid, username, password)
AS
(
-- this is where the definition of view would go.
select userid, username, password from Users
)
select * from myusers

If you can create a new database on that server you can create the views in the new database. The views can access the data using a three part name. E.g. select * from OtherDB.dbo.Table.
If you have access to another SQL Server, the DBA can create a "Linked Server". You can then create views that access the data using a four-part name. E.g. select * from OtherServer.OtherDB.dbo.Table
In either case, the data is always "live", so no need to worry about dirty data.
The views will bring you cleaner code, a single location to make changes, and a few milliseconds of performance benefit from cached execution plans. However, you shouldn't expect any great performance leaps. You mention caching, but as far as I know, the server does not do any particular data caching for ordinary, non-indexed views that it wouldn't do for ad hoc queries.
If you haven't already done so, you may wish to do experiments to see if the views are actually faster--make a copy of the database and add the views there.
Edit: I did a similar experiment today. I had a stored proc on Server1 that was getting data from Server2 via a Linked Server. It was a complex query, joining many tables on both servers. I created a view on Server2 that got all of the data that I needed from that server, and updated the proc (on Server1) so that it used that view (via a Linked Server) and then joined the view to a bunch of tables that were on Server1. It was noticeably faster after the update. The reason seems to be that Server1 was misestimating the number of rows that it would get from Server2, and thus building a bad plan. It estimated better when using a view. It didn't matter whether the view was in the same database as the data it was reading, it just had to be on the same server (I only have one instance, so I don't know how instances would have come into play).
This particular scenario would only come into play if you were already using Linked Servers to get the data, so it may not be relevant to the original question, but I thought it was interesting since we're discussing the performance of views.

You could ask the DBA to create a schema for people like you (say, "Contractors") and allow you to create objects inside that schema only.

I would look at the query plan in Management Studio and see if you can tell why it's not performing well. Maybe you need to rewrite your query. You might also make use of table variables as temporary tables to store intermediate results, if that helps. Just make sure you're not storing a lot of records in them. You can run multiple statements in a batch like this:
DECLARE @tempTable TABLE
(
col1 int,
col2 varchar(250)
)
INSERT INTO @tempTable (col1, col2)
SELECT a, b
FROM SomeTable
WHERE a < 100 ... /* some complex query */
SELECT *
FROM OtherTable o
INNER JOIN @tempTable T
ON o.col1 = T.col1
WHERE ...

By using views, your queries would not perform better. You need to tune those queries, and probably some indexes should be made on those tables, to support your queries.
If you cannot get access to the database, in order to create those indexes, you can "cache" the data in a new database you create, and tune your queries in this new one. And of course, you will have to implement some synchronization, to keep the cached data up to date.
This way you won't see the changes made to the original database immediately (there will be a latency), but you can get your queries perform a lot faster, and you can even create those views, if you wish.

How to prevent Sql-Injection on User-Generated Sql Queries

I have a project (private, ASP.net website, password protected with https) where one of the requirements is that the user be able to enter Sql queries that will directly query the database. I need to be able to allow these queries, while preventing them from doing damage to the database itself, and from accessing or updating data that they shouldn't be able to access/update.
I have come up with the following rules for implementation:
Use a db user that only has permission for Select Table/View and Update Table (thus any other commands like drop/alter/truncate/insert/delete will just not run).
Verify that the statement begins with the words "Select" or "Update"
Verify (using Regex) that there are no instances of semi-colons in the statement that are not surrounded by single-quotes, white space and letters. (The thought here is that the only way that they could include a second query would be to end the first with a semi-colon that is not part of an input string).
Verify (using Regex) that the user has permission to access the tables being queried/updated, included in joins, etc. This includes any subqueries. (Part of the way that this will be accomplished is that the user will be using a set of table names that do not actually exist in the database, part of the query parsing will be to substitute in the correct corresponding table names into the query).
Am I missing anything?
The goal is that the users be able to query/update tables to which they have access in any way that they see fit, and to prevent any accidental or malicious attempts to damage the db. (And since a requirement is that the user generate the sql, I have no way to parametrize the query or sanitize it using any built-in tools that I know of).

This is a bad idea, and not just from an injection-prevention perspective. It's really easy for a user that doesn't know any better to accidentally run a query that will hog all your database resources (memory/cpu), effectively resulting in a denial of service attack.
If you must allow this, it's best to keep a completely separate server for these queries, and use replication to keep it pretty close to an exact mirror of your production system. Of course, that won't work with your UPDATE requirement.
But I want to say again: this just won't work. You can't protect your database if users can run ad hoc queries.

What about something like this? Imagine the SELECT is an EXEC; the hex literal decodes to 'drop table a':
select convert(varchar(50),0x64726F70207461626C652061)

My gut reaction is that you should focus on setting the account privileges and grants as tightly as possible. Look at your RDBMS security documentation thoroughly, there may well be features you are not familiar with that would prove helpful (e.g. Oracle's Virtual Private Database, I believe, may be useful in this kind of scenario).
In particular, your idea to "Verify (using Regex) that the user has permission to access the tables being queried/updated, included in joins, etc." sounds like you would be trying to re-implement security functionality already built into the database.

Well, you already have enough people telling you "don't do this", so if they aren't able to dissuade you, here are some ideas:
INCLUDE the Good, Don't try to EXCLUDE the bad
(I think the proper terminology is Whitelisting vs Blacklisting )
By that, I mean don't look for evil or invalid stuff to toss out (there are too many ways it could be written or disguised), instead look for valid stuff to include and toss out everything else.
You already mentioned in another comment that you are looking for a list of user-friendly table names, and substituting the actual schema table names. This is what I'm talking about--if you are going to do this, then do it with field names, too.
I'm still leaning toward a graphical UI of some sort, though: select tables to view here, select fields you want to see here, use some drop-downs to build a where clause, etc. A pain, but still probably easier.

What you're missing is the ingenuity of an attacker finding holes in your application.
I can virtually guarantee that you won't be able to close all the holes if you allow this. There might even be bugs in the database engine that you don't know about but attackers do, bugs that allow a SQL statement you deem safe to wreak havoc in your system.
In short: This is a monumentally bad idea!

As the others indicate, letting end users do this is not a good idea. I suspect the requirement isn't really that the user needs ad hoc SQL, but rather a way to get and update data in ways not initially foreseen. To allow queries, do as Joel suggests and keep a "read only" database, but use a reporting application such as Microsoft Reporting Services or Data Dynamics ActiveReports to let users design and run ad hoc reports. Both, I believe, have ways to present users with a filtered view of "their" data.
For the updates, it is more tricky- I don't know of existing tools to do this. One option may be to design your application so that developers can quickly write plugins to expose new forms for updating data. The plugin would need to expose a UI form, code for checking that the current user can execute it, and code for executing it. Your application would load all plugins and expose the forms that a user has access to.

Even seemingly secure technology like Dynamic LINQ is not safe from code injection issues, and you are talking about providing low-level access.
No matter how hard you sanitize queries and tune permissions, it probably will still be possible to freeze your DB by sending over some CPU-intensive query.
So one of the "protection options" is to show a message box saying that all queries that access restricted objects or cause bad side effects will be logged against the user's account and reported to the admins immediately.
Another option - just try to look for a better alternative (i.e. if you really need to process & update data, why not expose API to do this safely?)

One (maybe overkill) option could be to use a compiler for a reduced SQL language. For example, use JavaCC with a modified SQL grammar that only allows SELECT statements; when you receive a query, compile it, and if it compiles you can run it.
For C#, I know of Irony but have never used it.

You can do a huge amount of damage with an update statement.
I had a project similar to this, and our solution was to walk the user through a very annoying wizard allowing them to make the choices, but the query itself is constructed behind the scenes by the application code. Very laborious to create, but at least we were in control of the code that finally executed.

The question is, do you trust your users? If your users have had to log into the system, you are using HTTPS, and you have taken precautions against XSS attacks, then SQL injection is a smaller issue. Running the queries under a restricted account ought to be enough if you trust the legitimate users. For years I've been running MyLittleAdmin on the web and have yet to have a problem.
If you run under a properly restricted SQL Account select convert(varchar(50),0x64726F70207461626C652061) won't get very far and you can defend against resource hogging queries by setting a short timeout on your database requests. People could still do incorrect updates, but then that just comes back to do you trust your users?
You are always taking a managed risk attaching any database to the web, but then that's what backups are for.

If they don't have to perform really advanced queries, you could provide a UI that only allows certain choices, such as a drop-down list with "update, delete, select", after which the next drop-down list is automatically populated with the available tables, and so on, similar to the query builder in SQL Server Management Studio.
Then in your server-side code you would convert these groups of UI elements into SQL statements and use a parameterized query to stop malicious content.

This is a terribly bad practice. I would create a handful of stored procedures to handle everything you'd want to do, even the more advanced queries. Present them to the user, let them pick the one they want, and pass your parameters.
The answer above mine is also extremely good.

Although I agree with Joel Coehoorn and SQLMenace, some of us do have "requirements". Instead of having them send ad hoc queries, why not create a visual query builder, like the ones found in the MS sample applications at asp.net, or try this link.
I am not against the points made by Joel. He is correct. Having users (remember we are talking about users here, they couldn't care less about what you want to enforce) throw queries at the database is like an app without a "Business Logic Layer", not to mention the additional questions to be answered when certain results do not match the results from other supporting applications.

Here is another example.
The hacker doesn't need to know the real table name; he/she can run undocumented procs like this:
sp_msforeachtable 'print ''?'''
Just imagine a drop in place of the print.

Plenty of answers say that it's a bad idea, but sometimes that's what the requirements insist on. There is one gotcha that I haven't seen mentioned in the "if you have to do it anyway" suggestions, though:
Make sure that any update statements include a WHERE clause. It's all too easy to run
UPDATE ImportantTable
SET VitalColumn = NULL
and miss out the important
WHERE UserID = @USER_NAME
If an update is required across the whole table then it's easy enough to add
WHERE 1 = 1
Requiring the where clause doesn't stop a malicious user from doing bad things but it should reduce accidental whole table changes.
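
A deliberately naive sketch of such a check, assuming a hypothetical `UpdateGuard` helper that the application calls before running a user-supplied statement. It does not parse SQL, so a WHERE inside a subquery or a string literal also satisfies it; as the answer says, it only guards against accidents.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sketch: a guard against accidental whole-table updates only.
public static class UpdateGuard
{
    private static readonly Regex StartsWithUpdate =
        new Regex(@"^\s*UPDATE\b", RegexOptions.IgnoreCase);

    private static readonly Regex ContainsWhere =
        new Regex(@"\bWHERE\b", RegexOptions.IgnoreCase);

    // True when the statement is not an UPDATE, or is an UPDATE that
    // mentions WHERE somewhere in its text.
    public static bool HasRequiredWhere(string sql)
    {
        if (sql == null) throw new ArgumentNullException("sql");
        return !StartsWithUpdate.IsMatch(sql) || ContainsWhere.IsMatch(sql);
    }
}
```

With this helper, `UPDATE ImportantTable SET VitalColumn = NULL` is rejected, while the same statement followed by `WHERE 1 = 1` is accepted, which forces the user to state explicitly that a whole-table change is intended.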
