I'm writing an ASP.NET MVC4 application which ultimately builds a SQL SELECT statement dynamically, to be stored and executed at a later time. The structure of the dynamic SQL is determined by user configuration in a user-friendly manner, with standard checkboxes, dropdowns, and freeform-entry textboxes.

It would be simple enough to validate the input and build the SQL string using parameterized queries, except that I need to allow advanced users the ability to enter custom SQL to be injected directly into the SELECT and WHERE clauses.

So what techniques can I use to cleanse the custom SQL expressions or otherwise guard against unwanted input from a clever user? I can easily enough parse the string for suspect keywords and blacklist insert/update/delete/etc., but something tells me that's not going to protect me 100%.
I'm happy to provide more details about exactly what I'm doing here, but I'm not sure what other details would be helpful, since I feel like my problem, while probably not common, is pretty generic.
Contrary to other views, it is possible to do this safely (just look at Data Explorer). Here are the four things you can do to make it happen:
Account Security
SQL Server will let you restrict the permissions available to the account used to make the connection to the database. Only grant read permissions, and only on the appropriate tables. Then someone could inject malicious code 'til they're blue in the face, but it will fail when the statement is compiled because they don't have enough permissions. That may mean using a different connection string for this access than for other parts of the application.
Note that this is good for SQL code, but there are other kinds of injection as well. If you ever show the query (in full or in part) back to the user before running it, you should also watch out for JavaScript injection attacks such as cross-site scripting (XSS).
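A minimal sketch of how that separate, least-privilege connection might be wired up in the application (the "ReportingReadOnly" connection-string name and the factory class are illustrative, not something prescribed here):

```csharp
// Keep a dedicated, read-only connection string for the user-built queries,
// separate from the main application connection string.
using System.Configuration;
using System.Data.SqlClient;

public static class ReportingConnectionFactory
{
    public static SqlConnection Create()
    {
        // "ReportingReadOnly" is a hypothetical name; the account behind it
        // would be granted SELECT on the reporting tables only.
        string cs = ConfigurationManager
            .ConnectionStrings["ReportingReadOnly"].ConnectionString;
        return new SqlConnection(cs);
    }
}
```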
Query Governor
You also want to defend against denial-of-service attacks. SQL databases make it easy for these to happen even by accident, simply by constructing an inefficient query. To combat this threat, you should take a look at the query governor feature in SQL Server. Note that tuning this thing is tricky.
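As an application-side complement to the server-level governor (not a replacement for it), you can also cap how long any user-built query is allowed to run. A sketch, with an illustrative timeout value:

```csharp
// Not the query governor itself (that is a server-level setting a DBA
// configures); this just ensures a runaway report query gets cancelled.
using System.Data;
using System.Data.SqlClient;

public static class ReportRunner
{
    public static DataTable Run(SqlConnection connection, string sql)
    {
        using (var command = new SqlCommand(sql, connection))
        {
            command.CommandTimeout = 30; // seconds; tune to your workload

            var table = new DataTable();
            using (var adapter = new SqlDataAdapter(command))
            {
                adapter.Fill(table); // opens/closes the connection if needed
            }
            return table;
        }
    }
}
```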
Quarantine
The safest course is to also use a dedicated, sanitized reporting database, hosted on a dedicated server. This ensures that no query can impact production, either in terms of performance or a breached server or account. There are features in sql server such as SSIS that you can use to automate populating your reporting DB from production.
Use a Product
One of the things about security is that you never want to find yourself building your own secured system. It's easy to create something that seems to work and passes tests, but is flawed in subtle ways that lead to breaches later. You want to rely on a product from a vendor that does this kind of thing as a core competency. This means it's battle tested (inevitable bugs are found and fixed), and if there is a flaw, it is likely to surface in someone else's deployment before it bites you. It also means the vendor can provide support and a level of indemnification. Normally I talk about this in terms of authentication systems, but the rule applies to SQL injection defense as well.
In this case, it might be worth checking out Stack Overflow's Data Explorer. This is a tool to allow the construction of arbitrary queries by untrusted users. The project is open source, so you can see for yourself what they've done to ensure safety, or even just fork that project for your own use. It's worth mentioning again that a big part of this tool's safety is that it is intended for use on a dedicated, sanitized database, so it does not exempt you from the other items.
When all is said and done, though, I think the comment to set up some views and provide access via reporting services is probably your best bet.
I think you can only achieve what you need by not providing a direct way to specify SQL.
That doesn't have to be a problem, however. Basically you only need a subset of SQL for defining expressions (computed columns in the SELECT) and predicates (for filtering in the WHERE). This can be handled with a self-made parser which has just enough power to do everything needed, but then "compiles" the expressions into real SQL code, without actually injecting anything directly.
This then validates the syntax and makes sure that no constructs can be fabricated which would allow for any injection. Also, it allows you to properly let the code generator parametrize the queries you generate as needed (for instance, all literals could be passed in as query parameters by your code generator).
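A deliberately tiny sketch of that idea, assuming a whitelist of known columns and operators (the column names are illustrative); a real implementation would use a proper tokenizer/AST and reject any input containing characters outside the grammar:

```csharp
// "Compile, don't inject": only whitelisted columns and operators are
// accepted, and every literal is emitted as a query parameter.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public class PredicateCompiler
{
    private static readonly HashSet<string> Columns =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "Name", "CreatedOn", "Amount" };          // illustrative whitelist

    private static readonly HashSet<string> Operators =
        new HashSet<string> { "=", "<>", "<", ">", "<=", ">=", "AND", "OR" };

    // Accepts input like: Amount > 100 AND Name = 'Smith'
    public string Compile(string input, SqlCommand command)
    {
        var sql = new StringBuilder();
        int paramIndex = 0;
        var tokens = Regex.Matches(input, @"'[^']*'|\d+(\.\d+)?|[A-Za-z_]+|[<>=]+");

        foreach (Match token in tokens)
        {
            string t = token.Value;
            if (Columns.Contains(t))
            {
                sql.Append('[').Append(t).Append("] ");
            }
            else if (Operators.Contains(t.ToUpperInvariant()))
            {
                sql.Append(t.ToUpperInvariant()).Append(' ');
            }
            else if (t.StartsWith("'") || char.IsDigit(t[0]))
            {
                // Literal: never concatenated into the SQL, always parameterized.
                string name = "@p" + paramIndex++;
                object value = t.StartsWith("'")
                    ? (object)t.Trim('\'')
                    : decimal.Parse(t, CultureInfo.InvariantCulture);
                command.Parameters.AddWithValue(name, value);
                sql.Append(name).Append(' ');
            }
            else
            {
                throw new ArgumentException("Unrecognized token: " + t);
            }
        }
        // A production version should also reject input containing characters
        // that the token grammar did not match, instead of silently dropping them.
        return sql.ToString().Trim();
    }
}
```

The generated fragment (e.g. `[Amount] > @p0 AND [Name] = @p1`) would then be spliced into a fixed SELECT template, with the literal values already attached to the command as parameters.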
I would never, ever let users run custom SQL statements against a production server.
Bad actors could attempt to read, write, or modify tables they are not supposed to touch, or compromise weak accounts and run arbitrary SQL statements against your server.
If you have a pre-built system which creates the SQL statement from the HTML inputs, then you can generate the script on the back end in a way that will not hurt your database. Otherwise, if users want something very custom, let them upload their script, have your SQL administrator review it and confirm it is "safe to execute", and then schedule it to run.
If a script is denied, let the user know and explain why their custom script was rejected.
Use stored procedures, so anything that comes from a user is passed in as a parameter and never executed as code.
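A minimal sketch of that pattern, with a hypothetical procedure name; the user's input only ever travels as a parameter value, never as SQL text:

```csharp
using System.Data;
using System.Data.SqlClient;

public static class AccountQueries
{
    public static DataTable GetAccountByName(string connectionString, string userInput)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("usp_GetAccountByName", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@Name", userInput); // data, not code

            var table = new DataTable();
            new SqlDataAdapter(command).Fill(table);
            return table;
        }
    }
}
```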
OK guys, another question of mine that seems to be widely asked and fairly generic. I have an accounts table in my DB. On the client (a desktop WinForms app) I have the corresponding functionality to add a new account; in the UI it's a couple of textboxes and one button.
Another requirement is account uniqueness, so I can't add two identical accounts. My question is: should I check for the account's existence on the client (by running a query and looking at the result), or should I make a stored procedure for adding new accounts and check for existence there? To me it seems better to just make a stored procedure, where I can perform any needed checks and only then add the new account. But there are pros and cons to that approach. For example, it will be very difficult to manage the language of the messages that the stored procedure produces.
POST EDIT
I already have the database constraints, etc. The issue is how to handle the situation where a user tries to add an account that already exists.
POST EDIT 2
Account uniqueness is just a tiny, simple example of business logic. My question is more about handling complicated business logic in that accounts domain.
So, how should I handle this situation?
I believe that my question is basic and has a proven solution. My tools are C# and .NET Framework 2.0. Thanks in advance, guys!
If the application is to be multi-user (i.e. not just a single desktop app with a single user, but a centralised DB with the app acting as a client, possibly on many workstations), then it is not safe to rely on the client (app) to check for things such as uniqueness, existence, free numbers, etc., as there is a distinct possibility of change happening between calls (unless read locking is used, but this often becomes more of a problem than a help!).
There is of course the option to pre-check and then re-check (pre at app level, re at DB level), but this generates extra DB traffic, so it depends on whether that is a problem for you.
When I write SPROCs that will return to an app, I always use the same framework - I include parameters for a return code and a message and always populate them. Then I can use standard routines to call them and even add the parameters automatically. I can then either display the message directly on failure, or use the return code to localize it as required (or automate a response). I know some DBs (like SQL Server) will return Return_Code parameters, but I implement my own so I can leave the built-in ones for serious system-based errors and unexpected failures. It also allows me to have my own numbering system for return codes (i.e. grouping them to match enums in the code and/or grouping by severity).
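A sketch of what the calling side of that framework might look like (the procedure and parameter names are illustrative, and the connection is assumed to be open):

```csharp
using System.Data;
using System.Data.SqlClient;

public static class AccountProcedures
{
    // Every procedure exposes @ReturnCode and @Message output parameters,
    // which the app reads and maps to a localized message or an enum.
    public static void AddAccount(SqlConnection connection, string accountName,
                                  out int returnCode, out string message)
    {
        using (var command = new SqlCommand("usp_AddAccount", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@AccountName", accountName);

            SqlParameter codeParam = command.Parameters.Add("@ReturnCode", SqlDbType.Int);
            codeParam.Direction = ParameterDirection.Output;

            SqlParameter messageParam = command.Parameters.Add("@Message", SqlDbType.NVarChar, 500);
            messageParam.Direction = ParameterDirection.Output;

            command.ExecuteNonQuery();

            returnCode = (int)codeParam.Value;
            message = (string)messageParam.Value;
        }
    }
}
```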
On web apps I have also used a different concept at times. For example, sometimes a request is made for a new account but multiple pages are required (a profile, for example). Here I often use a header table that generates a hidden user ID against the requested unique username, a timestamp, and some way of recognising them (IP address, etc.). If after x hours it is not used, the header table deletes the row, freeing up the number (depending on the DB the number may never become usable again - this doesn't really matter, as it is just used to keep the user data unique until the application is submitted) and the username. If completed correctly, the records are simply copied across to the proper active tables.
//Edit - To Add:
Good point. But account uniqueness is just a very tiny, simple example. What about more complex business-logic requirements for accounts? If I implement it just in client code (in the WinForms app) it will work fine, but if I want another kind of app (say a console version, or a website) to work with these accounts, I would have to implement all that logic again in the new app! So I'm looking for some method to keep the data right from both sides (the server DB side and the client side). – kseen yesterday
If the requirement is ever for multi-use, then it is best to separate it. Putting it into a separate Class Library project allows the DLL to be used by your WinForms app, console program, service, etc. Although I would still prefer rock-face validation (at DB level), as it is the closest point in time to any action and the least likely to be gazumped.
The usual way is to separate into three projects: a Display Layer [DL] (your WinForms project/console/service/etc.), a Business Application Layer [BAL] (which holds all the business rules and the calls to the DAL - it knows nothing about the display medium nor about the database technology), and finally the Data Access Layer [DAL] (this has all the database calls - it can be very basic, with a method for insert/update/select/delete at SQL and SPROC level and maybe some classes for passing data back and forth). The DL references only the BAL, which references the DAL. The DAL can be swapped out for each technology (say, a change from SQL Server to MySQL) without affecting the rest of the application, and business rules can be changed and set in the BAL with no effect on the DAL (the DL may be affected if new methods are added or display requirements change due to data changes, etc.). This framework can then be used again and again across all your apps, and it makes quite drastic changes (like DB topology) easy to accommodate.
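A bare-bones sketch of that split, with illustrative names; the business rule lives once in the BAL and every front end reuses it:

```csharp
using System;

// DAL contract: knows only about storage.
public interface IAccountDataAccess
{
    bool AccountExists(string accountName);
    void InsertAccount(string accountName);
}

// BAL: holds the business rules; knows nothing about WinForms or SQL Server.
public class AccountService
{
    private readonly IAccountDataAccess _dataAccess;

    public AccountService(IAccountDataAccess dataAccess)
    {
        _dataAccess = dataAccess;
    }

    public void AddAccount(string accountName)
    {
        // Business rule lives here once, shared by WinForms, console, web, ...
        if (_dataAccess.AccountExists(accountName))
            throw new InvalidOperationException("Account already exists.");

        _dataAccess.InsertAccount(accountName);
    }
}

// The DL (WinForms/console/etc.) only ever talks to AccountService.
```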
This type of logic is usually kept in code for easier maintenance (which includes testing). However, if this is just a personal throwaway application, do what is simplest for you. If it's something that is going to grow, it's better to put good practices in place now, to ease maintenance/change later.
I'd have an AccountsRepository class (for example) with an AddAccount method that did the insert/called the stored procedure. Using database constraints (as HaLaBi mentioned), it would fail when trying to insert a duplicate. You would then decide how to handle this issue (passing a message back to the UI that it couldn't add) in code. This allows you to put tests around all of it. The only change you make in the DB is to add the constraint.
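A sketch of that repository approach, assuming a unique constraint on the account name; SQL Server reports unique-constraint and unique-index violations as error numbers 2627 and 2601 (the table name and SQL are illustrative):

```csharp
using System.Data.SqlClient;

public class AccountsRepository
{
    private readonly string _connectionString;

    public AccountsRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    // Returns false when the account already exists, so the UI can show a message.
    public bool AddAccount(string accountName)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand(
            "INSERT INTO Accounts (Name) VALUES (@Name)", connection))
        {
            command.Parameters.AddWithValue("@Name", accountName);
            connection.Open();
            try
            {
                command.ExecuteNonQuery();
                return true;
            }
            catch (SqlException ex)
            {
                if (ex.Number == 2627 || ex.Number == 2601)
                    return false; // duplicate: the unique constraint did the check
                throw;
            }
        }
    }
}
```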
Just my 2 cents on a Thursday morning (before my cup of green tea). :)
I think the answer - like many - is "it depends".
For sure it is a good thing to push logic as deeply as possible towards the database. This prevents bad data getting in no matter how the user tries.
This, in simple terms, results in applications that TRY - FAIL - RECOVER when attempting an invalid transaction. You need to check each call (stored proc, triggered insert, etc.) and IF something bad happens, recover from that condition. Usually something like telling the user an issue occurred, resetting the form, and letting them try again.
I think, at a minimum, this needs to happen.
But in addition, to make a really nice experience for the user, the app should also preemptively check certain data conditions ahead of time and simply prevent the user from making bad inserts in the first place.
This is of course harder, and sometimes means double coding of business rules (once in the app, and once in the DB constraints), but it can make for a dramatically better user experience.
The solution is more of being methodical than technical:
Implement - "Defensive Programming" & "Design by Contract"
If the chances of a business rule changing over time are very low, then apply the constraint at the database level
Create a "validation or rules & aggregation layer (or class)" that will manage such conditions/constraints for entity and/or specific property
A much smarter way to do this would be to make a user-control for the entity and/or specific property (in your case the "Account-Code"), which would internally use the "validation or rules & aggregation layer (or class)"
This will allow you to ensure a "systematic-way-of-development" or a more "scalable & maintainable" application-architecture
If your application is a website, then along with placing validation on the client side it is always better to have validation in the business layer or C# code as well
Whenever a validation fails you could implement & use a "custom-error-message" library, to ensure message content is standard across the application
If the errors are raised from the database itself (i.e., from stored procedures), you could use the same "custom-error-message" class for converting the SQL exception to the fixed or standardized message format (a small sketch of this follows below)
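A small sketch of such a "custom-error-message" helper, with illustrative error numbers and wording; it maps known SQL Server error numbers to the application's standard messages so the wording stays consistent whether a failure comes from C# validation or from a stored procedure:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

public static class ErrorMessages
{
    private static readonly Dictionary<int, string> SqlErrorMap =
        new Dictionary<int, string>
        {
            { 2627, "This record already exists." },                              // unique constraint violation
            { 547,  "This record is referenced elsewhere and cannot be changed." } // constraint (e.g. FK) conflict
        };

    public static string FromSqlException(SqlException ex)
    {
        string message;
        if (SqlErrorMap.TryGetValue(ex.Number, out message))
            return message;
        return "An unexpected database error occurred.";
    }
}
```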
I know that this is all a bit much, but it will always be good for the future.
Hope this helps.
Since you should not depend on a specific storage provider (DB [MySQL, MSSQL, ...], flat file, XML, binary, cloud, ...) in a professional project, all constraints should be checked in the business logic (model).
The model shouldn't have to know anything about the storage provider.
Uncle Bob said something about architecture and databases: http://blog.8thlight.com/uncle-bob/2011/11/22/Clean-Architecture.html
I'm just starting to work on a logging library that everyone can use to keep track of any sort of system information while the user is running our application. The simplest example so far is to track Info, Warnings, and Errors.
I want all plugins to be able to use this feature, but since each developer might have a different idea of what's important to report, I want to keep this as generic as possible.
In the C++ world, I would normally use something like a std::pair<string,string> to act as a key/value pair structure, and have a std::list of these to act as a "row" in the log. The log cache would then be a list<list<pair<string,string>>> (ugh!). This way, the developers can use a const string key like INFO, WARNING, ERROR to have consistent naming for a column in the database (for SELECTing specific types of information).
I'd like the database to be able to deal with any number of distinct column names. For example, John might have an INFO row with a column called USER, and Bill might have an INFO row with a column called FILENAME. I want the log viewer to be able to display all information, and if one report doesn't have a value for INFO / FILENAME, those fields should just appear blank. So one option is to use List<List<KeyValuePair<String,String>>>, and another is to have the log library consumer somehow "register" its schema, and then have the database do an ALTER TABLE to handle this situation. Yet another idea is to have a table that's just for key/value pairs, with a foreign key that maps the key/value pairs back to the original log entry.
I obviously don't want logging to bog down the system, so I only lock the log cache to make a copy of the data (and remove the already-copied data), then a background thread will dump the information to the database.
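In rough terms, the cache-and-flush mechanism described above might look like this (the names are illustrative, and a Dictionary<string, string> stands in for the key/value rows):

```csharp
using System.Collections.Generic;

public class LogCache
{
    private readonly object _sync = new object();
    private List<Dictionary<string, string>> _entries =
        new List<Dictionary<string, string>>();

    public void Add(Dictionary<string, string> entry)
    {
        // Callers only hold the lock long enough to append.
        lock (_sync)
        {
            _entries.Add(entry);
        }
    }

    // Called from a background thread (e.g. a timer every few seconds).
    public void Flush()
    {
        List<Dictionary<string, string>> toWrite;
        lock (_sync)
        {
            toWrite = _entries;
            _entries = new List<Dictionary<string, string>>(); // swap, don't copy
        }

        foreach (var entry in toWrite)
        {
            // Write to the database here; omitted in this sketch.
        }
    }
}
```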
My specific questions regarding this are:
Do you see any performance issues? In other words, have you ever tried something like this and found that certain things just don't work well in practice?
Is there a more .NETish way to implement the key value pairs, other than List<List<KeyValuePair<String,String>>>?
Even if there is a way to do #2 better, is the ALTER TABLE idea I proposed above a Bad Thing?
Would you recommend multiple databases over a single one? I don't yet have an idea of how frequently the log would get written to, but we ideally would like to have lots of low level information. Perhaps there should be a DB with a fixed schema only for the low level stuff, and then another DB that's more flexible for reporting information back to users.
Why don't you check out log4net? It may be enough for your purposes, and you would avoid reinventing a wheel that has already been invented many times :-)
Here you have some configuration examples about how to store the logging information on the database:
http://logging.apache.org/log4net/release/config-examples.html
Like others have already noted, there are several popular logging frameworks that have a lot of built-in functionality. While none of them have the flexibility you desire, my experience is that you never really need that kind of flexibility. It's only logging :-).
Here is a list of some common logging libraries:
log4net
NLog
Enterprise Library
Termite
ELMAH
CuttingEdge.Logging
(did I miss one?)
And when you have trouble choosing one, use a logging facade to hide the implementation. For this you can pick:
Common.Logging
Simple Logging Facade.
Not until there are some (Knuth).
Create models that actually model your log entries, then cache them in a List
I'd provide standard fields and store all user configurables as XML. This allows for a flexible logging system, doesn't result in schema alterations, and is (at least in SQL Server) searchable; see the sketch below.
More databases means more maintenance. Keep it simple. In fact, just use log4net.
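A small sketch of the fixed-columns-plus-XML model suggested above (the class and property names are illustrative; the XML would go into an xml column so SQL Server can still query it):

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public class LogEntry
{
    public DateTime TimestampUtc { get; set; }
    public string Level { get; set; }                 // INFO, WARNING, ERROR
    public string Message { get; set; }

    // Developer-specific keys (USER, FILENAME, ...) go here instead of
    // requiring an ALTER TABLE per consumer.
    public Dictionary<string, string> Extra { get; private set; }

    public LogEntry()
    {
        Extra = new Dictionary<string, string>();
    }

    // Serialized into an xml column; new keys need no schema change.
    public string ExtraAsXml()
    {
        var root = new XElement("extra");
        foreach (var pair in Extra)
            root.Add(new XElement("item",
                new XAttribute("key", pair.Key), pair.Value));
        return root.ToString();
    }
}
```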
I'm currently involved in a very large supply chain management software system internal to where I'm employed. The system's UI is currently only implemented through ASP.NET, but we're in development of Windows Forms and Windows Mobile Compact interfaces as well. We have a pretty good setup in terms of separating the interface, business, and data access layers, so we have successfully shared across multiple platforms. However, we have some security concerns for when we distribute our client-based interfaces to the customer.
Several of our data access libraries are distributed with the executable. Simply opening the compiled assembly in Notepad gives full view to any queries within.
For example, let's say we have a class called "User" who implements the method "GetName" as:
select name from user where id = #id
The problem is that anyone keen enough to open the compiled assembly in Notepad can now see column and table names. Sure, they may not have access to these, but I'd still rather not expose the schema if I don't have to.
The above is just a simple example. Am I going about the thought process incorrectly, or is there a way to protect our queries? (I'd rather not resort to using stored procedures for everything.)
I've thought of forcing our data access layer to be remote and communicating from the business layer via web services, so that all database-related information stays on our internal server, which we can protect more easily.
If you want to remove the SQL from the source, then you are looking at another layer, such as web services. While that hides your SQL, the services themselves must now be public. So while those who peek cannot see the DB schema, they can still see the data layout.
What the web services allow for is an easier way to make schema changes, since now you just have to make sure the data output stays the same. It also allows you to move, rename, and/or perform other maintenance on the schema's DBs. Finally, it lets you pool DB connections locally instead of over a network and run more of the processing on the server.
What is the best way to log actions, activities, etc. done in an ASP.NET application? Also, which storage is best for logging these? XML? DB?
Thank you very much.
The answer I hate most, actually applies here: "it depends". Specifically, it depends on several things:
Who is the logging information for? Is it intended for business users (i.e., are there actual business requirements), is the information oriented at application management, do you need insight into frequently used features, etc.
What is the granularity of the logged info? For instance: do you only need to know if the search function was used, do you want to know the search query or do you also need info on the actual search results?
How accurate & complete does the info have to be? Audit trail requirements are usually very tight, technical ones often less so.
Do you want to be able to roll back the actions/activities? And if so, who is going to do that (business user, support personnel)?
What does your deployment look like? If you have a single server, logging to text files or XML is more feasible than if you have a farm/load balanced environment.
For application logging, look at well-known providers such as log4net or the Enterprise Library logging application block; both allow you to configure where you want to log to (text file, database, etc).
For logging database actions, I suggest a solution in the database itself. Several editions of SQL Server 2008 have built-in support for auditing, and Oracle has had this for years, IIANM.
PostSharp probably. Log to a DB.
-- Edit:
This is for logging all code actions. To log all DB actions, I'd use triggers.
I would use log4net,
because you can configure and change the output (file, mail, DB, ...) in the config file, so you do not have to rebuild your code.
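A minimal usage sketch; the appenders (file, mail, ADO.NET, ...) are defined entirely in the config file, so changing the output needs no rebuild:

```csharp
using log4net;
using log4net.Config;

public static class Program
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(Program));

    public static void Main()
    {
        // Reads the log4net section from app.config/web.config.
        XmlConfigurator.Configure();

        Log.Info("Application started");
        Log.Warn("Low disk space");
        Log.Error("Something went wrong");
    }
}
```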
If you don't need a complex auditing system, but just want to log what your code is doing, I'd recommend using the tracing system built into the .NET Framework and ASP.NET.
Using very simple classes in the framework, your code emits traces, and then, via configuration files, you can send them to different storage systems (file, database, Windows events, ...). You can even create your own trace listener for a custom store.
http://msdn.microsoft.com/en-us/library/system.diagnostics.trace.aspx
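A minimal sketch of that approach: the code only emits traces, and listeners configured in web.config/app.config decide where they end up (the class and method names are illustrative):

```csharp
using System;
using System.Diagnostics;

public class OrderService
{
    public void PlaceOrder(int orderId)
    {
        Trace.TraceInformation("Placing order {0}", orderId);
        try
        {
            // ... business logic ...
        }
        catch (Exception ex)
        {
            Trace.TraceError("Order {0} failed: {1}", orderId, ex);
            throw;
        }
    }
}
```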