I am very new to working with databases. Now I can write SELECT, UPDATE, DELETE, and INSERT commands. But I have seen many forums where we prefer to write:
SELECT empSalary from employee where salary = @salary
...instead of:
SELECT empSalary from employee where salary = txtSalary.Text
Why do we always prefer to use parameters and how would I use them?
I wanted to know the use and benefits of the first method. I have even heard of SQL injection but I don't fully understand it. I don't even know if SQL injection is related to my question.
Using parameters helps prevent SQL Injection attacks when the database is used in conjunction with a program interface such as a desktop program or web site.
In your example, a user can directly run SQL code on your database by crafting statements in txtSalary.
For example, if they were to write 0 OR 1=1, the executed SQL would be
SELECT empSalary from employee where salary = 0 or 1=1
whereby all empSalaries would be returned.
Further, a user could run far worse commands against your database, including deleting it. If they wrote 0; Drop Table employee:
SELECT empSalary from employee where salary = 0; Drop Table employee
The table employee would then be deleted.
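To make the mechanism concrete, here is a minimal C# sketch of what concatenation does to the statement text. The class and method names are hypothetical and no database is involved; it only shows the string the server would receive.

```csharp
using System;

// Hypothetical helper, for illustration only: it builds SQL text the unsafe way
// so you can inspect what the database would actually be asked to run.
public static class InjectionDemo
{
    // Concatenates user input directly into the statement (do NOT do this in real code).
    public static string BuildUnsafeQuery(string userInput)
    {
        return "SELECT empSalary from employee where salary = " + userInput;
    }
}
```

Calling InjectionDemo.BuildUnsafeQuery("0 OR 1=1") produces a statement whose WHERE clause is always true, and BuildUnsafeQuery("0; Drop Table employee") produces two statements. With a parameter, the command text never changes; the user's input travels separately as a value and is never parsed as SQL.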
In your case, it looks like you're using .NET. Using parameters is as easy as:
string sql = "SELECT empSalary from employee where salary = @salary";
using (SqlConnection connection = new SqlConnection(/* connection info */))
using (SqlCommand command = new SqlCommand(sql, connection))
{
    // Declare the parameter with an explicit type instead of concatenating text.
    var salaryParam = new SqlParameter("salary", SqlDbType.Money);
    salaryParam.Value = txtMoney.Text;
    command.Parameters.Add(salaryParam);
    connection.Open(); // the connection must be open before executing
    var results = command.ExecuteReader();
}
Dim sql As String = "SELECT empSalary from employee where salary = @salary"
Using connection As New SqlConnection("connectionString")
    Using command As New SqlCommand(sql, connection)
        Dim salaryParam = New SqlParameter("salary", SqlDbType.Money)
        salaryParam.Value = txtMoney.Text
        command.Parameters.Add(salaryParam)
        connection.Open() ' the connection must be open before executing
        Dim results = command.ExecuteReader()
    End Using
End Using
Edit 2016-4-25:
As per George Stocker's comment, I changed the sample code to not use AddWithValue. Also, it is generally recommended that you wrap IDisposables in using statements.
You are right, this is related to SQL injection, which is a vulnerability that allows a malicious user to execute arbitrary statements against your database. The old-time favorite XKCD comic about "Little Bobby Tables" illustrates the concept.
In your example, if you just use:
var query = "SELECT empSalary from employee where salary = " + txtSalary.Text;
// and proceed to execute this query
You are open to SQL injection. For example, say someone enters one of these into txtSalary:
1; UPDATE employee SET salary = 9999999 WHERE empID = 10; --
1; DROP TABLE employee; --
// etc.
When you execute this query, it will perform a SELECT and an UPDATE or DROP, or whatever they wanted. The -- at the end simply comments out the rest of your query, which would be useful in the attack if you were concatenating anything after txtSalary.Text.
The correct way is to use parameterized queries, e.g. (C#):
SqlCommand query = new SqlCommand(
    "SELECT empSalary FROM employee WHERE salary = @sal;");
query.Parameters.AddWithValue("@sal", txtSalary.Text);
With that, you can safely execute the query.
For reference on how to avoid SQL injection in several other languages, check bobby-tables.com, a website maintained by a SO user.
In addition to the other answers, parameters not only help prevent SQL injection but can also improve query performance. SQL Server caches the plans of parameterized queries and reuses them when the same query is executed again. If you do not parameterize your query, SQL Server has to compile a new plan (with some exceptions) whenever the text of the query differs.
More information about query plan caching
Two years after my first go, I'm recidivating...
Why do we prefer parameters? SQL injection is obviously a big reason, but could it be that we're secretly longing to get back to SQL as a language? SQL in string literals is already a weird cultural practice, but at least you can copy and paste your request into Management Studio. SQL dynamically constructed with host language conditionals and control structures, when SQL has conditionals and control structures, is just level 0 barbarism. You have to run your app in debug, or with a trace, to see what SQL it generates.
Don't stop with just parameters. Go all the way and use QueryFirst (disclaimer: which I wrote). Your SQL lives in a .sql file. You edit it in the fabulous TSQL editor window, with syntax validation and Intellisense for your tables and columns. You can assign test data in the special comments section and click "play" to run your query right there in the window. Creating a parameter is as easy as putting "@myParam" in your SQL. Then, each time you save, QueryFirst generates the C# wrapper for your query. Your parameters pop up, strongly typed, as arguments to the Execute() methods. Your results are returned in an IEnumerable or List of strongly typed POCOs, the types generated from the actual schema returned by your query. If your query doesn't run, your app won't compile. If your db schema changes and your query runs but some columns disappear, the compile error points to the line in your code that tries to access the missing data. And there are numerous other advantages. Why would you want to access data any other way?
In SQL, any word that begins with the @ sign is a variable. You assign a value to the variable and can then use it any number of times in the same SQL script. Its scope is limited to that single script, so you can declare variables with the same name and type in many different scripts. We use variables a lot in stored procedures, because stored procedures are pre-compiled queries and we can pass values into these variables from scripts, desktop programs, and web sites. For further information, read Declare Local Variable, Sql Stored Procedure and sql injections.
Also read Protect from sql injection; it will guide you on how to protect your database.
I hope this helps you understand. If you have any questions, leave me a comment.
Old post but wanted to ensure newcomers are aware of Stored procedures.
My 10¢ worth here is that if you are able to write your SQL statement as a stored procedure, that in my view is the optimum approach. I ALWAYS use stored procs and never loop through records in my main code. For Example: SQL Table > SQL Stored Procedures > IIS/Dot.NET > Class.
When you use stored procedures, you can restrict the user to EXECUTE permission only, thus reducing security risks.
Your stored procedure is inherently parameterised, and you can specify input and output parameters.
The stored procedure (if it returns data via SELECT statement) can be accessed and read in the exact same way as you would a regular SELECT statement in your code.
It can also run faster, because SQL Server compiles the procedure and caches its execution plan for reuse.
Did I also mention you can do multiple steps, e.g. update a table, check values on another DB server, and then, once finished, return data to the client, all on the same server and with no interaction with the client? This can be MUCH faster than coding the same logic in your application code.
Other answers cover why parameters are important, but there is a downside! In .NET, there are several methods for creating parameters (Add, AddWithValue), but they all require you to worry, needlessly, about the parameter name, and they all reduce the readability of the SQL in the code. Right when you're trying to meditate on the SQL, you need to hunt around above or below to see what value has been used in the parameter.
I humbly claim my little SqlBuilder class is the most elegant way to write parameterized queries. Your code will look like this...
C#
var bldr = new SqlBuilder( myCommand );
bldr.Append("SELECT * FROM CUSTOMERS WHERE ID = ").Value(myId);
//or
bldr.Append("SELECT * FROM CUSTOMERS WHERE NAME LIKE ").FuzzyValue(myName);
myCommand.CommandText = bldr.ToString();
Your code will be shorter and much more readable. You don't even need extra lines, and, when you're reading back, you don't need to hunt around for the value of parameters. The class you need is here...
using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;

public class SqlBuilder
{
    private StringBuilder _rq;   // SQL text built so far
    private SqlCommand _cmd;     // command that receives the parameters
    private int _seq;            // counter used to generate unique parameter names

    public SqlBuilder(SqlCommand cmd)
    {
        _rq = new StringBuilder();
        _cmd = cmd;
        _seq = 0;
    }

    // Appends literal SQL text.
    public SqlBuilder Append(String str)
    {
        _rq.Append(str);
        return this;
    }

    // Appends a generated parameter name and registers its value on the command.
    public SqlBuilder Value(Object value)
    {
        string paramName = "@SqlBuilderParam" + _seq++;
        _rq.Append(paramName);
        _cmd.Parameters.AddWithValue(paramName, value);
        return this;
    }

    // Same as Value, but wraps the parameter in % wildcards for use after LIKE.
    public SqlBuilder FuzzyValue(Object value)
    {
        string paramName = "@SqlBuilderParam" + _seq++;
        _rq.Append("'%' + " + paramName + " + '%'");
        _cmd.Parameters.AddWithValue(paramName, value);
        return this;
    }

    public override string ToString()
    {
        return _rq.ToString();
    }
}
I'm trying to set up so that the table name is passed to the command text as a parameter, but I'm not getting it to work. I've looked around a bit, and found questions like this: Parameterized Query for MySQL with C#, but I've not had any luck.
This is the relevant code (connection == the MySqlConnection containing the connection string):
public static DataSet getData(string table)
{
    DataSet returnValue = new DataSet();
    try
    {
        MySqlCommand cmd = connection.CreateCommand();
        cmd.Parameters.AddWithValue("@param1", table);
        cmd.CommandText = "SELECT * FROM @param1";
        connection.Open();
        MySqlDataAdapter adap = new MySqlDataAdapter(cmd);
        adap.Fill(returnValue);
    }
    catch (Exception)
    {
    }
    finally
    {
        if (connection.State == ConnectionState.Open)
            connection.Close();
    }
    return returnValue;
}
If I change:
cmd.CommandText = "SELECT * FROM @param1";
to:
cmd.CommandText = "SELECT * FROM " + table;
As a way of testing, and that works (I'm writing the xml from the dataset to console to check). So I'm pretty sure the problem is just using the parameter functionality in the wrong way. Any pointers?
Also, correct me if I'm mistaken, but using the Parameter functionality should give complete protection against SQL injection, right?
You cannot parameterize table names, column names, or any other database objects. You can only parameterize values.
You need to build the table name into the SQL query through string concatenation, but before you do that, I suggest applying strong validation or a whitelist (only a fixed set of known-correct values).
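A whitelist check of this kind could look roughly like the sketch below. The class name, the method name, and the table names in the list are assumptions made up for illustration; the backtick quoting is MySQL's identifier syntax.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: the table names listed here are placeholders, not from the question.
public static class TableWhitelist
{
    // The only table names that may ever be concatenated into SQL text.
    private static readonly HashSet<string> AllowedTables =
        new HashSet<string> { "employee", "department" };

    // Returns the SQL text for a known table, or throws if the name is not on the list.
    public static string BuildSelectAll(string table)
    {
        if (table == null || !AllowedTables.Contains(table))
            throw new ArgumentException("Unknown table name.", nameof(table));

        // Safe to concatenate: the value can only be one of the fixed names above.
        return "SELECT * FROM `" + table + "`";
    }
}
```

The result of TableWhitelist.BuildSelectAll(table) would then be assigned to cmd.CommandText, while any values in a WHERE clause are still passed as ordinary parameters.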
Also, correct me if I'm mistaken, but using the Parameter
functionality should give complete protection against SQL injection,
right?
If you mean parameterized statements with "parameter functionality", yes, that's correct.
By the way, be aware that there is a concept called dynamic SQL that supports SELECT * FROM @tablename, but it is not recommended.
As we have seen, we can make this procedure work with help of dynamic
SQL, but it should also be clear that we gain none of the advantages
with generating that dynamic SQL in a stored procedure. You could just
as well send the dynamic SQL from the client. So, OK: 1) if the SQL
statement is very complex, you save some network traffic and you do
encapsulation. 2) As we have seen, starting with SQL 2005 there are
methods to deal with permissions. Nevertheless, this is a bad idea.
There seems to be several reasons why people want to parameterise the
table name. One camp appears to be people who are new to SQL
programming, but have experience from other languages such as C++, VB
etc where parameterisation is a good thing. Parameterising the table
name to achieve generic code and to increase maintainability seems
like good programmer virtue.
But it is just that when it comes to database objects, the old truth
does not hold. In a proper database design, each table is unique, as
it describes a unique entity. (Or at least it should!) Of course, it
is not uncommon to end up with a dozen or more look-up tables that all
have an id, a name column and some auditing columns. But they do
describe different entities, and their semblance should be regarded as
mere chance, and future requirements may make the tables more
dissimilar.
Using a table name as a parameter is incorrect. Parameters in SQL only work for values, not for identifiers such as column or table names.
One option is to use the SqlCommandBuilder class. It will escape your table name so that it is not vulnerable to SQL injection:
SqlCommandBuilder cmdBuilder = new SqlCommandBuilder();
string tbName = cmdBuilder.QuoteIdentifier(tableName);
You can use the tbName in your statement because it's not vulnerable to SQL Injection now.
I'm working on a project where I wish to be able to send commands to SQL Server and also run Queries. I've succeeded in returning a query to a listbox using Joel's very good tutorial here:
creating a database query METHOD
I am now trying to adapt this to execute some commands and then run a query to check the commands worked. My query is failing because I think the commands did not work.
Currently I am sending this:
MySqlCommand("CREATE TABLE #CSVTest_Data" +
"(FirstTimeTaken DATETIME," +
"LatestTimeTaken DATETIME," +
"Market VARCHAR(50)," +
"Outcome VARCHAR(50),"+
"Odds DECIMAL(18,2)," +
"NumberOfBets INT," +
"VolumeMatched DECIMAL(18,2),"+
"InPlay TINYINT)");
Into this:
private void MySqlCommand(string sql)
{
    int numberOfRecords;
    //string result;
    using (var connection = GetConnection())
    using (var command = new SqlCommand(sql, connection))
    {
        connection.Open();
        numberOfRecords = command.ExecuteNonQuery();
    }
    MessageBox.Show(numberOfRecords.ToString());
}
My understanding is that ExecuteNonQuery returns an integer giving the number of rows affected. My message box shows a value of -1. Running the same command in SQL Server returns 'Command(s) completed successfully.' I would appreciate it if somebody could tell me whether my MySqlCommand method looks OK and how I might return the SQL Server message that is output by running the function.
In order to obtain messages that are output to the Messages tab in SQL Server Management Studio, "the console" when executing SQL statements on SQL Server, it is necessary to hook into the InfoMessage event on the SqlConnection class:
using (var connection = GetConnection())
using (var command = new SqlCommand(sql, connection))
{
    connection.InfoMessage += (s, e) =>
    {
        Debug.WriteLine(e.Message);
    };
    connection.Open();
    numberOfRecords = command.ExecuteNonQuery();
}
Obviously you will need to handle the event differently from what I showed above, and there are other properties on the e parameter here as well, see SqlInfoMessageEventArgs for details.
NOW, having said that, bear in mind that some of the messages output to the Messages tab in SQL Server Management Studio are generated by that program and do not originate from the server itself, so whether the particular message you're asking about would show up through that event I cannot say for sure.
Additionally, in this particular type of SQL Statement, the correct return value from ExecuteNonQuery is in fact -1, as is documented:
For UPDATE, INSERT, and DELETE statements, the return value is the number of rows affected by the command. When a trigger exists on a table being inserted or updated, the return value includes the number of rows affected by both the insert or update operation and the number of rows affected by the trigger or triggers. For all other types of statements, the return value is -1. If a rollback occurs, the return value is also -1.
(my emphasis)
Change
var numberOfRecords = command.ExecuteNonQuery();
to
var numberOfRecords = command.ExecuteScalar();
Also, please have a look at SqlCommand Methods
You should use ExecuteScalar.
ExecuteScalar is typically used when your query returns a single value.
ExecuteNonQuery is used for SQL statements such as UPDATE, INSERT, CREATE, etc.
So change it to
numberOfRecords = (int)command.ExecuteNonQuery();
Here is a comment from MSDN on ExecuteNonQuery:
For UPDATE, INSERT, and DELETE statements, the return value is the
number of rows affected by the command. ... For all other types of
statements, the return value is -1.
Since you are executing neither UPDATE nor INSERT nor DELETE - you are receiving -1 even though operation is successful. Basically you can assume that if no SqlException was thrown - your CREATE statement worked.
numberOfRecords=(int) command.ExecuteScalar();
I am converting data from a CSV file into a database. I put the data from the CSV file into a DataTable and am trying to validate the data.
One thing I want to check is that all of the values in a certain column of the DataTable (let's call it PersonID) are found in the columns of a table in the database I'm converting to (let's call that PeopleID).
So, I want to check if all of the values of PersonID are listed in the PeopleId table.
I have the results of the DataTable as follows:
var listOfPersonIdsInData = arguments.DataTable.Select("PersonId");
And I query the database to get the values of the PeopleId column:
var listOfPeopleIdsInDatabase = checkQuery.Execute<DataColumn>(@"SELECT DISTINCT PeopleId FROM People");
What would be the best way to go about checking this in C#? I realize it's a somewhat basic question but the way I'm thinking of doing it is using two arrays. Read in the results of each into an array, then cycle through each value of array 1 to check if it's in array 2.
I feel like I'm re-inventing the wheel. I would really like to know a better way if there is one. If anyone could provide any advice I'd greatly appreciate it.
If you're using SQL Server 2008 or later, I would recommend passing the DataTable as a table-valued parameter to a stored procedure or a parameterized query, and then using an anti join, NOT IN, or NOT EXISTS to find any rows in the DataTable that aren't in the SQL table.
e.g.
Create the type
CREATE TYPE dbo.PersonTable AS TABLE
( PersonId int )
Then the proc
CREATE PROCEDURE usp_ValidateDataTable
(@CheckTable dbo.PersonTable READONLY) as
BEGIN
    SELECT c.PersonID
    FROM
        @CheckTable c
    WHERE
        c.PersonID NOT IN (SELECT PersonID from dbo.People)
END
C# Code
SP Call
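SqlCommand cmd = new SqlCommand("usp_ValidateDataTable", cnn);
cmd.CommandType = CommandType.StoredProcedure; // needed when the command text is a procedure name
// The value of a Structured parameter must be a DataTable (or a DbDataReader / IEnumerable<SqlDataRecord>).
SqlParameter tvpParam = cmd.Parameters.AddWithValue("@CheckTable", listOfPersonIdsInData);
tvpParam.SqlDbType = SqlDbType.Structured;
tvpParam.TypeName = "dbo.PersonTable";
SqlDataReader rdr = cmd.ExecuteReader();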
C# Code
Parameterized Query Call
string query = @" SELECT c.PersonID
FROM @CheckTable c
WHERE c.PersonID NOT IN (SELECT PersonID from dbo.People)";
SqlCommand cmd = new SqlCommand(query, cnn);
SqlParameter tvpParam = cmd.Parameters.AddWithValue("@CheckTable", listOfPersonIdsInData);
tvpParam.SqlDbType = SqlDbType.Structured;
tvpParam.TypeName = "dbo.PersonTable";
SqlDataReader rdr = cmd.ExecuteReader();
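If the data is small enough to compare in memory, the two-collection check described in the question can also be sketched with a HashSet rather than nested loops over arrays. This is only an illustration: the class and method names are hypothetical, and it assumes the ID column holds integer values.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper: assumes the ID column holds values convertible to int.
public static class IdValidator
{
    // Returns the distinct IDs from the DataTable column that are absent from the database list.
    public static List<int> FindMissingIds(DataTable csvData, string columnName, IEnumerable<int> idsInDatabase)
    {
        // A HashSet gives constant-time lookups, so each row is checked once.
        var known = new HashSet<int>(idsInDatabase);

        return csvData.Rows.Cast<DataRow>()
                      .Select(row => Convert.ToInt32(row[columnName]))
                      .Where(id => !known.Contains(id))
                      .Distinct()
                      .ToList();
    }
}
```

An empty result means every value in the CSV column was found in the database. For very large imports, the table-valued parameter approach keeps the comparison on the server instead.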
I have had to migrate a lot of data, and so far I think the best approach is:
Create a flat table for the information from the CSV and load all the data there
Create, in the same SQL, methods to extract the standardized information
Construct a method in the same SQL that crosses the normalized information with the raw data
This is really fast, especially when the number of records is quite large (greater than 1M), and you avoid the problem of optimizing the RAM management of your script or program. Loading CSV data into MySQL is also really easy; check this
A tip: parameterize the import-and-verify method with an offset and a limit value