I was trying to explain to someone why database connections implement IDisposable, when I realized I don't really know what "opening a connection" actually means.
So my question is: what does C# practically do when it opens a connection?
Thank you.
There are actually two classes involved in implementing a connection (actually more, but I'm simplifying).
One of these is the IDbConnection implementation (SqlConnection, NpgsqlConnection, OracleConnection, etc.) that you use in your code. The other is a "real" connection object that is internal to the assembly, and not visible to your code. We'll call this "RealConnection" for now, though its actual name differs between implementations (e.g. in Npgsql, which is the implementation I'm most familiar with, the class is called NpgsqlConnector).
When you create your IDbConnection, it does not have a RealConnection. Any attempt to do something with the database will fail. When you Open() it then the following happens:
If pooling is enabled, and there is a RealConnection in the pool, dequeue it and make it the RealConnection for the IDbConnection.
If pooling is enabled, and the total number of RealConnection objects in existence has reached the maximum size, throw an exception.
Otherwise create a new RealConnection. Initialise it, which will involve opening some sort of network connection (e.g. TCP/IP) or file handle (for something like Access), go through the database's protocol for hand-shaking (varies with database type) and authorise the connection. This then becomes the RealConnection for the IDbConnection.
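The three Open() steps above can be sketched with a toy pool. This is only an illustration under stated assumptions: ToyPool, Acquire and Release are hypothetical names, the RealConnection class here is an empty stand-in, and a real provider would add locking, connection validation and (in some implementations) a wait with a timeout before throwing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the provider's internal connection object.
// A real one would open a socket, handshake and authenticate when created.
public sealed class RealConnection { }

// Hypothetical pool; not thread-safe, for illustration only.
public sealed class ToyPool
{
    private readonly Queue<RealConnection> _idle = new Queue<RealConnection>();
    private readonly int _maxSize;
    private int _total; // connections in existence, idle or in use

    public ToyPool(int maxSize) { _maxSize = maxSize; }

    // Mirrors the three Open() steps: reuse, refuse, or create.
    public RealConnection Acquire()
    {
        if (_idle.Count > 0) return _idle.Dequeue();
        if (_total >= _maxSize)
            throw new InvalidOperationException("Pool exhausted.");
        _total++;
        return new RealConnection();
    }

    // Simplified Close(): hand the object back for later reuse.
    public void Release(RealConnection connection) => _idle.Enqueue(connection);
}

public static class ToyPoolDemo
{
    public static void Main()
    {
        var pool = new ToyPool(maxSize: 1);
        RealConnection first = pool.Acquire();   // created
        pool.Release(first);
        RealConnection second = pool.Acquire();  // reused from the queue
        Console.WriteLine(ReferenceEquals(first, second)); // True

        try { pool.Acquire(); }                  // limit reached, nothing idle
        catch (InvalidOperationException e) { Console.WriteLine(e.Message); }
    }
}
```

The point of the sketch is that the second Acquire() costs nothing beyond a dequeue, which is why a pooled Open() is cheap.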
Operations carried out on the IDbConnection are turned into operations the RealConnection does on its network connection (or whatever). The results are turned into objects implementing IDataReader and so on so as to give a consistent interface for your programming.
If an IDataReader was created with CommandBehavior.CloseConnection, then that datareader obtains "ownership" of the RealConnection.
When you call Close() then one of the following happens:
If pooling, and if the pool isn't full, then the object is put in the queue for use with later operations.
Otherwise the RealConnection will carry out any protocol-defined procedures for ending the connection (signalling to the database that the connection is going to shut down) and closes the network connection etc. The object can then fall out of scope and become available for garbage collection.
The exception would be if the CommandBehavior.CloseConnection case happened, in which case it's Close() or Dispose() being called on the IDataReader that triggers this.
If you call Dispose() then the same thing happens as per Close(). The difference is that Dispose() is considered as "clean-up" and can work with using, while Close() might be used in the middle of lifetime, and followed by a later Open().
Because of the use of the RealConnection object and the fact that they are pooled, opening and closing connections changes from being something relatively heavy to relatively light. Hence rather than it being important to keep connections open for a long time to avoid the overhead of opening them, it becomes important to keep them open for as short a time as possible, since the RealConnection deals with the overhead for you, and the more rapidly you use them, the more efficiently the pooled connections get shared between uses.
Note also, that it's okay to Dispose() an IDbConnection that you have already called Close() on (it's a rule that it should always be safe to call Dispose(), whatever the state, indeed even if it was already called). Hence if you were manually calling Close() it would still be good to have the connection in a using block, to catch cases where exceptions happen before the call to Close(). The only exception is where you actually want the connection to stay open; say you were returning an IDataReader created with CommandBehavior.CloseConnection, in which case you don't dispose the IDbConnection, but do dispose the reader.
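A minimal sketch of that pattern, assuming a hypothetical ToyConnection class in place of a real provider type: Close() is called explicitly inside a using block, and the block's implicit Dispose() is harmless when Close() already ran but still cleans up when an exception skips it.

```csharp
using System;

// Hypothetical connection wrapper, used only to show the call pattern.
public sealed class ToyConnection : IDisposable
{
    public bool IsOpen { get; private set; }
    public int ReleaseCount { get; private set; }

    public void Open() { IsOpen = true; }

    // Close() releases the underlying resource once; repeat calls are no-ops.
    public void Close()
    {
        if (!IsOpen) return;
        IsOpen = false;
        ReleaseCount++;
    }

    // Dispose() does the same work as Close() and is safe to call repeatedly.
    public void Dispose() => Close();
}

public static class DisposeDemo
{
    // Returns the connection so the caller can inspect its final state.
    public static ToyConnection Run(bool failBeforeClose)
    {
        var connection = new ToyConnection();
        try
        {
            using (connection)
            {
                connection.Open();
                if (failBeforeClose) throw new InvalidOperationException("query failed");
                connection.Close(); // explicit Close() inside the using block
            }
        }
        catch (InvalidOperationException) { /* swallowed for the demo */ }
        return connection;
    }

    public static void Main()
    {
        ToyConnection ok = Run(failBeforeClose: false);
        ToyConnection failed = Run(failBeforeClose: true);
        Console.WriteLine($"{ok.IsOpen} {ok.ReleaseCount}");         // False 1
        Console.WriteLine($"{failed.IsOpen} {failed.ReleaseCount}"); // False 1
    }
}
```

In both runs the resource is released exactly once, whether or not the explicit Close() was reached.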
Should you fail to dispose the connection, then the RealConnection will not be returned to the pool for reuse, or go through its shut-down procedure. Either the pool will reach its limit, or the number of underlying connections will increase to the point of damaging performance and blocking more from being created. Eventually the finaliser on RealConnection may be called and lead to this being fixed, but finalisation only reduces the damage and can't be depended upon. (The IDbConnection doesn't need a finaliser, as it's the RealConnection that holds the unmanaged resource and/or needs to do the shut-down).
It's also reasonable to assume that there is some other requirement for disposal unique to the implementation of the IDbConnection beyond this, and it should still be disposed of even if analysing the above leads you to believe it's not necessary (the exception is when CommandBehavior.CloseConnection passes all disposal burden to the IDataReader, but then it is just as important to dispose that reader).
Good question.
From my (somewhat limited) knowledge of the "under-the-hood" working of a SQL connection, many steps are involved, such as:
The Steps Under the Hood
Physical socket/pipe is opened (using given drivers, eg ODBC)
Handshake with SQL Server
Connection string/credentials negotiated
Transaction scoping
Not to mention connection pooling; I believe there is some kind of algorithm involved (if the connection string matches one for an already existing pool, the connection is added to the pool, otherwise a new one is created).
IDisposable
With regards to SQL connections, we implement IDisposable so that when we call Dispose (either via the using statement, or explicitly), it places the connection back into the connection pool. This is in stark contrast with just the plain old sqlConnection.Close(), as all this does is close it temporarily, but reserves that connection for later use.
From my understanding, .Close() closes the connection to the database, whereas .Dispose() calls .Close(), and then releases unmanaged resources.
Those points in mind, at the very least it is good practice to implement IDisposable.
Adding to the answers above... The key is that upon "opening the connection", resources may be allocated that take more than standard garbage collection to recover, namely an open socket/pipe/IPC of some kind. The Dispose() method cleans these up.
I have seen examples where someone is doing:
IDbConnection db = new MySqlConnection(conn);
var people = db.Query<People>("SELECT * FROM PEOPLE").ToList();
Is the above bad practice, and should all queries be put in using statements like so:
using (var db = new MySqlConnection(conn))
{
    var people = db.Query<People>("SELECT * FROM PEOPLE").ToList();
}
As others have correctly noted, the best practice in general is to use using any time an object implements IDisposable and you know that the lifetime of the object is going to be short -- that is, not longer than the duration of the current method. Doing so ensures that scarce operating system resources are cleaned up in a timely manner. Even if an object's disposal is "backstopped" by its finalizer, you don't want to be in a situation where, say, you have a lock on a file or database or something that is not going to be released until the finalizer runs several billion nanoseconds from now.
However I would temper that advice by saying that there are a small number of types that implement IDisposable for reasons other than timely disposal of unmanaged resources. In some very specific cases you can safely skip the using. However, it is almost never wrong to use using, even when it is not strictly speaking necessary, so my advice to you is to err on the side of caution and over-use, rather than under-use using.
Dapper doesn't have a view on this; what you are using here is the database connection. If you have finished with the connection: you have finished with the connection. Basically, yes, you should probably be using using.
The main purpose of using statements is to release unmanaged resources. When an object is no longer used, the garbage collector automatically releases the memory allocated to it, but the garbage collector does not promptly release resources such as files, streams or database connections like the one in your example.
Think of it as a way to explicitly dispose of objects rather than leaving it up to the garbage collector, so you can say it's better practice.
In my experience with Sql Server and Oracle (using the ODP.Net drivers and the MS drivers), you need to use using around Connections, Commands and Transactions or you will soon exhaust your connection pool if you do anything but the simplest database interaction in a production environment (the connection pool is typically ~50 - 200 connections).
You may not notice the behaviour in a development environment because debug = a lot of restarts, which clears the pool.
As Eric Lippert above said, it is good practice in general to use using around IDisposable objects.
In practice, you can skip "using" on Parameters for SqlServer and Oracle.
I'm making a program which sends data to many TCP listeners. If one of the TCP sending channels is no longer used, do we need to close it using TcpClient.Close? What happens if we leave it open?
Thanks!
I wouldn't use Close explicitly - I'd just dispose it via a using statement:
using (TcpClient client = ...)
{
    // Use the client
}
What happens if we leave it open?
If neither side closes the connection, it'll be sitting there doing nothing, pointlessly.
If the other end closes the connection, I suspect you'll have a local connection in TIME_WAIT or something similar for a while; I don't know the exact details but it's not ideal.
The main point is that it may use up some system resources, and fundamentally it's not ideal. Is it going to cause your program to crash and your system to go crazy? Probably not - at least unless you're creating a huge number of these. It's still a good idea to dispose of anything with non-memory resources though.
If you leave it open, the GC (Garbage Collector) will in time dispose of your object.
With TcpClient, the finalizer (the code run when the GC cleans up your object) has a call to Dispose(), which in turn closes the connection and frees any OS resources used by it.
You don't know WHEN the GC does its cleaning up, so this approach is non-deterministic. This probably isn't a problem with one or two open connections, but you may starve the system of resources if you open a huge number of these.
A good practice would be to always dispose TcpClients when you're done with them, and this can easily be accomplished with the using block. (As shown in Jon Skeet's answer...)
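As a fuller sketch of that advice, the following sends one message over the loopback interface and disposes both ends with using blocks. TcpDisposeDemo and SendOnce are hypothetical names chosen for the illustration, and the in-process listener stands in for whatever remote listeners the real program talks to.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class TcpDisposeDemo
{
    // Sends one message over loopback and returns what the listener received.
    public static string SendOnce(string message)
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: OS picks a free port
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;

            using (var sender = new TcpClient())
            {
                sender.Connect(IPAddress.Loopback, port);
                using (TcpClient accepted = listener.AcceptTcpClient())
                {
                    byte[] payload = Encoding.UTF8.GetBytes(message);
                    sender.GetStream().Write(payload, 0, payload.Length);

                    // Read until the whole payload has arrived.
                    var buffer = new byte[payload.Length];
                    int read = 0;
                    NetworkStream incoming = accepted.GetStream();
                    while (read < buffer.Length)
                    {
                        int n = incoming.Read(buffer, read, buffer.Length - read);
                        if (n == 0) break; // peer closed early
                        read += n;
                    }
                    return Encoding.UTF8.GetString(buffer, 0, read);
                }
            } // both sockets are closed here, even if an exception was thrown
        }
        finally
        {
            listener.Stop();
        }
    }

    public static void Main()
    {
        Console.WriteLine(SendOnce("hello")); // hello
    }
}
```

Nothing here waits on the finalizer: the sockets are released at the closing braces, at a point the program controls.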
I would like to know if there are any pitfalls on calling a static method from an ASP.NET web service.
internal static object SelectScalar(String commandText, DataBaseEnum dataBase)
{
    SqlConnection sqlc = new SqlConnection(AuthDbConnection.GetDatabaseConnectionString());
    object returnval=null;
    if (sqlc!=null)
    {
        SqlCommand sqlcmd = sqlc.CreateCommand();
        sqlcmd.CommandText = commandText;
        sqlc.Open();
        returnval = sqlcmd.ExecuteScalar();
    }
    return returnval;
}
So, taking the method above as an example: are there any pitfalls with multiple web methods and multiple clients calling this method at the same time (say, 1000 calls to a web method that calls this function)?
Since you're creating a new SqlConnection, you want to dispose it or the connection won't close. See MSDN for usage guidelines.
The fact that it's a static method, though... that doesn't seem to be an issue, since you're not updating any shared state (global variables).
EDIT: AFAIK, the 'pitfalls' of static methods in webservices are the same as in any other application. The only thing to keep a note of is that a webservice is a server that is expected to run reliably for a long period of time. Thus things that could cause problems over time (memory leaks, db connection exhaustion, etc) are more significant than for other applications that run for a much shorter time period.
I don't know if C# is like Java, but opening a SQL connection and failing to close it before leaving the method doesn't seem like a good idea for me. The GC will clean it up once it goes out of scope, but that's not the same thing as closing the connection in Java.
The idiom in Java would demand that you close the connection in a finally block. Unless you're certain that the C# class doesn't require such a thing, I'd look into it.
You'll find out soon enough - thousands of web calls will exhaust the number of connections available quickly if they're scarce.
One more thing to check: opening connections this way is expensive in Java, so they're usually pooled. Is connection pooling also done in C#? Is it inefficient to keep opening and closing a database connection? Could you accomplish the same thing with a static, shared connection? If you go that way, perhaps threading issues come into play.
The thing to be aware of is when static members change state that is accessible to other threads in the application domain. In these cases, you have to take the proper measures to make it happen in an orderly way.
Your method does not touch any state outside itself (everything is local), so you're OK.
As duffymo and nader noted, you should dispose of your connection, as you should dispose of any object that implements IDisposable.
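Both points can be illustrated together with a small sketch. It assumes a hypothetical CountingConnection class in place of SqlConnection, so that it runs without a database; the Db.SelectScalar shown here is an illustration of the shape of the fix, not the asker's actual method.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a database connection; it only counts lifecycle events.
// The counters are shared state, so they are updated with Interlocked.
public sealed class CountingConnection : IDisposable
{
    public static int Opened;
    public static int Disposed;

    public CountingConnection() { Interlocked.Increment(ref Opened); }

    // Pretend query: returns the length of the command text.
    public object ExecuteScalar(string commandText) => commandText.Length;

    public void Dispose() { Interlocked.Increment(ref Disposed); }
}

public static class Db
{
    // Static, but it works only with locals, so concurrent callers cannot interfere.
    public static object SelectScalar(string commandText)
    {
        using (var connection = new CountingConnection())
        {
            return connection.ExecuteScalar(commandText);
        } // disposed on every path, including exceptions
    }
}

public static class StaticCallDemo
{
    public static void Main()
    {
        // 1000 concurrent calls, as in the question.
        Parallel.For(0, 1000, _ => Db.SelectScalar("SELECT 1"));
        Console.WriteLine($"{CountingConnection.Opened} {CountingConnection.Disposed}"); // 1000 1000
    }
}
```

Every call creates its own connection and disposes it before returning, so the number of connections held at any moment is bounded by the number of calls in flight rather than growing with the total number of calls.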
Major Edit:
I misread the article! The comment was in regards to the Finalize method of the class, not the finally block :). Apologies.
I was just reading that you should not close or dispose a database connection within a finally block but the article did not explain why. I can not seem to find a clear explanation as to why you would not want to do this.
Here is the article
If you look around, closing the connection in the finally block is one of the recommended ways of doing it. The article you were looking at probably recommended having a 'using' statement around the code that used the connection.
using (SqlConnection connection = new SqlConnection(connectionString))
{
    SqlCommand command = connection.CreateCommand();
    command.CommandText = "select * from someTable";
    // Execute the query here...put it in a datatable/dataset
}
The 'using' statement will ensure the Connection object gets disposed immediately after it's needed rather than waiting for the Garbage Collector to dispose of it.
I have to disagree that you should not close or dispose of a database connection within a finally block.
Letting an unhandled (or even handled for that matter) exception leave open connections can take down a database pretty quickly if it has a lot of activity.
Closing a database connection is the de facto example of why to use the finally statement, IMHO. Of course, the using statement is my preferred method, which is maybe what the original author was going for.
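The two forms can be put side by side. This is a sketch with a hypothetical TracedResource class standing in for a connection; the second form is roughly what the compiler generates for the first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resource that records when it is disposed.
public sealed class TracedResource : IDisposable
{
    private readonly List<string> _log;
    private readonly string _name;

    public TracedResource(string name, List<string> log) { _name = name; _log = log; }

    public void Dispose() => _log.Add(_name + " disposed");
}

public static class UsingVersusFinally
{
    public static List<string> Run()
    {
        var log = new List<string>();

        // Form 1: the using statement.
        try
        {
            using (var a = new TracedResource("using", log))
            {
                throw new InvalidOperationException("boom");
            }
        }
        catch (InvalidOperationException) { log.Add("caught"); }

        // Form 2: an explicit try/finally that disposes by hand.
        try
        {
            TracedResource b = new TracedResource("finally", log);
            try
            {
                throw new InvalidOperationException("boom");
            }
            finally
            {
                if (b != null) b.Dispose();
            }
        }
        catch (InvalidOperationException) { log.Add("caught"); }

        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
        // using disposed, caught, finally disposed, caught
    }
}
```

In both forms the resource is disposed before the exception reaches the catch block, which is the guarantee that matters for connections.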
Edit to the Major Edit:
That makes sense now. You wouldn't want to leave closing your database connection up to the garbage collector.
Without the original article, I can't speak for the author. However, depending on how you've implemented instantiating and opening the connection in relation to your try/catch/finally block, you might need to do some additional checking before just calling Close. For example, ensure the connection is not null and not already closed.
EDIT: The article says not to dispose of a connection object in your Finalize method, not to not close it in the finally block. In fact, in the paragraph above it says you should always be closing your connection after you use it, so it is returned to the connection pool.
"CAUTION It is recommended that you always close the Connection when you are finished using it in order for the connection to be returned to the pool. This can be done using either the Close or Dispose methods of the Connection object. Connections that are not explicitly closed might not be added or returned to the pool. For example, a connection that has gone out of scope but that has not been explicitly closed will only be returned to the connection pool if the maximum pool size has been reached and the connection is still valid.
Note Do not call Close or Dispose on a Connection, a DataReader, or any other managed object in the Finalize method of your class. In a finalizer, only release unmanaged resources that your class owns directly. If your class does not own any unmanaged resources, do not include a Finalize method in your class definition. For more information, "
http://msdn.microsoft.com/en-us/library/8xx3tyca(VS.71).aspx?ppud=4
A little bit of Googling turns up quite a few pages that hold the opposite opinion. Using a "finally" block seems like a good way to ensure that the connection is always closed correctly, although as others have said I'd be interested to see the original article that said it wasn't a good idea.
From what I can see in the article, it advises against calling Dispose or Close in the finalizer of a class, not against doing so in a finally block, which is quite a different thing.
The Close method puts the connection object into a state from which it can be re-opened. The Dispose method puts it into a state from which it cannot be re-opened (closing it first if currently open).
If you instantiate a connection, open it, use it, then throw it away (the normal usage pattern), then a using block is the best and simplest way to do it.
Obviously, if you are doing something more complex with multiple Open and Close calls, then disposing it will throw a spanner in the works.
VS 2010 Beta Code Analysis is advising me to always dispose WCF Service Clients and Data Contexts for LINQ to SQL. Is this advisable? I assume this will eventually happen regardless. What are the pros/cons of doing so implicitly or not at all?
With anything that is IDisposable, it is your job to ensure it gets cleaned up promptly, to close open resources etc (connections, etc). So in short, yes. Finalizers only happen when GC kicks in, which can be a long time - and doesn't allow for the best pooling use etc.
Note however that WCF has a history of dodgy Dispose() implementations, making just using a bit hit-n-miss (you can lose the actual exception). Of course, it is also recommended to use WCF via async methods, which again makes it hard to use using; you'll often need to keep a reference and call Dispose() explicitly in the IO callback.
Re your comments; a few examples:
SqlConnection: if you leave this hanging around (open) you prevent it reusing the connection from a pool; you can ultimately saturate the available connections this way, breaking your app
GDI: IIRC, VS2005 itself (or 2003?) had a bug involving undisposed GDI objects; GDI objects don't take much memory, so they didn't trigger GC, but it ran out of available handles and broke (this is a common bug in user code too, but more famous in an IDE)
fail to dispose something like a FileStream: you've blocked your file system
fail to dispose a TransactionScope and you've caused db (etc) blocking