Permanent connection to cassandra or connection per request? - c#

I'm currently implementing my own UserStore for Web API 2 so that it works with Cassandra. My question is whether I should close the connection after each request and reconnect to Cassandra for the next one.
Right now I establish the connection at application startup and pass a Cassandra Context to the UserStore to work with. The connection is closed when I shut down the application.
I'm wondering: if, e.g., 10 people register at the same time, is this possible with only one connection?
static Startup()
{
    PublicClientId = "self";
    //Connecting to Cassandra
    Cluster cluster = Cluster.Builder().AddContactPoint("127.0.0.1").Build();
    Session session = cluster.Connect();
    Context context = new Context(session);
    //passing context
    UserManagerFactory = () => new UserManager<ApplicationUser>(new UserStore<ApplicationUser>(context));
    OAuthOptions = new OAuthAuthorizationServerOptions
    {
        TokenEndpointPath = new PathString("/Token"),
        Provider = new ApplicationOAuthProvider(PublicClientId, UserManagerFactory),
        AuthorizeEndpointPath = new PathString("/api/Account/ExternalLogin"),
        AccessTokenExpireTimeSpan = TimeSpan.FromDays(14),
        AllowInsecureHttp = true
    };
}
//UserStore Method for Registration
public virtual Task CreateAsync(TUser user)
{
    if (user == null)
    {
        throw new ArgumentNullException("user");
    }
    var usertable = _context.GetTable<TUser>("users");
    //insert user into table
}

When you are deciding whether you should pool connections, there are typically two questions you need to answer:
What is the cost of establishing a connection every time?
Establishing a connection on some networks, like EC2, is more expensive, depending on the type of network, the machines, and your architecture setup. It can take several milliseconds, and that adds to your query time if you establish a connection every time. If you care about saving those milliseconds, then pooling connections is the better option for you.
Connections to databases are resources managed by your OS that your application server and DB server hold onto, whether the connections are in use or sleeping. If your hardware resources are low, connections should be treated like files: you open them, read from or write to them, then close them. If you don't have hardware resource constraints, then don't worry about pooling resources.
On the Cassandra side, if you set rpc_server_type to sync, each connection gets its own thread, which takes a minimum of 180K, so with a lot of clients memory will be your limiting factor. If you choose hsha as rpc_server_type, this won't be a problem.
Read this: https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L362
About your application performance:
Ideally, if you are using Cassandra with a multi-node setup, you want to distribute your requests across multiple nodes (coordinators) so that it scales better. Because of this, it is better not to stick with the same connection, as you would always be talking to the same coordinator.
Secondly, if you are multi-threading, you want to make sure a connection stays with the same thread for the duration of each query to Cassandra. Otherwise you may end up with race conditions where one thread updates resources of a connection that is already in use (e.g. trying to send query packets to the server while the connection is still waiting for a response from the server).
I recommend implementing a thread-safe connection pool: open several connections when your application starts, pick one at random for each request, and close them when your application server stops. Consider changing rpc_server_type in Cassandra if you have hardware constraints.

The DataStax C# driver already provides connection pooling internally, so the recommended way to use the driver is one Cluster instance per C* cluster and one Session per keyspace. Basically, initialize these at the startup of your app (you can also prepare your PreparedStatements at this point), reuse them across requests, and close them for cleanup when stopping the app (for upgrading, etc.).
I'd strongly recommend you to give a quick read of the C# driver docs. It shouldn't take you long and you'll know more about what's included in the driver.
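To illustrate the "initialize once, reuse across requests" advice above, here is a minimal sketch. It deliberately does not use the DataStax API: FakeSession is a hypothetical stand-in for the driver's session object, and SessionHolder and RegistrationDemo are made-up names. It only shows that ten concurrent "registrations" can share one lazily created instance, assuming (as the answer above implies) that the real session is safe to share.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the driver's session object; NOT the DataStax API.
public sealed class FakeSession
{
    private static int _created;

    // Number of FakeSession instances constructed so far.
    public static int CreatedCount => Volatile.Read(ref _created);

    public FakeSession()
    {
        Interlocked.Increment(ref _created);
    }
}

public static class SessionHolder
{
    // Lazy<T> with a factory defaults to ExecutionAndPublication mode,
    // so the factory runs at most once even when many requests race.
    private static readonly Lazy<FakeSession> LazySession =
        new Lazy<FakeSession>(() => new FakeSession());

    public static FakeSession Session => LazySession.Value;
}

public static class RegistrationDemo
{
    // Simulates 'count' concurrent requests that all grab the shared session.
    public static FakeSession[] RunConcurrentRequests(int count)
    {
        var seen = new FakeSession[count];
        Parallel.For(0, count, i => seen[i] = SessionHolder.Session);
        return seen;
    }
}
```

With a real driver, the factory lambda would be where the cluster and session get built, and application shutdown would be where they get disposed.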

Related

Proper way to deal with database connectivity issue

I am getting the error below when trying to connect to the database:
A network-related or instance-specific error occurred while
establishing a connection to SQL Server. The server was not found or
was not accessible. Verify that the instance name is correct and that
SQL Server is configured to allow remote connections. (provider: Named
Pipes Provider, error: 40 - Could not open a connection to SQL Server)
I get this error only some of the time. For example, when I run my program for the first time, it opens the connection successfully; when I run it a second time, I get this error; and when I run it again a moment later, I don't get the error.
When I connect to the same database server through SSMS, I can connect successfully; I only get this network issue in my program.
The database is not on my local machine. It's on Azure.
I don't get this error with my local database.
Code :
public class AddOperation
{
    public void Start()
    {
        using (var processor = new MyProcessor())
        {
            for (int i = 0; i < 2; i++)
            {
                if (i == 0)
                {
                    var connection = new SqlConnection("Connection string 1");
                    processor.Process(connection);
                }
                else
                {
                    var connection = new SqlConnection("Connection string 2");
                    processor.Process(connection);
                }
            }
        }
    }
}
public class MyProcessor : IDisposable
{
    public void Process(DbConnection cn)
    {
        using (var cmd = cn.CreateCommand())
        {
            cmd.CommandText = "query";
            cmd.CommandTimeout = 1800;
            cn.Open(); // Sometimes works, sometimes doesn't
            using (var reader = cmd.ExecuteReader(CommandBehavior.CloseConnection))
            {
                //code
            }
        }
    }

    // IDisposable requires a Dispose method; this snippet has nothing to release.
    public void Dispose()
    {
    }
}
So I am unsure about two things:
1) ConnectionTimeout: Should I increase the connection timeout, and will that solve my intermittent connection problem?
2) Retry attempt policy: Should I implement a connection retry mechanism like the one below?
public static void OpenConnection(DbConnection cn, int maxAttempts = 1)
{
    int attempts = 0;
    while (true)
    {
        try
        {
            cn.Open();
            return;
        }
        catch
        {
            attempts++;
            if (attempts >= maxAttempts) throw;
        }
    }
}
I can't decide between these two options.
Can anybody suggest a better way to deal with this problem?
Use a recent version of the .NET Framework (4.6.1 or later) and take advantage of the built-in resiliency features:
ConnectRetryCount, ConnectRetryInterval and Connection Timeout.
See the documentation for more info: https://learn.microsoft.com/en-us/azure/sql-database/sql-database-connectivity-issues#net-sqlconnection-parameters-for-connection-retry
All applications that communicate with remote service are sensitive to transient faults.
As mentioned in other answers, if your client program connects to SQL Database by using the .NET Framework class System.Data.SqlClient.SqlConnection, use .NET 4.6.1 or later (or .NET Core) so that you can use its connection retry feature.
When you build the connection string for your SqlConnection object, coordinate the values among the following parameters:
ConnectRetryCount:  Default is 1. Range is 0 through 255.
ConnectRetryInterval:  Default is 10 seconds. Range is 1 through 60.
Connection Timeout:  Default is 15 seconds. Range is 0 through 2147483647.
Specifically, your chosen values should make the following equality true:
Connection Timeout = ConnectRetryCount * ConnectRetryInterval
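As a rough worked example of coordinating these three values: with ConnectRetryCount = 3 and ConnectRetryInterval = 10, the product is 30, so a Connection Timeout of 30 seconds leaves room for every retry. The sketch below is only an illustration: RetrySettings and its methods are made-up helper names, the server and database names in the usage note are placeholders, and treating the rule as a lower bound that never drops below the 15-second default is my assumption rather than something the text above states.

```csharp
using System;

public static class RetrySettings
{
    // Default Connection Timeout in seconds, as listed above.
    private const int DefaultConnectionTimeoutSeconds = 15;

    // Suggests a Connection Timeout (seconds) that covers count * interval.
    // Assumption: never go below the 15-second default, which also avoids
    // emitting 0 (a timeout of 0 means "wait indefinitely").
    public static int SuggestedConnectionTimeout(int connectRetryCount, int connectRetryIntervalSeconds)
    {
        if (connectRetryCount < 0 || connectRetryCount > 255)
            throw new ArgumentOutOfRangeException(nameof(connectRetryCount));
        if (connectRetryIntervalSeconds < 1 || connectRetryIntervalSeconds > 60)
            throw new ArgumentOutOfRangeException(nameof(connectRetryIntervalSeconds));

        return Math.Max(DefaultConnectionTimeoutSeconds,
                        connectRetryCount * connectRetryIntervalSeconds);
    }

    // Appends the three coordinated keywords to an existing connection string.
    public static string AppendRetryKeywords(string baseConnectionString, int count, int intervalSeconds)
    {
        int timeout = SuggestedConnectionTimeout(count, intervalSeconds);
        return baseConnectionString.TrimEnd(';')
            + ";ConnectRetryCount=" + count
            + ";ConnectRetryInterval=" + intervalSeconds
            + ";Connection Timeout=" + timeout;
    }
}
```

For example, RetrySettings.AppendRetryKeywords("Server=tcp:example.database.windows.net;Database=mydb;", 3, 10) appends ConnectRetryCount=3, ConnectRetryInterval=10 and Connection Timeout=30 to the placeholder connection string.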
Now, coming to option 2: when your app has custom retry logic, it multiplies the total number of attempts, because each custom retry will itself try ConnectRetryCount times. E.g. if ConnectRetryCount = 3 and custom retry = 5, it will make 15 attempts. You might not need that many retries.
If you only consider custom retry vs. Connection Timeout:
A connection timeout usually occurs because of a lossy network (one with higher packet loss, e.g. cellular or weak WiFi) or high traffic load. It's up to you to choose the best strategy among them.
Below guidelines would be helpful to troubleshoot transient errors:
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-connectivity-issues
https://learn.microsoft.com/en-in/azure/architecture/best-practices/transient-faults
As you can read here, retry logic is recommended even for SQL Server installed on an Azure VM (IaaS).
FAULT HANDLING: Your application code includes retry logic and
transient fault handling? Including proper retry logic and transient
fault handling remediation in the code should be a universal best
practice, both on-premises and in the cloud, either IaaS or PaaS. If
this characteristic is missing, application problems may raise on both
Azure SQLDB and SQL Server in Azure VM, but in this scenario the
latter is recommended over the former.
An incremental retry logic is recommended.
There are two basic approaches to instantiating the objects from the application block that your application requires. In the first approach, you can explicitly instantiate all the objects in code, as shown in the following code snippet:
var retryStrategy = new Incremental(5, TimeSpan.FromSeconds(1),
    TimeSpan.FromSeconds(2));
var retryPolicy =
    new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(retryStrategy);
In the second approach, you can instantiate and configure the objects from configuration data as shown in the following code snippet:
// Load policies from the configuration file.
// SystemConfigurationSource is defined in
// Microsoft.Practices.EnterpriseLibrary.Common.
using (var config = new SystemConfigurationSource())
{
    var settings = RetryPolicyConfigurationSettings.GetRetryPolicySettings(config);
    // Initialize the RetryPolicyFactory with a RetryManager built from the
    // settings in the configuration file.
    RetryPolicyFactory.SetRetryManager(settings.BuildRetryManager());
    var retryPolicy = RetryPolicyFactory.GetRetryPolicy
        <SqlDatabaseTransientErrorDetectionStrategy>("Incremental Retry Strategy");
    ...
    // Use the policy to handle the retries of an operation.
}
For more information, please visit this documentation.
Consider using Polly.
You could use a simple piece of code like this:
RetryPolicy retryPolicy = Policy.Handle<Exception>()
    .WaitAndRetry(3, retryAttempt =>
        TimeSpan.FromSeconds(retryAttempt));
var result = retryPolicy.Execute(() => someClass.DoSomething());
This will retry the request up to three times, waiting 1, 2, and 3 seconds before the respective retries.
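If you would rather not pull in a library, the same wait-and-retry idea can be sketched with the base class library alone. This is only a rough illustration, not Polly's or Enterprise Library's implementation: Retry.Execute, its parameters, and the isTransient callback are made-up names, and deciding which exceptions count as transient is left to the caller.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs 'action'. On a transient failure it waits 1x, 2x, 3x ... baseDelay
    // and tries again. 'maxRetries' counts retries after the first attempt,
    // so the action runs at most maxRetries + 1 times.
    public static T Execute<T>(
        Func<T> action,
        int maxRetries,
        TimeSpan baseDelay,
        Func<Exception, bool> isTransient)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return action();
            }
            // The filter lets the last failure (or a non-transient one)
            // propagate to the caller with its original stack trace.
            catch (Exception ex) when (attempt < maxRetries && isTransient(ex))
            {
                Thread.Sleep(TimeSpan.FromTicks(baseDelay.Ticks * (attempt + 1)));
            }
        }
    }
}
```

Unlike the OpenConnection loop in the question, this waits between attempts, which gives a transient network fault some time to clear. Keep in mind the earlier point that custom retries multiply with ConnectRetryCount.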
It is entirely possible for a connection to drop; see the "Fallacies of Distributed Computing" :).
It could be a network connectivity issue, and it could be at either end.
I would recommend the following (assuming the Azure firewall is open for your machine):
Ping the server and see if there is any packet loss.
ping (server).database.windows.net
tracert
telnet can also be your friend.
The above three should help you pinpoint where the problem is.
I think your retry logic is fine.
Regarding your questions:
Increase Timeout
Only if you are sure that your query will take a long time. If you have to increase the timeout for a simple insert, the problem is probably network connectivity.
Retry Logic
As already posted, it's now part of the framework, which you can utilise, or the one you created should be fine. Ideally, it's good to have retry logic even if you are sure about connectivity and speed. Just in case :)
You should increase the timeout, because establishing a connection to a SQL Server involves many steps, so it takes some time when the connection is established for the first time. Once established, the connection is pooled in memory for reuse by subsequent queries.
Please refer to the link below for a more detailed explanation of connection pooling:
https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql-server-connection-pooling
As you mentioned, this error occurs only sometimes, not always, so network and connectivity factors may be involved. The default timeout for a SQL connection is 15 seconds. I think if you change it to 30 seconds, it should work.
Are you using SQL Express or Workgroup Edition? If so, it's possible that the server is too busy to respond.
To rule out network problems, from a command prompt, do a PING -t SqlServername. Does every ping come back, or are some lost? This can be an indicator of network interruptions that might also cause this error, like a faulty switch. If they are all lost then (given that your database connection sometimes works) it is likely that ping is being blocked by a firewall somewhere: it may help diagnosis if you find that block and temporarily unblock it.
The error message indicates that you are using Named pipes. Are you using Named pipes on purpose? For most scenarios (including Azure database) I'd suggest enabling TCP/IP and disabling Named Pipes, in SQL Server Configuration Manager.
Depending on how 'far away' your Azure database is, the delays caused by routers and firewalls sometimes upset Kerberos and/or related timings. You can overcome this by specifying the port in the connection string, which avoids the round trip to port 1434 to enumerate the instance. I assume you're already using an FQDN. For example: server\instance,port

C# SqlConnections using up entire connection pool

I've written a service that occasionally has to poll a database very often. Usually I'd create a new SqlConnection with the SqlDataAdapter and fire away like this:
var table = new DataTable();
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var adapter = new SqlDataAdapter(selectStatement, connection))
    {
        adapter.Fill(table);
    }
}
However in a heavy load situation (which occurs maybe once a week), the service might actually use up the entire connection pool and the service records the following exception.
System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
Multiple threads in the service have to access the SQL Server for various queries, and I'd like as many of them as possible to run in parallel (which obviously works too well sometimes).
I thought about several possible solutions:
I thought about increasing the connection pool size; however, that might just delay the problem.
Then I thought about using a single connection for the service and keeping it open for as long as the service runs. That might be a simple option, but it would keep the connection open even when there is no work to be done, and I would have to handle connection resets by the server etc., the effect of which I do not know.
Lastly, I thought about implementing my own kind of pool that manages the number of concurrent connections and keeps threads on hold until there is a free slot.
What would be the recommended procedure or is there a best practice way of handling this?
Well, the solution in the end was not exactly ideal (the ideal would have been fixing the issue on the SQL Server side), so I ended up checking the number of concurrent connections in the job queuing system.
The service will now not create another thread for document generation unless it can guarantee that a connection from the pool is actually available. The bottleneck on the SQL Server is still in place; however, the service no longer generates exceptions.
The downside, of course, is that the queue gets longer while some blocking query is executing on the SQL Server, which might delay document generation for a minute or two. So it isn't an ideal solution but a workable one, since the delay isn't critical: the documents aren't needed right away but are stored for archival purposes.
The better solution would have been to fix it on the SQL Server side.
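For anyone wanting to try the "hold work until a slot is free" idea, here is a minimal sketch built on SemaphoreSlim. It is not the code from the service described above: ConnectionGate and its members are hypothetical names, and the cap you pass in is assumed to be somewhat below the Max Pool Size of your connection string. The database work itself (open, Fill, dispose) would go inside the delegate.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Caps how many callers may be inside the guarded section at once.
public sealed class ConnectionGate
{
    private readonly SemaphoreSlim _slots;
    private readonly object _sync = new object();
    private int _inFlight;
    private int _peak;

    public ConnectionGate(int maxConcurrent)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    // Highest number of callers observed inside the gate at the same time.
    public int PeakInFlight
    {
        get { lock (_sync) { return _peak; } }
    }

    // Waits (without blocking a thread) for a free slot, runs the work,
    // and always gives the slot back, even if the work throws.
    public async Task<T> RunAsync<T>(Func<Task<T>> work)
    {
        await _slots.WaitAsync().ConfigureAwait(false);
        try
        {
            lock (_sync)
            {
                _inFlight++;
                if (_inFlight > _peak) _peak = _inFlight;
            }
            return await work().ConfigureAwait(false);
        }
        finally
        {
            lock (_sync) { _inFlight--; }
            _slots.Release();
        }
    }
}
```

The queue still grows while SQL Server is blocked, exactly as described above; the gate only turns pool-exhaustion exceptions into waiting.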

Properly shutting down MongoDB database connection from C# 2.1 driver?

I am just getting started with integrating MongoDB into my application and I have run into a few questions. In my application I am using the newest 2.1 version of the MongoDB C# driver, and I use MongoDB only for application logging.
Currently, before showing my main application Form, I first check whether mongod.exe is running and start it if it is not. Then, when my main Form is shown, it opens a connection to the database, as seen below.
public void Open()
{
    Client = new MongoClient("mongodb://localhost:27017");
    Database = Client.GetDatabase(DBName);
    Collection = Database.GetCollection<BsonDocument>(ColName);
}
My question is how I should properly shut down this connection when my application is closing.
Also, are there any considerations I should take into account in leaving mongod.exe running versus exiting it each time the application closes?
I have searched a few times for a proper way to shut down the connection but have found nothing very specific. There is an old SO post (that I can't seem to find now) mentioning a .Dispose method, though I cannot find it in the newest driver or in my IDE's autocomplete.
As of today's version of MongoDB (v2.0.1.27 for MongoDB.Driver), there's no need to close or dispose of connections. The client handles it automatically.
From the docs:
A MongoClient object will be the root object. It is thread-safe and is all that is needed to handle connecting to servers, monitoring servers, and performing operations against those servers.
[...]
It is recommended to store a MongoClient instance in a global place, either as a static variable or in an IoC container with a singleton lifetime. However, multiple MongoClient instances created with the same settings will utilize the same connection pools underneath.
There's a partial/old list of thread-safe MongoDB classes in this SO answer.
The question seems to have already been asked here: When should I be opening and closing MongoDB connections?
If its accepted answer,
I would leave the connection open as re-creating the connection is
costly. Mongo is fine with lots of connections, open for a long time.
What you ideally should do is to share the connection with all parts
of your application as a persistent connection. The C# driver should
be clever enough to do this itself, so that it does not create too
many connections, as internally it uses "connection pooling" that
makes it even re-use connections. The docs say: "The connections to
the server are handled automatically behind the scenes (a connection
pool is used to increase efficiency)."
works for you, then all well and good. Even the MongoDB C# driver's quick tour page gives the same advice:
Typically you only create one MongoClient instance for a given cluster
and use it across your application. Creating multiple MongoClients
will, however, still share the same pool of connections if and only if
the connection strings are identical.
Otherwise, I think you can simply put your call to create the connection in a using(){} code block. It automatically calls the dispose method for you (as it implements the IDisposable pattern). You should use this block for any resource you want disposed.
In my experience, the correct way is as answered above, but even when following these recommendations I was still getting random EndOfStreamException errors. It seems that some of these problems are caused by the internet provider closing the connection after some time.
I solved it by adding:
MongoClientSettings settings = MongoClientSettings.FromUrl(new MongoUrl(connectionString));
settings.SslSettings = new SslSettings() { EnabledSslProtocols = SslProtocols.Tls12 };
settings.MaxConnectionIdleTime = TimeSpan.FromSeconds(30);

Should SqlConnections be long or shortlived with EF

I have a few DbContexts that connect to the same database in the same application.
I noticed that EF6 has a new constructor: https://msdn.microsoft.com/en-us/library/gg696604(v=vs.113).aspx
My question is this: let's say I hook up my DI framework to create a SqlConnection for each request and pass it to each of the DbContexts via this new constructor. Would that be the correct way to go about it? Or should the SQL connection be long-lived rather than per request?
public async Task<SqlConnection> GetOpenedConnectionAsync()
{
    _connection = new SqlConnection(_config.AscendDataBaseConnectionString);
    await _connection.OpenAsync(_cancel.Token);
    return _connection;
}
Should the above be registered with application lifetime or with request lifetime?
It depends on your use case, but in general I would strongly discourage singleton scope.
Generally the cost of creating a new connection and tearing it down is low, unless there is a long packet delay between the server and the DB (e.g. mobile); if the servers are close, it is < 5 ms.
Let's say you have one database used by a thousand servers (load balancing or whatever). If all those servers always kept a connection open, you might run into issues, but if each one opened and closed connections as and when needed, it would probably work.
If you have one database and one or two servers, you could use a single connection (to save a small amount of time per request), but there are pitfalls and I would HIGHLY discourage it because:
If you open a transaction, no other query will be able to run until that transaction finishes, as there can only be one transaction per connection at any time. E.g. if user A tries to list all customers (which takes 5 seconds), no other query can run until all the customers have come back.
If a transaction gets opened and for whatever reason does not commit, you will basically lose all connectivity to the database until that transaction gets rolled back or committed, which may or may not happen.

Establish remote SSL connection after or before local user connection for SSL wrapper?

I'm trying to make a stunnel clone in C# just for fun. The main loop goes something like this (ignore the catch-everything-and-do-nothing try-catches just for now)
ServicePointManager.ServerCertificateValidationCallback = Validator;
TcpListener a = new TcpListener (9999);
a.Start ();
while (true) {
    Console.Error.WriteLine ("Spinning...");
    try {
        TcpClient remote = new TcpClient ("XXX.XX.XXX.XXX", 2376);
        SslStream ssl = new SslStream(remote.GetStream(), false, new RemoteCertificateValidationCallback(Validator));
        ssl.AuthenticateAsClient("mirai.ca");
        TcpClient user = a.AcceptTcpClient ();
        new Thread (new ThreadStart(() => {
            Thread.CurrentThread.IsBackground = true;
            try{
                forward(user.GetStream(), ssl); //forward is a blocking function I wrote
            }catch{}
        })).Start ();
    } catch {
        Thread.Sleep (1000);
    }
}
I found that if I do the remote SSL connection, as I did, before waiting for the user, then when the user connects the SSL is already set up (this is for tunneling HTTP so latency is pretty important). On the other hand, my server closes long-inactive connections, so if no new connection happens in, say, 5 minutes, everything locks up.
What is the best way?
Also, I observe my program generating as many as 200 threads, which of course means that context-switching overhead is pretty big and sometimes results in the whole thing blocking for seconds, even with just one user tunneling through the program. My forward function looks, in essence, like this:
new Thread(new ThreadStart(()=>in.CopyTo(out))).Start();
out.CopyTo(in);
plus, of course, lots of error handling to prevent broken connections from hanging forever. This seems to stall a lot, though. I can't figure out how to use asynchronous methods like BeginRead, which should help according to Google.
For any kind of proxy server (including an stunnel clone), opening the backend connection after you accept the frontend connection is clearly much simpler to implement.
If you pre-open backend connections in anticipation of receiving frontend connections, you can certainly save an RTT (which is good for latency), but you have to deal with the issue you hinted at: the backend will close idle connections. Any time you receive a frontend connection, you run the risk that the backend connection you are about to associate with it was opened some time ago, is too old to use, and may be closed by the backend. You will have to manage a pool of currently open backend connections and periodically close and refresh them when they have been idle for too long. There is even a race condition: if the backend decides a connection has been idle too long and closes it just as the proxy server receives a new frontend connection, the proxy may forward a request through the backend connection while the backend is closing it. That means you must know a priori how long backend connections can be idle before the backend closes them (i.e. the timeout values configured on the backend), so you can give them up just before the backend decides they are too old.
So in summary: pre-opening backend connections will save an RTT versus opening them only on demand, but it is a lot of work, including subtle connection pool management that is quite tough to implement bug-free. It is up to you to judge whether the extra complexity is worth it.
By the way, concerning your comment about handling several hundred simultaneous connections, I recommend building an I/O-bound program such as a proxy server around an event loop instead of around threads. Basically, you use non-blocking sockets and process events in a single thread (e.g. "this socket has new data waiting to be forwarded to the other side") instead of spawning a thread for each connection (which can get expensive both in thread creation and in context switches). To scale such an event-based model to multiple CPU cores, you can start a small number of parallel threads or processes (more or less one per CPU core), each of which handles many hundreds (or thousands) of simultaneous connections.
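In .NET, the closest everyday equivalent to the event-loop approach is asynchronous stream I/O, where no thread is parked per connection while a read is pending. The following is only a sketch of how the forward function from the question might look with Stream.CopyToAsync. Forwarder, PumpAsync and ForwardAsync are made-up names, and closing both sides as soon as either direction ends is a simplifying assumption that ignores half-closed connections.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class Forwarder
{
    // Copies one direction until the source reports end of stream.
    // No dedicated thread is blocked while a read is pending.
    public static Task PumpAsync(Stream source, Stream destination)
    {
        return source.CopyToAsync(destination);
    }

    // Relays both directions between the local user and the remote end.
    public static async Task ForwardAsync(Stream user, Stream remote)
    {
        Task upstream = PumpAsync(user, remote);
        Task downstream = PumpAsync(remote, user);

        // As soon as either direction ends (or fails), tear down both sides.
        await Task.WhenAny(upstream, downstream).ConfigureAwait(false);
        user.Dispose();
        remote.Dispose();

        try
        {
            await Task.WhenAll(upstream, downstream).ConfigureAwait(false);
        }
        catch (IOException)
        {
            // Expected when a stream is closed under a pending copy.
        }
        catch (ObjectDisposedException)
        {
            // Likewise expected after the Dispose calls above.
        }
    }
}
```

The accept loop could then start ForwardAsync(user.GetStream(), ssl) without creating a thread, which should keep the thread count close to what the thread pool needs rather than two threads per tunnel.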
