Inconsistent behaviour with distributed transaction (MSDTC exception)

You may think this has been asked before, but hear me out first.
I have a project that runs across different servers with remote connections.
On the development server, all tests pass.
On the QA server, which has the same configuration, most tests pass but a couple fail with this exception:
The underlying provider failed on Open.
Inner exception:
Network access for Distributed Transaction Manager (MSDTC) has been
disabled.
Please enable DTC for network access in the security configuration for
MSDTC using the Component Services Administrative tool.
Inner inner exception:
The transaction manager has disabled its support for remote/network transactions.
(Exception from HRESULT: 0x8004D024)
As far as I can see, the configuration is correct. In fact, the tests that pass run the exact same code without issues, so this does not look like an MSDTC configuration problem.
The code that causes this uses an Entity Framework context to add a single row to a database table:
using (var inbound = new InBound.InBound())
{
    var now = DateTime.Now;
    var inMessage = inbound.InMessages.Create();
    inMessage.Content = message.ToXmlFormattedString();
    inMessage.Created = now;
    inMessage.SerializedVersion = version;
    inMessage.Updated = now;
    inbound.InMessages.Add(inMessage);
    inbound.SaveChanges(); // Exception here
}
This same piece of code does its job in some tests but fails in a few others.
The problem only occurs in the QA environment. All MSDTC-related configuration has already been checked and appears correct. The DEV environment has the same configuration and the error doesn't happen there.
The environments are identical: Windows Server 2012 R2, MS SQL Server 2008, same codebase, same settings.
The XML in the message is the same in both the test that fails and the test that passes.
I've exhausted the list of places I can look for a solution, and there seems to be no logical explanation for this.
Any ideas?
UPDATE
If I add "Enlist=false" to the connection string, the tests pass again.
However, I then get an inner exception that gets swallowed:
Cannot enlist in the transaction because a local transaction is in
progress on the connection. Finish local transaction and retry.
so the connection must be open somehow, even though during debugging the state of the database connection is "closed".
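One way to narrow this down is to check whether the failing tests run inside an ambient System.Transactions transaction, since that is what a connection enlists in when Enlist is left at its default. The following is a minimal sketch, not a fix; the class name AmbientTransactionProbe and its method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Transactions;

public static class AmbientTransactionProbe
{
    // Returns a short description of the ambient transaction, if there is one.
    public static string Describe()
    {
        var current = Transaction.Current;
        if (current == null)
        {
            return "no ambient transaction";
        }

        TransactionInformation info = current.TransactionInformation;
        // DistributedIdentifier stays Guid.Empty until the transaction
        // has been promoted to a distributed (MSDTC) transaction.
        bool promoted = info.DistributedIdentifier != Guid.Empty;
        return string.Format(
            "ambient transaction {0}, status {1}, promoted to DTC: {2}",
            info.LocalIdentifier, info.Status, promoted);
    }
}
```

Writing the result of AmbientTransactionProbe.Describe() to the test output just before SaveChanges() in a passing test and in a failing test may show whether only the failing ones are wrapped in a TransactionScope.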

Related

Can't initialize local SQL Server database with EFCore commandline tools

I've been trying to follow several different tutorials with EFCore and .net core and I've been totally blocked at the point where I try and create a local database.
I've used both the powershell tools and the commandline tools to try and create an initial migration (or do anything, really).
I consistently get the error:
System.InvalidOperationException: An exception has been raised that is likely due to a transient failure. Consider enabling transient error resiliency by adding 'EnableRetryOnFailure()' to the 'UseSqlServer' call.
---> Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: Named Pipes Provider, error: 0 - No process is on the other end of the pipe.)
The database does not currently exist on the system, though local SQL Server appears to be up and running.
Here is the c# code for adding the context:
services.AddDbContextPool<TestDbContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("TestDb")));
This is the connection string code:
"TestDb": "Data Source=(localdb)\\MSSQLLocalDB;Initial Catalog=TestDb"
I get similar errors whether I run add-migration, dotnet ef migrations add, or dotnet ef dbcontext info. (Note: with the dotnet calls I am using the -s ..\{webproject}\{webproject}.csproj option.)
I've also messed with the connection string by adding various combinations of Trusted_Connection=True; MultipleActiveResultSets=True;, and Integrated Security=true.
I've gone into SSMS and ensured the Server authentication is SQL Server and Windows Authentication Mode and that Maximum Connections is set to 0 (unlimited). I've also gone to logins and tried adding the user to pretty much all the server roles.
So, yeah, I'm pretty confused. I've worked with EF for years, though this is my first experience with EFCore and I'm definitely more of a developer than a SQL Admin. This is also my first time trying to use the local db on this particular computer.
Edit: Looking at error.log in AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\mssqllocaldb I see this error:
2020-01-28 10:15:03.50 Logon Error: 18456, Severity: 14, State: 38.
2020-01-28 10:15:03.50 Logon Login failed for user 'LAPTOP-NC6HQ4TB\ripli'. Reason: Failed to open the explicitly specified database 'TestDb'. [CLIENT: <named pipe>]
Which is confusing. Of course I can't open the specified database. The entire point is I want to create a DB that doesn't yet exist.

Found the answer. Sorry to everyone who tried to help, as you wouldn't have had enough information to solve it.
In the DbContext I had added some code to the constructor to populate the database with some data as part of a test. This caused several problems. If the database hadn't yet been created, the constructor tried to connect to it before it existed, which caused the problems I described.
Furthermore, if I had created the DB manually, it would try to access the DbSets (which had not yet been created) and then complain that the set name was invalid (which, at that point, it was).
This all might have been fine if the DB had been created in advance, but since I was using the DbContext to construct the database, it understandably caused problems.
And all of this headache would have been avoided had I not violated SRP and not tried to (even temporarily) hijack a context constructor to hack in some test data.
The takeaway here? Don't pollute your constructors with unrelated hacks. Bleh.

Proper way to deal with database connectivity issue

I am getting the error below when trying to connect to the database:
A network-related or instance-specific error occurred while
establishing a connection to SQL Server. The server was not found or
was not accessible. Verify that the instance name is correct and that
SQL Server is configured to allow remote connections. (provider: Named
Pipes Provider, error: 40 - Could not open a connection to SQL Server)
I get this error only some of the time. For example, when I run my program for the first time, it opens the connection successfully; when I run it a second time, I get this error; and when I run it again a moment later, I don't get the error.
When I connect to the same database server through SSMS, I can connect successfully; I only get this network issue in my program.
The database is not local. It's on Azure.
I don't get this error with my local database.
Code :
public class AddOperation
{
    public void Start()
    {
        using (var processor = new MyProcessor())
        {
            for (int i = 0; i < 2; i++)
            {
                if (i == 0)
                {
                    var connection = new SqlConnection("Connection string 1");
                    processor.Process(connection);
                }
                else
                {
                    var connection = new SqlConnection("Connection string 2");
                    processor.Process(connection);
                }
            }
        }
    }
}
public class MyProcessor : IDisposable
{
    public void Process(DbConnection cn)
    {
        using (var cmd = cn.CreateCommand())
        {
            cmd.CommandText = "query";
            cmd.CommandTimeout = 1800;
            cn.Open(); // Sometimes works, sometimes doesn't
            using (var reader = cmd.ExecuteReader(CommandBehavior.CloseConnection))
            {
                // code
            }
        }
    }

    // Nothing to release here; present so the class satisfies IDisposable.
    public void Dispose()
    {
    }
}
So I am unsure about two things:
1) Connection timeout: Should I increase the connection timeout, and will that solve my intermittent connection problem?
2) Retry policy: Should I implement a connection retry mechanism like the one below?
public static void OpenConnection(DbConnection cn, int maxAttempts = 1)
{
    int attempts = 0;
    while (true)
    {
        try
        {
            cn.Open();
            return;
        }
        catch
        {
            attempts++;
            if (attempts >= maxAttempts) throw;
        }
    }
}
I can't decide between these two options.
Can anybody suggest which would be the better way to deal with this problem?

Use a newer version of .NET (4.6.1 or later) and take advantage of the built-in resiliency features:
ConnectRetryCount, ConnectRetryInterval and Connection Timeout.
See the documentation for more info: https://learn.microsoft.com/en-us/azure/sql-database/sql-database-connectivity-issues#net-sqlconnection-parameters-for-connection-retry

All applications that communicate with remote services are sensitive to transient faults.
As mentioned in other answers, if your client program connects to SQL Database by using the .NET Framework class System.Data.SqlClient.SqlConnection, use .NET 4.6.1 or later (or .NET Core) so that you can use its connection retry feature.
When you build the connection string for your SqlConnection object, coordinate the values among the following parameters:
ConnectRetryCount: Default is 1. Range is 0 through 255.
ConnectRetryInterval: Default is 10 seconds. Range is 1 through 60.
Connection Timeout: Default is 15 seconds. Range is 0 through 2147483647.
Specifically, your chosen values should make the following equality true:
Connection Timeout = ConnectRetryCount * ConnectRetryInterval
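As an illustration of that equality, here is a small sketch that derives Connection Timeout from the two retry settings. It uses the provider-independent DbConnectionStringBuilder from System.Data.Common; the class name RetryConnectionString is hypothetical, and the lower bound of 1 on the retry count is my own assumption, made so the computed timeout is never 0 (which means "wait indefinitely").

```csharp
using System;
using System.Data.Common;

public static class RetryConnectionString
{
    // Adds retry settings to a connection string and sets
    // Connection Timeout = ConnectRetryCount * ConnectRetryInterval.
    public static string Build(string baseConnectionString, int retryCount, int retryIntervalSeconds)
    {
        // Assumption: require at least one retry so the timeout is not 0.
        if (retryCount < 1 || retryCount > 255)
            throw new ArgumentOutOfRangeException(nameof(retryCount));
        if (retryIntervalSeconds < 1 || retryIntervalSeconds > 60)
            throw new ArgumentOutOfRangeException(nameof(retryIntervalSeconds));

        var builder = new DbConnectionStringBuilder { ConnectionString = baseConnectionString };
        builder["ConnectRetryCount"] = retryCount;
        builder["ConnectRetryInterval"] = retryIntervalSeconds;
        builder["Connection Timeout"] = retryCount * retryIntervalSeconds;
        return builder.ConnectionString;
    }
}
```

For example, Build(existingConnectionString, 3, 10) yields a connection string with a 30 second connection timeout.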
Now, coming to option 2: when your app has custom retry logic, it increases the total number of retries, because each custom retry will itself try ConnectRetryCount times. For example, if ConnectRetryCount = 3 and custom retry = 5, it will attempt 15 tries. You might not need that many retries.
If you only consider custom retry vs. Connection Timeout:
A connection timeout usually occurs because of a lossy network, i.e. a network with high packet loss (e.g. cellular or weak WiFi), or because of high traffic load. It's up to you to choose the best strategy among them.
The guidelines below should help with troubleshooting transient errors:
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-connectivity-issues
https://learn.microsoft.com/en-in/azure/architecture/best-practices/transient-faults

As you can read here a retry logic is recommended even for a SQL Server installed on an Azure VM (IaaS).
FAULT HANDLING: Your application code includes retry logic and
transient fault handling? Including proper retry logic and transient
fault handling remediation in the code should be a universal best
practice, both on-premises and in the cloud, either IaaS or PaaS. If
this characteristic is missing, application problems may raise on both
Azure SQLDB and SQL Server in Azure VM, but in this scenario the
latter is recommended over the former.
An incremental retry logic is recommended.
There are two basic approaches to instantiating the objects from the application block that your application requires. In the first approach, you can explicitly instantiate all the objects in code, as shown in the following code snippet:
var retryStrategy = new Incremental(5, TimeSpan.FromSeconds(1),
    TimeSpan.FromSeconds(2));
var retryPolicy =
    new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(retryStrategy);
In the second approach, you can instantiate and configure the objects from configuration data as shown in the following code snippet:
// Load policies from the configuration file.
// SystemConfigurationSource is defined in
// Microsoft.Practices.EnterpriseLibrary.Common.
using (var config = new SystemConfigurationSource())
{
    var settings = RetryPolicyConfigurationSettings.GetRetryPolicySettings(config);
    // Initialize the RetryPolicyFactory with a RetryManager built from the
    // settings in the configuration file.
    RetryPolicyFactory.SetRetryManager(settings.BuildRetryManager());
    var retryPolicy = RetryPolicyFactory.GetRetryPolicy
        <SqlDatabaseTransientErrorDetectionStrategy>("Incremental Retry Strategy");
    ...
    // Use the policy to handle the retries of an operation.
}
For more information, please visit this documentation.

Consider using Polly.
You could use a simple piece of code like this:
RetryPolicy retryPolicy = Policy.Handle<Exception>()
    .WaitAndRetry(3, retryAttempt =>
        TimeSpan.FromSeconds(retryAttempt));
var result = retryPolicy.Execute(() => someClass.DoSomething());
This will retry the request up to three times.
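If adding a package is not an option, roughly the same wait-and-retry behaviour can be hand-rolled. The sketch below is not Polly's implementation; SimpleRetry and its parameters are hypothetical names, and the isTransient predicate is an assumption added so that only errors you consider retryable are retried.

```csharp
using System;
using System.Threading;

public static class SimpleRetry
{
    // Runs the operation once and then retries it up to retryCount more times,
    // sleeping delayForAttempt(n) before the n-th retry.
    public static T Execute<T>(
        Func<T> operation,
        int retryCount,
        Func<int, TimeSpan> delayForAttempt,
        Func<Exception, bool> isTransient)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            // Once retries are used up, or the error is not transient,
            // the filter is false and the exception propagates to the caller.
            catch (Exception ex) when (attempt <= retryCount && isTransient(ex))
            {
                Thread.Sleep(delayForAttempt(attempt));
            }
        }
    }
}
```

A call such as SimpleRetry.Execute(() => someClass.DoSomething(), 3, n => TimeSpan.FromSeconds(n), ex => ex is TimeoutException) waits 1, 2 and 3 seconds between attempts, mirroring the delays in the snippet above.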

It is entirely possible for a connection to drop. See the "Fallacies of Distributed Computing" :).
It could be a network connectivity issue, and it could be at either end.
I would recommend the following (assuming the firewall is enabled for your machine on Azure):
Ping the server and see if there is any loss.
ping (server).database.windows.net
tracert
telnet can also be your friend.
The above three should help you pinpoint where the problem is.
I think your retry logic is fine.
Regarding your questions:
Increase Timeout
Only if you are sure that your query will take a long time. If you have to increase the timeout for a simple insert, the problem could be network connectivity.
Retry Logic
As already posted, it's now part of the framework, which you can utilise, or the one you created should be fine. Ideally, it's good to have retry logic even if you are sure about connectivity and speed. Just in case :)

You should increase the timeout because the time taken to establish a connection to a SQL server has many steps, hence it takes some time when it goes for establishing the connection for the first time. After establishment of the connection, the connection is pooled in the memory for re-use in subsequent queries.
Please refer below link for more detailed understanding on connection-pooling:
https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql-server-connection-pooling
As you mentioned, this error occurs only sometimes, not always, so network and connectivity factors may be involved. The default timeout for a SQL connection is 15 seconds. I think that changing it to 30 seconds should help.

Are you using SQL Express or Workgroup Edition? If so, it's possible that the server is too busy to respond.
To rule out network problems, from a command prompt, do a PING -t SqlServername. Does every ping come back, or are some lost? This can be an indicator of network interruptions that might also cause this error, like a faulty switch. If they are all lost then (given that your database connection sometimes works) it is likely that ping is being blocked by a firewall somewhere: it may help diagnosis if you find that block and temporarily unblock it.
The error message indicates that you are using Named pipes. Are you using Named pipes on purpose? For most scenarios (including Azure database) I'd suggest enabling TCP/IP and disabling Named Pipes, in SQL Server Configuration Manager.
Depending on how 'far away' your Azure database is, the delays caused by routers and firewalls sometimes upset Kerberos and/or related timings. You can overcome this by specifying the port in the connection string, which avoids the round trip to port 1434 to enumerate the instance. I assume you're already using an FQDN. For example: server\instance,port

c# EntityFramework: TransactionScope over two different databases on same server with DTC

I have basically the same situation as described in "Multiple databases(datacontext) on same server without MS DTC",
like:
using (var transaction = new TransactionScope())
{
    using (var entities = new ContextA())
    {
        // do smth. here
    }
    using (var entities = new ContextB())
    {
        // do smth. there
    }
}
Both ContextB and ContextA are located on the same MS SQL Server (V. 13.0.4206.0). The C# code is executed from a remote workstation. Server and workstation are both in the same domain network.
But when ContextB tries to do its first manipulation, it raises the following error:
The MSDTC transaction manager was unable to pull the transaction from
the source transaction manager due to communication problems. Possible
causes are: a firewall is present and it doesn't have an exception for
the MSDTC process, the two machines cannot find each other by their
NetBIOS names, or the support for network transactions is not enabled
for one of the two transaction managers. (Exception from HRESULT:
0x8004D02B)
It does not make any difference if I change the order of the tasks, i.e. first doing something on ContextB and then on ContextA: in that case the error is raised when manipulating the database from ContextA.
Without the transaction scope everything works fine (but of course without a transaction).
I've checked the firewall settings on the server: the predefined rules for Distributed Transaction Coordinator are enabled for the domain and private networks. I've also checked the DTC properties:
Network DTC Access: true
Allow Inbound: true
Allow Outbound: true
What is wrong? It looks so simple; did I overlook something?
Thank you for your answers.

Solved: the predefined firewall rule 'Distributed Transaction Coordinator' also needs to be activated on the workstation where the C# code runs.
I wasn't aware that the database server opens a connection back to the transaction initiator (which was my workstation). It seems that in this situation the transaction initiator takes on the role of coordinator, not the server.
Thank you for the given hints!
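To see from the client side when a TransactionScope gets promoted to a distributed transaction (and therefore when the machine running the code becomes involved as a DTC participant), one option is to subscribe to TransactionManager.DistributedTransactionStarted. This is a minimal diagnostic sketch; PromotionLogger and the log callback are hypothetical names.

```csharp
using System;
using System.Transactions;

public static class PromotionLogger
{
    // Reports every transaction in this process that is promoted to MSDTC.
    public static void Enable(Action<string> log)
    {
        TransactionManager.DistributedTransactionStarted += (sender, e) =>
        {
            Guid id = e.Transaction.TransactionInformation.DistributedIdentifier;
            log("Promoted to a distributed transaction: " + id);
        };
    }
}
```

Calling PromotionLogger.Enable(Console.WriteLine) once at startup should print a line as soon as a second connection causes promotion, which can help confirm where the promotion happens in a scenario like the one above.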

NLog within TransactionScope causing Transaction to be invalid

I am having a problem in a production environment that I am not getting locally.
I am running some LINQ to SQL code within a TransactionScope as below:
using (var scope = new TransactionScope())
{
    uploadRepository.SubmitChanges();
    result = SubmitFileResult.Succeed();
    ScanForNewData(upload);
    scope.Complete();
}
ScanForNewData() calls GetSubmittedData(). If an exception occurs in GetSubmittedData(), we use NLog to write the error to file and database and also to send an email:
catch (Exception ex)
{
    // MT - having to comment this out because it is causing a problem with transactions on the production server
    logger.ErrorException(String.Format("Error reading txt file {0} into correct format", upload.DocumentStore.FileName), ex);
    return new UploadGetSubmittedDataResult { Exception = ex, Success = false, Message = String.Format("Error reading txt file {0} into correct format", upload.DocumentStore.FileName) };
}
In ScanForNewData we then call repository.SubmitChanges(). This then causes:
The operation is not valid for the state of the transaction. System.Transactions.TransactionException TransactionException System.Transactions.TransactionException: The operation is not valid for the state of the transaction.
The best idea I have come up with is that in production this code runs on a web server and calls a separate database server. Both the DataContext and NLog have the same connection string configuration and SQL user, but maybe because the server is remote (whereas locally I am using integrated security) something strange is happening.
Any idea what happens to the transaction in this scenario?
Update - I just tried it with a SQL user locally and it still works fine. It must be something to do with the production setup...
Another update - I tell a lie. On the dev machine the NLog database record is never written, the email is sent, and the TransactionException does not happen.

It is hard to guess what the problem is without a full stack trace of the exception; it may depend on multiple things.
For instance, I'm assuming NLog opens a new connection to the DB by itself. That will probably cause the transaction to be promoted to a distributed one, and the Distributed Transaction Coordinator will kick in. This could explain the difference between the behavior of your application in production and locally.
You may also be breaking the transaction with some operation inside it, such as an unhandled exception or an illegal access to some data.
Provide the full stack trace and more of the code involved for a deeper analysis.

Without knowing what the inner exceptions of your TransactionException are, it will be difficult to resolve, but here is a thought:
If you refactor your code so that the logging occurs after the using block around the transaction scope has ended, you will likely avoid the issue, since by then the transaction scope has ended and DTC has rolled back the transaction.
I have used and seen this pattern in the past (don't log until after the transaction has ended and rolled back) when dealing with transactions, and it has worked well.
Logging to a separate database is also always advisable to avoid issues like this. If you did that, the issue would be avoided as well.
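A different technique from the one described above, for cases where the log call has to stay inside the using block, is to wrap it in a TransactionScope created with TransactionScopeOption.Suppress, so that code inside it sees no ambient transaction. This is only a sketch; TransactionFreeLogging is a hypothetical helper, and whether it resolves the NLog problem depends on the assumption that the logger's database connection is opened inside the suppressed scope.

```csharp
using System;
using System.Transactions;

public static class TransactionFreeLogging
{
    // Runs writeLog with the ambient transaction suppressed, then restores it.
    public static void LogOutsideTransaction(Action writeLog)
    {
        using (var suppress = new TransactionScope(TransactionScopeOption.Suppress))
        {
            // Transaction.Current is null here, so connections opened by
            // writeLog have no ambient transaction to enlist in.
            writeLog();
            suppress.Complete();
        }
    }
}
```

In the catch block from the question this could be used as TransactionFreeLogging.LogOutsideTransaction(() => logger.ErrorException(message, ex)), where message stands for the formatted error text.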

Have a look at this; it seems to be a bug in NLog.
https://groups.google.com/forum/#!msg/nlog-users/I5UR-bISlTA/6RPtOZhR4NoJ
The suggested solution is to use an async target for DB logging.

ASP.NET SqlConnection Timeout issue

I have run into a frustrating issue which I originally thought was a connection leak, but that does not seem to be the case. The scenario is this: the data access for this application uses the Enterprise Library (v4) from Microsoft. All data access calls are wrapped in using statements such as
using (DbCommand dbCommand = db.GetStoredProcCommand("sproc"))
{
    db.AddInParameter(dbCommand, "MaxReturn", DbType.Int32, MaxReturn);
    ...more code
}
Now, the index page of this application makes 8 calls to the database to load everything, and I can bring the application to its knees by refreshing the index about 15 times. It seems that I receive this error when the database reaches 113 connections. Here is what makes this weird:
I have run similar code with the EntLib on high-traffic sites and have NEVER had this problem.
If I kill all the connections to the database and get the production application back up and running, then every time I refresh the application I can run this SQL:
SELECT DB_NAME(dbid) as 'Database Name',
       COUNT(dbid) as 'Total Connections'
FROM sys.sysprocesses WITH (nolock)
WHERE dbid > 0
GROUP BY dbid
and I can see the number of connections actively increasing with each page refresh. Running the same code on my local box with the same connection string does not cause this problem. Furthermore, if the production website is down, I can fire up the site via Visual Studio and it runs fine; the only difference between the two is that the production site has Windows authentication turned on and my local copy doesn't. Turning Windows authentication off on the server seems to have no effect.
I have absolutely no clue what is causing this or why the connections are not being disposed of in SQL Server. The EntLib objects do not expose .Close() methods for anything, so I can't explicitly close the object.
Any thoughts?
Thanks!
Edit
Wow I just noticed that I never actually posted the error message. Oy. The actual connection error is: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.

Check that the stored procedure you are executing is not running into a row or table lock. Also, if possible, try deploying to another server and check whether the application crawls there as well.
Also try increasing the maximum allowed connections for your SQL Server.

I think the “Timeout Expired” error is a general issue and may have several causes. Increasing the timeout can solve some of them but not all.
You may also refer to the following links to troubleshoot and fix the error
http://techielion.blogspot.com/2007/01/error-timeout-expired-timeout-period.html

Could it be a configuration issue on the server?
How do you make a connection to the database on the production server?
That might be an area worth looking into.

While I don't know the answer, I can suggest that for some reason connections are not being closed by your application when it runs in production. (Stating the obvious.)
You might want to examine your network configuration between the web server and SQL Server. High-latency networks can cause connections not to be closed in time.
It might also help to look at the performance counters listed at the end of the following MSDN article:
http://msdn.microsoft.com/en-us/library/8xx3tyca%28VS.71%29.aspx
Finally, if nothing else helps, I'd get a debugger and the Enterprise Library source code onto production and debug your code inside the Enterprise Library to find out why connections are not being closed.
Silly question: are you properly closing your DataReader? If not, this could be the problem, and the difference in behaviour between dev and prod could be caused by different garbage collection patterns.

I would disable connection pooling and see whether that suppresses the problem (heh). Just add ";Pooling=false" to your connection string.
Or perhaps you could add something like the following 'cleanup' code to your page (it closes any connection left open when the page unloads), right inside the 'using' block:
System.Web.UI.Page page = HttpContext.Current.Handler as System.Web.UI.Page;
if (page != null) {
    page.Unload += (EventHandler)delegate(object s, EventArgs e) {
        try {
            dbCommand.Connection.Close();
        } catch (Exception) {
        } finally {
            result = null;
        }
    };
}
Also, make sure you've enabled the 'shared memory' protocol if your SQL Server and IIS are on the same machine (a real performance booster)!
