So I've run into a little issue that puzzles me, and I've not been able to find a good explanation for it. I imagine I'm probably misusing the async/await feature somehow, but I really don't know what I'm doing wrong.
So I have some SQL that queries my database and returns a single value, and I was therefore using ExecuteScalarAsync to get that value out into C#.
The code is as follows:
public void CheckOldTransactionsSync()
{
    CheckOldTransactions().Wait();
}

public async Task CheckOldTransactions()
{
    DateTimeOffset beforeThis = DateTime.UtcNow.Subtract(TimeSpan.FromHours(6));
    using (SqlConnection connection = new SqlConnection(SqlConnectionString))
    {
        await connection.OpenAsync(cts.Token);
        using (SqlCommand command = new SqlCommand(@"SELECT TOP 1 1 AS value FROM SyncLog WHERE [TimeStamp] < @BeforeThis;", connection))
        {
            command.Parameters.Add("@BeforeThis", System.Data.SqlDbType.DateTimeOffset, 7);
            command.Prepare();
            command.Parameters["@BeforeThis"].Value = beforeThis;
            Int32 oldTransactions = (Int32)await command.ExecuteScalarAsync(cts.Token);
            // do stuff with oldTransactions
        }
    }
}
So elsewhere in my code the CancellationTokenSource called cts is created and set to expire after 2 minutes using the CancelAfter method.
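For reference, the setup described amounts to something like this (the exact code isn't shown in the question, so this is a sketch):

CancellationTokenSource cts = new CancellationTokenSource();
cts.CancelAfter(TimeSpan.FromMinutes(2)); // the token is cancelled after 2 minutes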
Now I've stepped through this code with the debugger, and I reach the line where I await the call to ExecuteScalarAsync without a problem. However, I seem to have two issues with the execution of that line: (1) it doesn't seem to return, and (2) it ignores my cancellation token and is still running some time after my two-minute cancellation token has expired.
Now, I've run the SQL query in SQL Server Management Studio and it returns very quickly; the table has only around 4,000 rows at this time.
I've resolved the problem for now by changing that line to:
Int32 oldTransactions = (Int32) command.ExecuteScalar();
Which returns almost instantaneously.
That is the only line of code I've changed, and I changed it back just to make sure, and the same issue occurred. So my question is: what did I do wrong with the asynchronous call?
You are calling Wait().
That's the classic ASP.NET deadlock: the request thread blocks inside Wait() while holding the SynchronizationContext, and the continuation inside the async method is queued to that same context, so neither can proceed. Don't block on async code; either go async all the way up, or use synchronous IO.
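A minimal sketch of the non-blocking alternative (CheckOldTransactions is the method from the question; the async caller is an assumption about the surrounding code):

// Go async all the way up instead of blocking with .Wait():
public async Task CheckOldTransactionsAsync()
{
    await CheckOldTransactions();
}

Alternatively, if a blocking wrapper truly cannot be removed, putting ConfigureAwait(false) on every await inside CheckOldTransactions stops the continuations from queuing back to the captured ASP.NET context, which is what the Wait() call deadlocks against.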
Related
I am trying to use query cancellation (via cancellation tokens) to cancel a long-running complex query. I have found that in some cases not only does cancellation fail to halt the query, but the call to CancellationTokenSource.Cancel() hangs indefinitely. Here is a simple repro that replicates this behavior (it can be run in LINQPad):
void Main()
{
    var cancellationTokenSource = new CancellationTokenSource();
    var blocked = RunSqlAsync(cancellationTokenSource.Token);
    blocked.Wait(TimeSpan.FromSeconds(1)).Dump(); // false (blocked in SQL as expected)
    cancellationTokenSource.Cancel(); // hangs forever?!
    Console.WriteLine("Finished calling Cancel()");
    blocked.Wait();
}
public async Task RunSqlAsync(CancellationToken cancellationToken)
{
    var connectionString = new SqlConnectionStringBuilder { DataSource = @".\sqlexpress", IntegratedSecurity = true, Pooling = false }.ConnectionString;
    using (var connection = new SqlConnection(connectionString))
    {
        await connection.OpenAsync().ConfigureAwait(false);
        using (var command = connection.CreateCommand())
        {
            command.CommandText = @"
                WHILE 1 = 1
                BEGIN
                    DECLARE @x INT = 1
                END
            ";
            command.CommandTimeout = 0;
            Console.WriteLine("Running query");
            await command.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);
        }
    }
}
Interestingly, the same query run in SQL Server Management Studio cancels instantly via the "Cancel Executing Query" button.
Is there some caveat to query cancellation where it cannot cancel tight WHILE loops?
My version of SQL Server:
Microsoft SQL Server 2012 - 11.0.2100.60 (X64)
Feb 10 2012 19:39:15
Copyright (c) Microsoft Corporation
Express Edition (64-bit) on Windows NT 6.2 (Build 9200: )
I am running on Windows 10, and .NET's Environment.Version is 4.0.30319.42000.
EDIT
Some additional information:
Here are the stack traces pulled from Visual Studio while cancellationTokenSource.Cancel() hangs: one for the thread stuck in Cancel(), and one for another thread that is also stuck (the stack-trace screenshots are not reproduced here).
Additionally, I tried updating to SQL Server Express 2017 and I am seeing the same behavior.
EDIT
I've filed this as a bug with corefx: https://github.com/dotnet/corefx/issues/26623
I can reproduce the issue in a console application. (The code in the question was code from LINQPad.)
I'm going to make this an answer and say that this is a bug in ADO.NET. ADO.NET should send a query cancellation signal to SQL Server. I can see from the CPU usage, that SQL Server continues executing the loop. Therefore, it did not receive cancellation from the client. We also know that SSMS is able to cancel this loop.
While the loop is running I can see that the console app is using 50% of one CPU core and receiving data from SQL Server at 70MB/sec. I do not know what data this is. It might be ROWCOUNT information or something related.
I think the bug is related to the fact that the loop is continuously sending data so that ADO.NET never has an opportunity to send the cancellation. It's still a bug and it would be a community service if you reported it. You can link to this question.
If the loop is throttled using ...
WHILE 1 = 1
BEGIN
    DECLARE @x INT = 1
    WAITFOR DELAY '00:00:01' --new
END
... then cancellation is quick.
Also, you generally cannot rely on cancellation being quick. If the network dropped, it might take 30 seconds for the client to notice and throw.
Therefore, you need to code your program so that it continues executing rather than waiting for the query to finish. It could look like this:
var queryTask = ...;
var cancellationToken = ...;
var cancellationTask = Task.Delay(Timeout.Infinite, cancellationToken);
await Task.WhenAny(queryTask, cancellationTask);
That way cancellation always looks instantaneous. Make sure that resources are still disposed. All SQL interaction should be encapsulated in queryTask so that it simply continues in the background and eventually cleans up.
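A slightly fuller sketch of that pattern (the helper names here are illustrative, not from the original answer):

using System;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

// Race the query against cancellation. The query task owns and disposes all
// SQL objects itself, so it can safely keep running in the background.
static async Task<int?> ExecuteScalarOrGiveUpAsync(
    string connectionString, string sql, CancellationToken cancellationToken)
{
    Task<int> queryTask = RunQueryAsync(connectionString, sql);
    Task cancellationTask = Task.Delay(Timeout.Infinite, cancellationToken);

    Task finished = await Task.WhenAny(queryTask, cancellationTask);
    if (finished == queryTask)
        return await queryTask; // propagates the result or any exception

    return null; // looks cancelled to the caller; queryTask cleans up later
}

static async Task<int> RunQueryAsync(string connectionString, string sql)
{
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        await connection.OpenAsync().ConfigureAwait(false);
        return (int)await command.ExecuteScalarAsync().ConfigureAwait(false);
    }
}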
I'm trying to figure out the best way to batch-insert about 37k rows into my SQL Server using Dapper.
My problem is that when I use Parallel.ForEach, the number of connections to the database increases over a short period of time, finally hitting about 100, which gives connection-pool errors. If I force the max degree of parallelism, it hits that max number and stays there.
Setting the max degree feels wrong.
It currently is doing about 10-20 inserts a second. This is also in a simple Console App - so there's no other database activity besides what's happening in my Parallel.ForEach loop.
Is using Parallel.ForEach the incorrect thing in this case, because this is not CPU-bound?
Should I be using async/await? If so, what's stopping this from doing hundreds of DB calls in one go?
Sample code, which is basically what I'm doing:
var items = GetItemsFromSomewhere(); // Returns 37K items.
Parallel.ForEach(items, item =>
{
    using (var sqlConnection = new SqlConnection(_connectionString))
    {
        var result = sqlConnection.Execute(myQuery, new { ... } );
    }
});
My (incorrect) understanding of this was that there should only be about 8 or so connections to the DB at any time. The connection pool will release the connection (which remains instantiated in the pool, waiting to be used). And if the Execute takes, let's say, even 1 second (the longest-running insert was about 500ms, and that's 1 in every 100 or so), that's OK: that thread is blocked and chills until the Execute completes. Then the scope completes (and Dispose is auto-called) and the connection is closed. With the connection closed, Parallel.ForEach then grabs the next item in the collection, goes to the connection pool and grabs a spare connection (remember, we just closed one a split second ago)... rinse, repeat.
Is this wrong?
Notes:
.NET 4.5
SQL Server 2012
Console app.
Using Dapper.NET for the SQL code.
First of all: if it is about performance, use SqlBulkCopy. This works with SQL Server; if you are using other database servers, they might have their own bulk-copy solution (Oracle has one).
SqlBulkCopy works like a bulk select in reverse: a bulk select opens one connection and streams all the data from the server to the client, while a bulk insert streams all the new records from the client to the server.
See: https://msdn.microsoft.com/en-us/library/ex21zs8x(v=vs.110).aspx
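A minimal sketch of what that looks like for the 37k-row case (_connectionString is from the question; the table name and the Id/Name columns on the item type are made up for illustration):

using System.Data;
using System.Data.SqlClient;

// Build a DataTable matching the destination schema, then stream it in one call.
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
foreach (var item in items)                        // items = the 37K objects
    table.Rows.Add(item.Id, item.Name);

using (var bulkCopy = new SqlBulkCopy(_connectionString))
{
    bulkCopy.DestinationTableName = "dbo.MyTable"; // illustrative name
    bulkCopy.BatchSize = 5000;
    bulkCopy.WriteToServer(table);                 // one connection, one stream
}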
If you insist on using parallelism, you might want to consider the following code:
void BulkInsert<T>(object p)
{
    IEnumerator<T> e = (IEnumerator<T>)p;
    using (var sqlConnection = new SqlConnection(_connectionString))
    {
        while (true)
        {
            T item;
            lock (e) // the iterator is shared between threads, so advance it under a lock
            {
                if (!e.MoveNext())
                    return;
                item = e.Current;
            }
            var result = sqlConnection.Execute(myQuery, new { ... } );
        }
    }
}
Now create your own threads and invoke this method on each of them with one and the same parameter: the iterator that runs through your collection. Each thread opens its own connection once, starts inserting, and after all items are inserted the connection is closed. This solution uses as many connections as the threads you create.
PS: Multiple variants of the above code are possible. You could call it from background threads, from Tasks, etc. (see the sketch below). I hope you get the point.
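For completeness, one way the above might be driven from Tasks (BulkInsert and the shared-iterator idea are from the answer; the item type MyItem and the worker count are assumptions for illustration):

// Share one iterator across a fixed number of workers; each worker holds
// exactly one connection for its lifetime (see BulkInsert above).
IEnumerator<MyItem> iterator = items.GetEnumerator();
const int workerCount = 8; // illustrative; tune to your environment

var workers = new Task[workerCount];
for (int i = 0; i < workerCount; i++)
    workers[i] = Task.Factory.StartNew(() => BulkInsert<MyItem>(iterator),
                                       TaskCreationOptions.LongRunning);
Task.WaitAll(workers);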
You should use SqlBulkCopy instead of inserting one by one. Faster and more efficient.
https://msdn.microsoft.com/en-us/library/ex21zs8x(v=vs.110).aspx
Credit to the owner of this answer: Sql Bulk Copy/Insert in C#
I am trying to make an asynchronous call to Oracle, but it gets executed synchronously. Please look at the code below and tell me what I am doing wrong.
(I've installed ODAC (ODTwithODAC1120320_32bit.zip) and use the Oracle.DataAccess.dll assembly for my calls to Oracle. Before that, I used the deprecated System.Data.OracleClient with the same result.)
using System;
...
using System.Threading;
using System.Threading.Tasks;
using Oracle.DataAccess.Client;
namespace OracleTest
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private async void button1_Click(object sender, EventArgs e)
        {
            OracleConnection connection = new OracleConnection("User Id=myuser;Password=mypwd;Data Source=mydb");
            connection.Open();
            OracleCommand command = new OracleCommand("select count(col) from bigtable", connection);
            Task<Object> result = command.ExecuteScalarAsync();
            label1.Text = "BEFORE" + DateTime.Now.ToLocalTime() + " - ";
            label1.Text += await result;
            label1.Text += " - AFTER " + DateTime.Now.ToLocalTime();
            connection.Close();
            connection.Dispose();
        }
    }
}
It takes some minutes for the DBMS to get the count. What I expect is this: ExecuteScalarAsync gets called and gives Oracle a call; immediately afterwards, the BEFORE time is written to label1. Then I wait for the Oracle query to finish and take its result, and the AFTER time is written to label1. So BEFORE and AFTER should be different. However, they always show the same time (i.e., the time when the query returned its result). Why is that?
I also tried
CancellationToken cancellationToken = new CancellationToken();
Task<Object> result = command.ExecuteScalarAsync(cancellationToken);
and it didn't change anything. (What is this supposed to do anyhow? Would I not simply call command.Cancel(); instead of using a CancellationToken?)
My system: Windows 8 Pro 64bit, Visual Studio Express 2013, Oracle Client 11g (32bit): OCI 11.2.0.01
As far as I know, Oracle's provider still doesn't implement the asynchronous methods. This was asked previously and I can't find anything newer in Oracle's OTN or the discussion forums.
As the answer to the previous question says, the default implementation of the Async methods is to call the synchronous counterparts rather than run them wrapped in Tasks (which could actually result in worse performance).
If you look at the docs for ExecuteScalarAsync, it clearly states that it "Implements the asynchronous version of ExecuteScalar, but returns a Task synchronously, blocking the calling thread." Thus it appears that it's doing precisely what it says it does (blocking the calling thread).
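In other words, a non-async provider typically falls back to something shaped like the base DbCommand behaviour, which runs the synchronous call inline and hands back an already-completed task. A simplified sketch of such a fallback (not the actual Oracle or .NET source):

// Roughly what a non-async provider's fallback looks like (simplified):
public override Task<object> ExecuteScalarAsync(CancellationToken cancellationToken)
{
    if (cancellationToken.IsCancellationRequested)
        return Task.FromCanceled<object>(cancellationToken);
    try
    {
        return Task.FromResult(ExecuteScalar()); // blocks the caller right here
    }
    catch (Exception e)
    {
        return Task.FromException<object>(e);
    }
}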
To actually get asynchronous behaviour despite the blocking provider, you would have to push the call onto a thread-pool thread yourself, something like:

object obj = await Task.Run(() => command.ExecuteScalar());
// ....
Best of luck.
Given:
A BenchMark class that lets me know when something has completed.
A very large XML file (~120MB) that has been parsed into multiple Lists
Some code:
SqlConnection con = null;
SqlTransaction transaction = null;
BenchMark b;
try
{
    con = getCon(); // gets a new connection object
    con.Open();
    transaction = con.BeginTransaction();
    var bulkCopy = new SqlBulkCopy(con, SqlBulkCopyOptions.Default, transaction)
    {
        BatchSize = 1000,
        DestinationTableName = "Table1"
    };

    // assume that the BenchMark class is working
    b = new BenchMark("Table1");
    bulkCopy.WriteToServer(_insertTable1s.AsDataReader()); // _insertTable1s is a List<Table1>
    b.Complete();
    LogHelper.WriteLogItem(b);

    b = new BenchMark("Table2");
    bulkCopy.DestinationTableName = "Table2";
    bulkCopy.WriteToServer(_insertTable2s.AsDataReader()); // _insertTable2s is a List<Table2>
    b.Complete();
    LogHelper.WriteLogItem(b);

    // etc... this code does a batch insert into about 7 tables, each having about 40,000 records inserted.
    b = new BenchMark("Transaction Commit");
    transaction.Commit();
    b.Complete();
}
catch (Exception e)
{
    transaction.Rollback();
    LogHelper.WriteLogItem(
        LogLevel.Critical,
        LogType.DataProcessing,
        e.ToString());
}
finally
{
    con.Close();
}
The Problem:
On my local development environment, everything is fine. It's when I run this operation in the cloud that it hangs. Using the LogHelper.WriteLogItem method, I can watch the progress of the process; I observe it hang randomly on a particular table. No exception is thrown, so the transaction isn't rolled back. Say it hangs on the Table2 bulk insert: using MS SQL Management Studio, I can run queries on Table3, Table2 and Table1 with no issue (does this mean the transaction was aborted?).
Since it hangs, I'll go rerun the process. This time it hangs sooner so I might get logs like this:
7755 Benchmark LoadXML took 00:00:04.2432816
7756 Benchmark Table1 took 00:00:06.3961230
7757 Benchmark Table2 took 00:00:05.2566890
7758 Benchmark Table3 took 00:00:08.4900921
7759 Benchmark Table4 took 00:00:02.0000123
... it hangs on Table5 (because the BenchMark never completed). I go to run it again and the rest of the log looks like:
7780 Benchmark LoadXML took 00:00:04.1203923
... and it hangs here now.
I'm using Rackspace cloud hosting, if that helps. I have been able to fix this in the past by deleting all the tables from my dbml file and re-adding them, but this time it's not working. I'm wondering if the amount of data being processed is causing the problem?
EDIT: The code in this example is run in an asynchronous thread. I've found out that the thread is aborting for an unknown reason, and I need to find out why to solve this problem.
If you have admin rights on your server or database, you can run
SELECT * FROM sys.dm_tran_session_transactions
to see what transactions are currently active (from Pinal).
Additionally, you can run sp_lock to make sure there isn't something blocking your transaction.
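For example, to see which sessions are currently blocked and which session is blocking them, a sketch using the standard dynamic management views (available since SQL Server 2005):

SELECT r.session_id,
       r.blocking_session_id, -- 0 means not blocked
       r.status,
       r.command,
       r.wait_type
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;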
Because this process is done asynchronously (i.e., a thread is kicked off to handle it), the thread hits a problem that aborts it, which is why I get the strange behavior where the code stalls at different places. I've solved this by running the task synchronously (it works, but it's not ideal).
I guess the real issue is why my thread is aborting, since I'm not aborting it anywhere in my code. I believe it's due to the amount of data being processed, but I could be wrong.
Either way, I've solved my problem.
I have the following code:
using (SqlConnection sqlConnection = new SqlConnection("blahblah;Asynchronous Processing=true;"))
{
    using (SqlCommand command = new SqlCommand("someProcedureName", sqlConnection))
    {
        sqlConnection.Open();
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.AddWithValue("@param1", param1);
        command.BeginExecuteNonQuery();
    }
}
I never call EndExecuteNonQuery.
Two questions: first, will this block because of the using statements or for any other reason? Second, will it break anything, like leaks or connection problems? I just want to tell SQL Server to run a stored procedure, but I don't want to wait for it, and I don't even care if it works. Is that possible? Thanks for reading.
This won't work because you're closing the connection while the query is still running. The best way to do this would be to use the thread pool, like this:
ThreadPool.QueueUserWorkItem(delegate {
    using (SqlConnection sqlConnection = new SqlConnection("blahblah;Asynchronous Processing=true;")) {
        using (SqlCommand command = new SqlCommand("someProcedureName", sqlConnection)) {
            sqlConnection.Open();
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@param1", param1);
            command.ExecuteNonQuery();
        }
    }
});
In general, when you call a BeginWhatever method, you usually must call the matching EndWhatever method or you'll leak memory. The big exception to this rule is Control.BeginInvoke.
You can't close the connection after you submit the BeginExecuteNonQuery. It will abort the execution. Remove the using block.
In order to close the connection, you must know when the call has completed. For that you must call EndExecuteNonQuery, usually from a callback:
command.BeginExecuteNonQuery(delegate (IAsyncResult ar) {
    try { command.EndExecuteNonQuery(ar); }
    catch (Exception e) { /* log exception e */ }
    finally { sqlConnection.Dispose(); }
}, null);
If you want to submit a query and don't care about the results, see Asynchronous T-SQL execution for a reliable pattern that ensures execution even if the client disconnects or crashes.
You should always call the EndExecuteNonQuery() method to prevent leaks. It may work now, but who knows what will happen in future versions of .NET. The general rule is: always follow a BeginExecute... with an EndExecute...
I know this is an old post; just adding my 2c based on our recent (very conclusive) implementation and testing :D
To answer the OP's questions:
If you don't call EndExecuteNonQuery, BeginExecuteNonQuery will execute the procedure, but the operation will be cancelled as soon as the using clause disposes of your SQL connection. Hence this is not viable.
If you call BeginExecuteNonQuery by using a delegate, creating a new thread, etc. and you do not call EndExecuteNonQuery, chances are good you might get memory leaks, depending on what takes place in your stored procedure. (More on this later.)
Calling a stored procedure and not waiting for the call to complete is, as far as our testing went, not possible. Irrespective of multitasking, something somewhere will have to wait.
On to our solution:
Refs: BeginExecuteNonQuery -> BENQ, EndExecuteNonQuery -> EENQ
Use Case:
We have a Windows service (C#) that makes use of the .NET TPL library. We needed to load data with a stored procedure from one database to another at run time, based on an ad hoc request that the service picks up. Our stored procedure had an internal transaction and exception handling with try/catch blocks.
First Try:
For our first try we implemented a solution found here: MS Solution. In this example you will see that MS opts to call BENQ and then implements a while loop to block execution, then calls EENQ. This solution was mainly intended for when you don't need a callback method. The problem with this solution is that only BENQ is immune to SQL connection timeouts; EENQ will time out. So for a long-running query (which is hopefully the reason why you are using BENQ), you get stuck in the while loop, and then once the operation has completed and you call EENQ, you get a SQL connection timeout.
Second Try:
For our second try we thought, OK, let's call BENQ, then add a while loop so that we don't close our SQL connection, and never call EENQ. This worked, until an exception was thrown in our stored procedure. Because we never called EENQ, the operation was never completed and the exception never bubbled up to our code. Hence we were stuck in a loop/thread/memory leak forever.
Third Try: (The Solution)
For our third try we thought to call BENQ, then directly afterwards call EENQ. What happened was that EENQ effectively blocked execution in the thread until the operation completed. When an exception occurred in the stored procedure, it was caught. When the query ran long, EENQ did not throw a timeout exception, and in all cases our SQL connection object was disposed, as was our thread.
Here are some extracts of our code:
Here we open up a new thread for the method that calls the stored procedure.
// Call the load-data stored procedure. As this stored procedure can run long, we start it in its own thread.
Task.Factory.StartNew(() => ClassName.MethodName(Parameters));
This is the code inside the method we use to call the stored procedure.
// Because this is a long-running stored procedure, we start it up in a new thread.
using (SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings[ConfigurationManager.AppSettings["ConnectionStringName"]].ConnectionString))
{
    try
    {
        // Create a new instance of SqlCommand.
        SqlCommand command = new SqlCommand(ConfigurationManager.AppSettings["StoredProcedureName"], conn);
        // Set the command type as stored procedure.
        command.CommandType = CommandType.StoredProcedure;
        // Create input parameters.
        command.Parameters.Add(CreateInputParam("@Param1", SqlDbType.BigInt, Param1));
        command.Parameters.Add(CreateInputParam("@Param2", SqlDbType.BigInt, Param2));
        command.Parameters.Add(CreateInputParam("@Param3", SqlDbType.BigInt, Param3));
        // Open up the SQL connection.
        conn.Open();
        // Create a new instance of type IAsyncResult and call the sp asynchronously.
        IAsyncResult result = command.BeginExecuteNonQuery();
        // When the process has completed, we end the execution of the sp.
        command.EndExecuteNonQuery(result);
    }
    catch (Exception err)
    {
        // Write to the log.
    }
}
I hope this answer saves someone some headache :D We have tested this thoroughly and have not experienced any issues.
Happy coding!
In this case the using statements won't do: you must close the connection yourself once the command actually completes, rather than letting the syntactic sugar dispose of it for you (i.e., at the closing }).
It should be as simple as this to ensure you don't have leaks.
SqlConnection sqlConnection = new SqlConnection("blahblah;Asynchronous Processing=true;");
SqlCommand command = new SqlCommand("someProcedureName", sqlConnection);

sqlConnection.Open();
command.CommandType = CommandType.StoredProcedure;
command.Parameters.AddWithValue("@param1", param1);
command.BeginExecuteNonQuery((ar) =>
{
    var cmd = (SqlCommand)ar.AsyncState;
    cmd.EndExecuteNonQuery(ar);
    cmd.Connection.Close();
    cmd.Dispose();
}, command);
As you can see, the lambda expression that fires once the command is finished (no matter how long it takes) will do all the closing for you.
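For what it's worth, on .NET 4.5 and later the same fire-and-forget shape can be written with the Task-based API instead of the Begin/End pair. A sketch under that assumption, reusing the hypothetical procedure and parameter from above:

// Fire-and-forget with async/await; the connection is disposed only after
// the command has actually completed.
Task fireAndForget = Task.Run(async () =>
{
    using (var sqlConnection = new SqlConnection("blahblah"))
    using (var command = new SqlCommand("someProcedureName", sqlConnection))
    {
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.AddWithValue("@param1", param1);
        await sqlConnection.OpenAsync();
        await command.ExecuteNonQueryAsync();
    }
});
// If you care about failures after all, observe the task's exception somewhere, e.g.:
// fireAndForget.ContinueWith(t => Log(t.Exception), TaskContinuationOptions.OnlyOnFaulted);
// (Log is a hypothetical logging method.)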