I have a program that loads a large quantity of data (~800K-1M rows per iteration) in a Task running on the thread pool (see offending code sample below); no more than 4 tasks run concurrently. This is the only place in the program that makes a connection to this database. When running the program on my laptop (and coworkers' identical laptops), it functions perfectly. However, when we run it on a substantially more powerful workstation that we access via remote desktop, the program fails about 1/3 to 1/2 of the way through its list, and all of the tasks return an exception.
The first exception was: "Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached." I've tried googling, binging, searching on StackOverflow, and banging my head against the table trying to figure out how this can be the case. With no more than 4 tasks running at once, there shouldn't be more than 4 connections at any one time.
In response to this, I tried two things: (1) I added a try/catch around the conn.Open() line that clears the pool if an InvalidOperationException appears. That appeared to work (I didn't let it run all the way through, but it got substantially past the point where it failed before), but at a cost in performance. (2) I changed ConnectTimeout from 15 seconds to 30, which did not work (but let it proceed a little further). At one point I also tried ConnectRetryInterval=4 (mistakenly choosing this instead of ConnectRetryCount); this led to a different error, "The maximum number of requests is 4,800", which is strange because we still shouldn't be anywhere near 4,800 requests or connections.
In short, I'm at a loss: I can't figure out what is causing this connection leak only on a faster computer, and I'm unable to get Visual Studio on that machine to debug directly. Any thoughts on where to look to try and resolve this would be much appreciated.
(Follow-up to c# TaskFactory ContinueWhenAll unexpectedly running before all tasks complete)
private void LoadData()
{
    SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder();
    builder.DataSource = "redacted";
    builder.UserID = "redacted";
    builder.Password = "redacted";
    builder.InitialCatalog = "redacted";
    builder.ConnectTimeout = 30;

    using (SqlConnection conn = new SqlConnection(builder.ConnectionString))
    {
        //try
        //{
        //    conn.Open();
        //}
        //catch (InvalidOperationException)
        //{
        //    SqlConnection.ClearPool(conn);
        //    conn.Open();
        //}
        conn.Open();

        string monthnum = _monthsdict.First((x) => x.Month == _month).MonthNum;
        string yearnum = _monthsdict.First((x) => x.Month == _month).YearNum;
        string nextmonthnum = _monthsdict[Array.IndexOf(_monthsdict, _monthsdict.First((x) => x.Month == _month)) + 1].MonthNum;
        string nextyearnum = _monthsdict[Array.IndexOf(_monthsdict, _monthsdict.First((x) => x.Month == _month)) + 1].YearNum;

        SqlCommand cmd = new SqlCommand();
        cmd.Connection = conn;
        cmd.CommandText = @"redacted";
        cmd.Parameters.AddWithValue("@redacted", redacted);
        cmd.Parameters.AddWithValue("@redacted", redacted);
        cmd.Parameters.AddWithValue("@redacted", redacted);
        cmd.CommandTimeout = 180;

        SqlDataReader reader = cmd.ExecuteReader();
        while (reader.Read())
        {
            Data data = new Data();
            int col1 = reader.GetOrdinal("col1");
            int col2 = reader.GetOrdinal("col2");
            int col3 = reader.GetOrdinal("col3");
            int col4 = reader.GetOrdinal("col4");
            data.redacted = redacted;
            data.redacted = redacted;
            data.redacted = redacted;
            data.redacted = redacted;
            data.redacted = redacted;
            data.Calculate();
            _data.Add(data); //not a mistake, referring to another class variable
        }
        reader.Close();
        cmd.Dispose();
        conn.Close();
        conn.Dispose();
    }
}
This turned out to be a classic case of not reading the documentation closely enough. I was trying to cap the maximum number of thread pool threads at 4 using ThreadPool.SetMaxThreads, but the maximum cannot be set lower than the number of processors. The workstation it failed on has 8 processors, so the cap was never applied: the Task Scheduler ran as many tasks as it saw fit, and the program eventually hit the connection pool limit.
https://learn.microsoft.com/en-us/dotnet/api/system.threading.threadpool.setmaxthreads
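A minimal sketch of both takeaways, assuming at most 4 concurrent loads are wanted: check the return value of ThreadPool.SetMaxThreads (it returns false and changes nothing when the requested value is below the processor count), and cap concurrency explicitly with a SemaphoreSlim rather than relying on the thread pool. LoadData(month) here is just a placeholder for the loading routine above.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Loader
{
    public static async Task RunAsync(IEnumerable<string> months)
    {
        // SetMaxThreads silently refuses (returns false) when the requested worker
        // count is lower than the number of processors, e.g. 4 on an 8-core box.
        bool capped = ThreadPool.SetMaxThreads(4, 4);
        Console.WriteLine("SetMaxThreads honored: " + capped);

        // A SemaphoreSlim caps concurrency regardless of processor count.
        using (var gate = new SemaphoreSlim(4))
        {
            var tasks = months.Select(async month =>
            {
                await gate.WaitAsync();
                try
                {
                    await Task.Run(() => LoadData(month)); // placeholder for the real load
                }
                finally
                {
                    gate.Release();
                }
            });
            await Task.WhenAll(tasks);
        }
    }

    static void LoadData(string month) { /* the LoadData body from the question */ }
}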
Related
private void NpgSqlGetContracts(IList<Contract> con)
{
    var conn = (NpgsqlConnection)Database.GetDbConnection();
    List<Contract> contracts = new List<Contract>();
    using (var cmd = new NpgsqlCommand("SELECT * FROM \"Contracts\";", conn))
    {
        cmd.CommandTimeout = 1;
        cmd.Prepare();
        int conCount = cmd.ExecuteNonQuery();
        using (var reader = cmd.ExecuteReader(CommandBehavior.SingleResult))
        {
            while (reader.Read())
            {
                contracts.Add(MapDataReaderRowToContract(reader));
            }
        }
    }
}
Here I have this code to test the command timeout in Postgres; I have tried debugging it locally with a breakpoint in Visual Studio, and I tried both ExecuteNonQuery and ExecuteReader. The query takes more than 1 second to load all the data (there are over 3 million rows), but the command timeout is set to 1 second. I wonder why it does not throw any exception here. What did I configure wrong?
Thank you :)
As @hans-kesting wrote above, the command timeout isn't cumulative for the entire command; rather, it applies to each individual I/O-producing call (e.g. Read). In that sense, it's meant to help with queries that run for too long without producing any results, or with network issues.
You may also want to take a look at PostgreSQL's statement_timeout, which is a PG-side timeout for the entire command. It too has its issues, and Npgsql never sets it implicitly for you - but you can set it yourself.
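As a rough illustration (Npgsql never issues this for you, and connectionString here is a placeholder), you could set it yourself before running the long query; the value is in milliseconds and is enforced server-side for the whole statement:

using (var conn = new NpgsqlConnection(connectionString))
{
    conn.Open();

    // Server-side timeout for each statement, in milliseconds.
    // Note: a plain SET persists on this physical connection (including when it
    // goes back to the pool); use SET LOCAL inside a transaction to scope it.
    using (var setTimeout = new NpgsqlCommand("SET statement_timeout = 1000;", conn))
    {
        setTimeout.ExecuteNonQuery();
    }

    using (var cmd = new NpgsqlCommand("SELECT * FROM \"Contracts\";", conn))
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // PostgreSQL cancels the statement with SQLSTATE 57014 (query_canceled)
            // if it runs longer than statement_timeout, and Npgsql surfaces that
            // as an exception.
        }
    }
}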
SQL version : SQL Server 2008 R2 Standard Edition
Application : .Net 3.5 (Windows Form)
This is the error I am receiving after running my code:
The CLR has been unable to transition from COM context 0xe88270 to COM context 0xe88328 for 60 seconds. The thread that owns the destination context/apartment is most likely either doing a non pumping wait or processing a very long running operation without pumping Windows messages. This situation generally has a negative performance impact and may even lead to the application becoming non responsive or memory usage accumulating continually over time. To avoid this problem, all single threaded apartment (STA) threads should use pumping wait primitives (such as CoWaitForMultipleHandles) and routinely pump messages during long running operations.
The following code produces the above error sometimes, most probably after 24,000+ records have been inserted.
Objective of the code
The code is written to insert dummy entries to test my application.
Random rnd = new Random();
string Data = "";
for (int i = 0; i < 2000000; i++)
{
    Data = "Insert Into Table1(Field1)values('" + rnd.Next(0, 200000000) + "');" + Environment.NewLine +
           "Insert Into Table1(Field1)values('" + rnd.Next(0, 200000000) + "');" + Environment.NewLine +
           "Insert Into Table1(Field1)values('" + rnd.Next(0, 200000000) + "');" + Environment.NewLine +
           "Insert Into Table1(Field1)values('" + rnd.Next(0, 200000000) + "');" + Environment.NewLine +
           "Insert Into Table1(Field1)values('" + rnd.Next(0, 200000000) + "');" + Environment.NewLine +
           "Insert Into Table1(Field1)values('" + rnd.Next(0, 200000000) + "');";
    ExecuteQuery(Data); // Error is displayed here
}
Code of "ExecuteQuery"
SqlConnection Conn = new SqlConnection("Connection String");
if (Conn == null)
{
    Conn = new SqlConnection(Program.ConnString);
}
if (Conn.State == ConnectionState.Closed)
{
    Conn.Open();
}
else
{
    Conn.Close();
    Conn.Open();
}
SqlCommand cmd = new SqlCommand();
cmd.Connection = Conn;
cmd.CommandTimeout = 600000;
cmd.CommandType = CommandType.Text;
cmd.CommandText = strsql;
cmd.ExecuteNonQuery();
Conn.Close();
Note: I have written multiple identical insert statements in one batch because it reduces the number of round trips SQL Server has to handle.
Question
How can I optimize my code to prevent the error from occurring?
"The CLR has been unable to transition from COM context … to COM context … for 60 seconds. The thread that owns the destination context/apartment is most likely either doing a non pumping wait or processing a very long running operation without pumping Windows messages."
It is not entirely clear from your question how your code works in detail, so I will assume that your application is not multithreaded, i.e. the loop performing the 12 million database INSERTs (2 million iterations × 6 statements) is running on the application's main (UI) thread. This will probably take a while (> 60 seconds), and your UI will freeze in the meantime, because Windows Forms never gets a chance to run and react to user input while your (blocking) loop is still running. I bet this is what causes the warning you've cited.
Use a parameterized SQL command instead.
The first easy thing you could do is turn your SqlCommand into a parameterized one. You would then issue commands with the identical SQL text; only the separately provided parameters would differ:
private void InsertRandomNumbers(int[] randomNumbers)
{
    const string commandText = "INSERT INTO dbo.Table1 (Field1) VALUES (@field1);";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(commandText, connection))
    {
        var field1Parameter = new SqlParameter("@field1", SqlDbType.Int);
        command.Parameters.Add(field1Parameter);

        connection.Open();
        foreach (int randomNumber in randomNumbers)
        {
            field1Parameter.Value = randomNumber;
            /* int rowsAffected = */ command.ExecuteNonQuery();
        }
        connection.Close();
    }
}
Note how commandText is defined as a constant. This is a good indication that SQL Server will also recognise it as always the same command — with parameterized commands, the actual parameter values are provided separately — and SQL Server will only compile and optimize the statement once (and put the compiled statement in its cache so it can be reused subsequently) instead of doing the same thing over and over again. This alone should save a lot of time.
Long-running operations should be asynchronous so your UI doesn't freeze.
Another thing you can do is to shift the database code to a background thread, so that your application's UI won't freeze.
Let's assume that your database INSERT loop is currently triggered by a button doWorkButton. So the for loop is inside a button Click event handler:
private void doWorkButton_Click(object sender, EventArgs e)
{
    …
    for (int i = 0; i < 2000000; i++)
    {
        Data = …
        ExecuteQuery(Data);
        // note: I leave it as an exercise to you to combine the
        // above suggestion (parameterized queries) with this one.
    }
}
The first thing we do is to change your ExecuteQuery method in three ways:
Make it asynchronous. Its return type will be Task instead of void, and its declaration is augmented with the async keyword.
Rename it to ExecuteNonQueryAsync to reflect two things: that it will now be asynchronous, and that it doesn't perform queries. We don't actually expect to get back a result from the database INSERTs.
Rewrite the method body to use ADO.NET's asynchronous methods instead of their synchronous counterparts.
This is what it might end up looking like:
private async Task ExecuteNonQueryAsync(string commandText)
{
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(commandText, connection))
    {
        await connection.OpenAsync();
        /* int rowsAffected = */ await command.ExecuteNonQueryAsync();
        connection.Close();
    }
}
That's it. Now we need to modify the handler method to make it asynchronous as well:
private async void doWorkButton_Click(object sender, EventArgs e)
{
    try
    {
        …
        for (int i = 0; i < 2000000; i++)
        {
            Data = …
            await ExecuteNonQueryAsync(Data);
        }
    }
    catch
    {
        … // do not let any exceptions escape this handler method
    }
}
This should take care of the UI freezing business.
Millions of INSERT statements can be done more efficiently with SqlBulkCopy.
SQL Server has this wonderful option for bulk-inserting data. Making use of this feature often results in much better performance than doing your own batches of INSERT statements.
private async void doWorkButton_Click(object sender, EventArgs e)
{
    // Prepare the data to be loaded into your database table.
    // Note, this could be done more efficiently.
    var dataTable = new DataTable();
    {
        dataTable.Columns.Add("Field1", typeof(int));
        var rnd = new Random();
        for (int i = 0; i < 12000000; ++i)
        {
            dataTable.Rows.Add(rnd.Next(0, 2000000));
        }
    }

    using (var connection = new SqlConnection(connectionString))
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.Table1";
        try
        {
            // SqlBulkCopy needs an open connection when it is handed
            // a SqlConnection rather than a connection string.
            await connection.OpenAsync();

            // This will perform a bulk insert into the table
            // mentioned above, using the data passed in as a parameter.
            await bulkCopy.WriteToServerAsync(dataTable);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
I cannot test right now, but I hope this will get you started. If you want to improve the SqlBulkCopy-based solution further, I suggest that you create a custom implementation of IDataReader that creates "rows" containing the random data on-the-fly, then you pass an instance of that to SqlBulkCopy instead of a pre-populated data table. This means that you won't have to keep a huge data table in memory, but only one data row at a time.
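If you go that route, a rough sketch of such a reader might look like the following (hypothetical, untested code; only the members SqlBulkCopy is generally reported to need for a plain IDataReader are fleshed out, the rest simply throw):

using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

// Streams random int rows to SqlBulkCopy without materializing them all in memory.
sealed class RandomIntDataReader : IDataReader
{
    private readonly int _rowCount;
    private readonly Random _rnd = new Random();
    private int _produced;
    private int _current;

    public RandomIntDataReader(int rowCount) { _rowCount = rowCount; }

    public int FieldCount => 1;                 // single column: Field1

    public bool Read()
    {
        if (_produced >= _rowCount) return false;
        _current = _rnd.Next(0, 200000000);     // generate the row on the fly
        _produced++;
        return true;
    }

    public object GetValue(int i) => _current;
    public int GetOrdinal(string name) => 0;    // only "Field1" exists
    public bool IsDBNull(int i) => false;
    public string GetName(int i) => "Field1";
    public Type GetFieldType(int i) => typeof(int);
    public string GetDataTypeName(int i) => "int";
    public int GetInt32(int i) => _current;
    public int GetValues(object[] values) { values[0] = _current; return 1; }
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));

    public void Close() { }
    public void Dispose() { }
    public bool IsClosed => _produced >= _rowCount;
    public int Depth => 0;
    public int RecordsAffected => -1;
    public bool NextResult() => false;

    // Members below are not expected to be called in this narrow scenario.
    public DataTable GetSchemaTable() => throw new NotSupportedException();
    public bool GetBoolean(int i) => throw new NotSupportedException();
    public byte GetByte(int i) => throw new NotSupportedException();
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public char GetChar(int i) => throw new NotSupportedException();
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public IDataReader GetData(int i) => throw new NotSupportedException();
    public DateTime GetDateTime(int i) => throw new NotSupportedException();
    public decimal GetDecimal(int i) => throw new NotSupportedException();
    public double GetDouble(int i) => throw new NotSupportedException();
    public float GetFloat(int i) => throw new NotSupportedException();
    public Guid GetGuid(int i) => throw new NotSupportedException();
    public short GetInt16(int i) => throw new NotSupportedException();
    public long GetInt64(int i) => throw new NotSupportedException();
    public string GetString(int i) => throw new NotSupportedException();
}

It could then be passed to WriteToServerAsync in place of the pre-populated DataTable, so only one row lives in memory at a time:

using (var connection = new SqlConnection(connectionString))
using (var bulkCopy = new SqlBulkCopy(connection))
using (var reader = new RandomIntDataReader(12000000))
{
    bulkCopy.DestinationTableName = "dbo.Table1";
    await connection.OpenAsync();
    await bulkCopy.WriteToServerAsync(reader);
}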
When inserting data into the database using Parallel.ForEach, I get the following error after some amount of data has been inserted:
'The connection pool has been exhausted'
try
{
    var connection = ConfigurationManager.ConnectionStrings["Connection"].ConnectionString;
    Parallel.ForEach(Enumerable.Range(0, 1000), (_) =>
    {
        using (var connectio = new NpgsqlConnection(connection))
        {
            connectio.Open();
            using (var command = new NpgsqlCommand("fn_tetsdata", connectio) { CommandType = CommandType.StoredProcedure })
            {
                command.Parameters.AddWithValue("firstname", "test");
                command.Parameters.AddWithValue("lastname", "test");
                command.Parameters.AddWithValue("id", 10);
                command.Parameters.AddWithValue("designation", "test");
                command.ExecuteNonQuery();
            }
            connectio.Close();
        }
    });
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
}
Constrain the amount of parallelism with MaxDegreeOfParallelism; by default it could exceed the number of DB connections you have. Find a balance between parallelising your work and not killing the DB :)
Parallel.ForEach(yourListOfStuff,
new ParallelOptions { MaxDegreeOfParallelism = 10 },
stuff => { YourMethod(stuff); }
);
I assume you're using parallelism to improve performance. If that's the case, then first you need a baseline: run the 1,000 queries in serial, creating a new connection each time (which in reality just pulls one from the pool).
Then try it with the same connection object and see if the performance improves.
Then try it with the same command object, just changing the parameter values.
Then try it in parallel with the same connection so you're not creating 1,000 connection objects, which you've already tried.
I would be surprised if you got a significant performance improvement by using parallelism, since Parallel improves the performance of CPU-bound tasks, and data queries are generally much more bound by I/O than CPU.
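A rough sketch of the "same connection, same command, only the parameter values change" baseline, reusing the fn_tetsdata call and parameter names from the question (illustrative only, not tested against your schema):

var connString = ConfigurationManager.ConnectionStrings["Connection"].ConnectionString;

using (var conn = new NpgsqlConnection(connString))
{
    conn.Open();
    using (var command = new NpgsqlCommand("fn_tetsdata", conn) { CommandType = CommandType.StoredProcedure })
    {
        // Add the parameters once; only the values change between iterations.
        command.Parameters.AddWithValue("firstname", "test");
        command.Parameters.AddWithValue("lastname", "test");
        command.Parameters.AddWithValue("id", 10);
        command.Parameters.AddWithValue("designation", "test");

        var sw = System.Diagnostics.Stopwatch.StartNew();
        for (int i = 0; i < 1000; i++)
        {
            command.Parameters["id"].Value = i; // vary whatever actually changes per row
            command.ExecuteNonQuery();
        }
        sw.Stop();
        Console.WriteLine("Serial, single connection: " + sw.ElapsedMilliseconds + " ms");
    }
}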
I am using NpgsqlConnection inside a Parallel.ForEach, looping through inline queries in a List.
When I reach around 1,400+ queries I get an exception saying
'FATAL: 53300: remaining connection slots are reserved for non-replication superuser connections'.
I am using
Pooling=true;MinPoolSize=1;MaxPoolSize=1024;ConnectionLifeTime=1
in my app.config and con.Close(), con.ClearPool(), con.Dispose() in my code.
Parallel.ForEach(queries, query =>
{
    using (NpgsqlConnection con = new NpgsqlConnection(ConfigurationManager.ConnectionStrings["PSQL"].ConnectionString))
    {
        con.ClearPool();
        con.Open();
        //int count = 0;
        int queryCount = queries.Count;
        using (NpgsqlCommand cmd = con.CreateCommand())
        {
            cmd.CommandType = CommandType.Text;
            //cmd.CommandTimeout = 0;
            cmd.CommandText = query;
            cmd.ExecuteNonQuery();
            count += 1;
            this.label1.Invoke(new MethodInvoker(delegate { this.label1.Text = String.Format("Processing...\n{0} of {1}.\n{2}% completed.", count, queryCount, Math.Round(Decimal.Divide(count, queryCount) * 100, 2)); }));
        }
        con.Close();
        //con.Dispose();
        //con.ClearPool();
    }
});
You are hitting the max connection limit of PostgreSQL itself:
http://www.postgresql.org/docs/9.4/static/runtime-config-connection.html#GUC-MAX-CONNECTIONS
Your parallel queries are opening a lot of connections and the server isn't able to handle them. By default, PostgreSQL is configured to allow 100 concurrent connections. You could try increasing this value (max_connections) in your postgresql.conf file.
Another option is to limit the pool size of Npgsql to a lower number. Your concurrent queries would wait when the max pool size is reached.
Also, don't call ClearPool, as it adds overhead to the pool logic and you wouldn't benefit from the pool at all. If you really don't want pooling, try setting Pooling=false in your connection string instead.
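For example, a connection string along these lines (a sketch; 20 is an arbitrary cap well below PostgreSQL's default max_connections of 100) would make excess parallel iterations wait for a pooled connection instead of piling up server sessions:

Pooling=true;MinPoolSize=1;MaxPoolSize=20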
I hope it helps.
I have a table "Links" with some download links.
My .NET application reads this table, takes the link, creates a web client and downloads the associated file.
I want to create several threads that do this, but each one should read a different record; otherwise two threads might try to download the same file.
How can I do this?
I have tried this but it doesn't work:
public static Boolean Get_NextProcessingVideo(ref Int32 idVideo, ref String youtubeId, ref String title)
{
    Boolean result = false;
    using (NpgsqlConnection conn = new NpgsqlConnection(ConfigurationDB.GetInstance().ConnectionString))
    {
        conn.Open();
        NpgsqlTransaction transaction = conn.BeginTransaction();
        String query = "BEGIN WORK; LOCK TABLE links IN ACCESS EXCLUSIVE MODE; SELECT v.idlink, v.title " +
                       " FROM video v WHERE v.schedulingflag IS FALSE AND v.errorflag IS FALSE ORDER BY v.idvideo LIMIT 1; " +
                       " COMMIT WORK;";
        NpgsqlCommand cmd = new NpgsqlCommand(query, conn, transaction);
        NpgsqlDataReader dr = cmd.ExecuteReader();
        if (dr.HasRows)
        {
            dr.Read();
            idVideo = Convert.ToInt32(dr["idvideo"]);
            title = dr["title"].ToString();
            Validate_Scheduling(idVideo);
            result = true;
        }
        transaction.Commit();
        conn.Close();
    }
    return result;
}
You have a few options here. The one thing you don't want to be doing, as you note, is locking the table.
Advisory locks. The advantage is that these are extra-transactional. The disadvantage is that they are not released at the end of the transaction and must be released explicitly, and leakage can eventually cause problems (essentially a shared memory leak on the back-end). Generally speaking I do not like extra-transactional locks like this, and while advisory locks are cleared when the db session ends, there are still possible issues with stale locks.
You can have a dedicated thread pull the pending files first, and then delegate specific retrievals to child threads. This is probably the best approach both in terms of db round-trips and simplicity of operation. I would expect this to perform best of any of these solutions.
You can SELECT FOR UPDATE NOWAIT in a stored procedure which can do the exception handling (a rough sketch follows below). See Select unlocked row in Postgresql for an example.
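A rough sketch of option 3 done from the client side rather than a stored procedure, using the column names from the question (video, idvideo, title, schedulingflag, errorflag are assumed to match your schema); the row lock is only held for the duration of the short claiming transaction:

public static bool TryClaimNextVideo(out int idVideo, out string title)
{
    idVideo = 0;
    title = null;
    using (var conn = new NpgsqlConnection(ConfigurationDB.GetInstance().ConnectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            try
            {
                // Lock a single pending row; NOWAIT raises an error (SQLSTATE 55P03)
                // instead of waiting if another worker already holds the row lock.
                var select = new NpgsqlCommand(
                    "SELECT idvideo, title FROM video " +
                    "WHERE schedulingflag IS FALSE AND errorflag IS FALSE " +
                    "ORDER BY idvideo LIMIT 1 FOR UPDATE NOWAIT;", conn, tx);

                using (var dr = select.ExecuteReader())
                {
                    if (!dr.Read())
                        return false; // nothing left to process
                    idVideo = Convert.ToInt32(dr["idvideo"]);
                    title = dr["title"].ToString();
                }

                // Mark the row as scheduled while the lock is still held.
                var update = new NpgsqlCommand(
                    "UPDATE video SET schedulingflag = TRUE WHERE idvideo = @id;", conn, tx);
                update.Parameters.AddWithValue("id", idVideo);
                update.ExecuteNonQuery();

                tx.Commit();
                return true;
            }
            catch (NpgsqlException)
            {
                // Another worker holds the lock (or the query failed); let the caller retry.
                tx.Rollback();
                return false;
            }
        }
    }
}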