NpgsqlCopyIn fails with a timeout ("CommandTimeout" setting ignored) - C#

I have a fairly large dataset (900K records, 140 MB on disk) stored in a CSV file in a client app (.NET 4.0). I need to load this data into a Postgres 9 database as fast as possible. I use the Npgsql "NpgsqlCopyIn" technique (Npgsql library version 2.1.0).
For a trial load (138K records) the insertion works fine - it takes about 7 seconds.
But for the whole batch (900K), the code throws a timeout exception:
"ERROR: 57014: canceling statement due to statement timeout"
The stack trace is:
at Npgsql.NpgsqlState.d_9.MoveNext()
at Npgsql.NpgsqlState.ProcessAndDiscardBackendResponses(NpgsqlConnector context)
at Npgsql.NpgsqlCopyInState.SendCopyDone(NpgsqlConnector context)
at Npgsql.NpgsqlCopyInState.StartCopy(NpgsqlConnector context, NpgsqlCopyFormat copyFormat)
at Npgsql.NpgsqlState.d_9.MoveNext()
at Npgsql.NpgsqlState.ProcessAndDiscardBackendResponses(NpgsqlConnector context)
at Npgsql.NpgsqlConnector.ProcessAndDiscardBackendResponses()
at Npgsql.NpgsqlCommand.ExecuteBlind()
at Npgsql.NpgsqlCopyIn.Start()
I tried setting CommandTimeout to large values (>7200) and to zero, and tried the same values for the connection's "Timeout" parameter. I also tried setting "CommandTimeout" via the connection string, but still with no result - "ERROR 57014" comes up again and again.
Please help me load the batch correctly!
Here is the code I use:
private static void pgBulkCopy(string connection_string, FileInfo fiDataFile)
{
    using (Npgsql.NpgsqlConnection con = new Npgsql.NpgsqlConnection(connection_string))
    using (FileStream ifs = new FileStream(fiDataFile.FullName, FileMode.Open, FileAccess.Read))
    {
        con.Open();

        string queryString = "COPY schm.Addresses(FullAddress,lat,lon) FROM STDIN;";
        NpgsqlCommand cmd = new NpgsqlCommand(queryString, con);
        cmd.CommandTimeout = 7200; // 7200 sec = 120 min = 2 hours

        NpgsqlCopyIn copyIn = new NpgsqlCopyIn(cmd, con, ifs);
        try
        {
            copyIn.Start();
            copyIn.End();
        }
        catch (Exception ex)
        {
            Console.WriteLine("[DB] pgBulkCopy error: " + ex.Message);
        }
        finally
        {
            con.Close();
        }
    }
}

Npgsql has a bug regarding command timeout and NpgsqlCopyIn handling.
You can test our current master, which includes a lot of fixes for command timeout handling.
You can download a copy of the project from our GitHub page: https://github.com/npgsql/Npgsql/archive/master.zip
Please give it a try and let us know if it works for you.
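One possible workaround in the meantime (my own hedged suggestion, not something the Npgsql team confirmed in this thread): the reported error, "canceling statement due to statement timeout", is raised by PostgreSQL's statement_timeout setting, so it can be disabled for the session before the COPY starts. Note that older Npgsql versions may reapply their own timeout afterwards, so this may not help in every case. The snippet reuses con, cmd, and ifs from the question's method:

// Hedged sketch: turn off the server-side statement timeout for this session
// before starting the COPY (0 means "no timeout" in PostgreSQL).
using (var noTimeout = new NpgsqlCommand("SET statement_timeout = 0;", con))
{
    noTimeout.ExecuteNonQuery();
}

NpgsqlCopyIn copyIn = new NpgsqlCopyIn(cmd, con, ifs);
copyIn.Start();
copyIn.End();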

Related

ASP.NET MVC kills my IIS - too many requests

I'm facing a big issue. I've built software that over 100 users now use at once.
Since then my ASP.NET MVC application keeps dying; IIS crashes 30-40 times a day.
I don't have any recursive code.
The main problem is that all users fetch a boolean that tells them whether they need to get new data.
That fetch HTTP method is called 1-5 times a second by different users.
But the SQL reader is slower than the incoming requests.
Error Message: There is already an open DataReader associated with this Command which must be closed first.
or
Error Message: Internal connection fatal error. Error state: 15, Token : 97
My method:
[HttpGet]
[Route("fetch")]
public IHttpActionResult fetch()
{
    SqlConnection connection = new SqlConnection(con);
    string sql = "SELECT whentime FROM did_datachange WHERE ux_admin_id = " + id;
    connection.Open();
    using (SqlCommand command = new SqlCommand(sql, connection))
    {
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                if (reader.FieldCount == 1)
                    dateTime = reader.GetDateTime(0);
            }
        }
    }
    connection.Close();
    ....following code...
}
I know that a lock could solve the problem, but using a lock slows down the code significantly. What can I do?
It turned out to be bad code (multithreading that threw a lot of exceptions).
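As a hedged sketch (my own illustration, not from the original thread): errors like "There is already an open DataReader associated with this Command" usually point to a connection or command shared across concurrent requests. Opening a short-lived connection per request, disposing it with using, and parameterizing the query avoids that sharing. The names con and id are taken from the question and assumed to be available:

[HttpGet]
[Route("fetch")]
public IHttpActionResult Fetch()
{
    DateTime? dateTime = null;

    // A new connection per request: nothing is shared between concurrent calls.
    using (var connection = new SqlConnection(con))
    using (var command = new SqlCommand(
        "SELECT whentime FROM did_datachange WHERE ux_admin_id = @id", connection))
    {
        command.Parameters.AddWithValue("@id", id); // parameterized instead of concatenated
        connection.Open();

        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                dateTime = reader.GetDateTime(0);
            }
        }
    } // connection closed and disposed here, even if an exception is thrown

    return Ok(dateTime);
}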

Can't select more than 700000 rows from SQL Server using C#

I can't fetch more than 700000 rows from SQL Server using C# - I get an "out of memory" exception. Please help me out.
This is my code:
using (SqlConnection sourceConnection = new SqlConnection(constr))
{
    sourceConnection.Open();
    SqlCommand commandSourceData = new SqlCommand("select * from XXXX ", sourceConnection);
    reader = commandSourceData.ExecuteReader();
}
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(constr2))
{
    bulkCopy.DestinationTableName = "destinationTable";
    try
    {
        // Write from the source to the destination.
        bulkCopy.WriteToServer(reader);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
    finally
    {
        reader.Close();
    }
}
I made a small console app based on the given Solution 1, but it ends up with the same exception. I have also posted screenshots of the process memory before and after processing; after adding the command timeout on the read side, RAM peaks.
That code should not cause an OOM exception. When you pass a DataReader to SqlBulkCopy.WriteToServer you are streaming the rows from the source to the destination. Somewhere else you are retaining stuff in memory.
SqlBulkCopy.BatchSize controls how often SQL Server commits the rows loaded at the destination, limiting the lock duration and the log file growth (if not minimally logged and in simple recovery mode). Whether you use one batch or not should have no impact on the amount of memory used either in SQL Server or in the client.
Here's a sample that copies 10M rows without growing memory:
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace SqlBulkCopyTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var src = "server=localhost;database=tempdb;integrated security=true";
            var dest = src;
            var sql = "select top (1000*1000*10) m.* from sys.messages m, sys.messages m2";
            var destTable = "dest";

            using (var con = new SqlConnection(dest))
            {
                con.Open();
                var cmd = con.CreateCommand();
                cmd.CommandText = $"drop table if exists {destTable}; with q as ({sql}) select * into {destTable} from q where 1=2";
                cmd.ExecuteNonQuery();
            }

            Copy(src, dest, sql, destTable);

            Console.WriteLine("Complete. Hit any key to exit.");
            Console.ReadKey();
        }

        static void Copy(string sourceConnectionString, string destinationConnectionString, string query, string destinationTable)
        {
            using (SqlConnection sourceConnection = new SqlConnection(sourceConnectionString))
            {
                sourceConnection.Open();
                SqlCommand commandSourceData = new SqlCommand(query, sourceConnection);
                var reader = commandSourceData.ExecuteReader();

                using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnectionString))
                {
                    bulkCopy.BulkCopyTimeout = 60 * 10;
                    bulkCopy.DestinationTableName = destinationTable;
                    bulkCopy.NotifyAfter = 10000;
                    bulkCopy.SqlRowsCopied += (s, a) =>
                    {
                        var mem = GC.GetTotalMemory(false);
                        Console.WriteLine($"{a.RowsCopied:N0} rows copied. Memory {mem:N0}");
                    };

                    // Write from the source to the destination.
                    bulkCopy.WriteToServer(reader);
                }
            }
        }
    }
}
Which outputs:
. . .
9,830,000 rows copied. Memory 1,756,828
9,840,000 rows copied. Memory 798,364
9,850,000 rows copied. Memory 4,042,396
9,860,000 rows copied. Memory 3,092,124
9,870,000 rows copied. Memory 2,133,660
9,880,000 rows copied. Memory 1,183,388
9,890,000 rows copied. Memory 3,673,756
9,900,000 rows copied. Memory 1,601,044
9,910,000 rows copied. Memory 3,722,772
9,920,000 rows copied. Memory 1,642,052
9,930,000 rows copied. Memory 3,763,780
9,940,000 rows copied. Memory 1,691,204
9,950,000 rows copied. Memory 3,812,932
9,960,000 rows copied. Memory 1,740,356
9,970,000 rows copied. Memory 3,862,084
9,980,000 rows copied. Memory 1,789,508
9,990,000 rows copied. Memory 3,903,044
10,000,000 rows copied. Memory 1,830,468
Complete. Hit any key to exit.
NB: Per DavidBrowne's answer, it seems I'd misunderstood how the batching of the SqlBulkCopy class works. The refactored code may still be useful to you, so I've not deleted this answer (as the code is still valid), but the answer is not to set the BatchSize as I'd believed. Please see David's answer for an explanation.
Try something like this; the key being setting the BatchSize property to limit how many rows you deal with at once:
using (SqlConnection sourceConnection = new SqlConnection(constr))
{
    sourceConnection.Open();
    SqlCommand commandSourceData = new SqlCommand("select * from XXXX ", sourceConnection);

    // Add a using statement for your reader so you don't need to worry about close/dispose.
    // Keep the connection open or we'll be trying to read from a closed connection.
    using (SqlDataReader reader = commandSourceData.ExecuteReader())
    {
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(constr2))
        {
            // Write a few pages at a time rather than all at once, lowering the memory impact.
            // See https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlbulkcopy.batchsize?view=netframework-4.7.2
            bulkCopy.BatchSize = 1000;
            bulkCopy.DestinationTableName = "destinationTable";
            try
            {
                // Write from the source to the destination.
                bulkCopy.WriteToServer(reader);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                // We've caught the top-level Exception rather than something specific,
                // so once we've logged it, rethrow it for a proper handler up the call stack.
                throw;
            }
        }
    }
}
Note that because the SqlBulkCopy class takes an IDataReader as an argument we don't need to download the full data set. Instead, the reader gives us a way to pull back records as required (hence us leaving the connection open after creating the reader). When we call the SqlBulkCopy's WriteToServer method, internally it has logic to loop multiple times, selecting BatchSize new records from the reader, then pushing those to the destination table before repeating / completing once the reader has sent all pending records. This works differently to, say, a DataTable, where we'd have to populate the data table with the full set of records, rather than being able to read more back as required.
One potential risk of this approach is, because we have to keep the connection open, any locks on our source are kept in place until we close our reader. Depending on the isolation level and whether other queries are trying to access the same records, this may cause blocking; whilst the data table approach would have taken a one-off copy of the data into memory and then closed the connection, avoiding any blocks. If this blocking is a concern you should look at changing the isolation level of your query, or applying hints... Exactly how you approach that would depend on the requirements though.
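For illustration only (my own sketch, not part of the original answer): if dirty reads are acceptable for the source query, a table hint keeps the long-lived reader from blocking other writers; whether that trade-off is acceptable depends on your data.

// Hypothetical example: allow dirty reads on the source so the open reader
// does not hold shared locks for the duration of the bulk copy.
SqlCommand commandSourceData = new SqlCommand(
    "select * from XXXX with (nolock)", sourceConnection);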
NB: In reality, instead of running the above code as is, you'd want to refactor things a bit, so the scope of each method is contained. That way you can reuse this logic to copy other queries to other tables.
You'd also want to make the batch size configurable rather than hard-coded so you can adjust to a value that gives a good balance of resource usage vs performance (which will vary based on the host's resources).
You may also want to use async methods, to allow other parts of your program to progress whilst you're waiting on data to flow from/to your databases.
Here's a slightly amended version:
public async Task<SqlDataReader> ExecuteReaderAsync(string connectionString, string query)
{
    SqlConnection connection = null;
    SqlCommand command = null;
    try
    {
        // Not in a using block, as we want to keep the connection open until our reader's finished with it.
        connection = new SqlConnection(connectionString);
        connection.Open();
        command = new SqlCommand(query, connection);
        // Tell our reader to close the connection when done.
        return await command.ExecuteReaderAsync(CommandBehavior.CloseConnection);
    }
    catch
    {
        // If we hit an issue before we've returned our reader, dispose of our objects here,
        command?.Dispose();
        connection?.Dispose();
        // then rethrow the exception.
        throw;
    }
}
public async Task CopySqlDataAsync(string sourceConnectionString, string sourceQuery, string destinationConnectionString, string destinationTableName, int batchSize)
{
    using (var reader = await ExecuteReaderAsync(sourceConnectionString, sourceQuery))
        await CopySqlDataAsync(reader, destinationConnectionString, destinationTableName, batchSize);
}

public async Task CopySqlDataAsync(IDataReader sourceReader, string destinationConnectionString, string destinationTableName, int batchSize)
{
    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnectionString))
    {
        bulkCopy.BatchSize = batchSize;
        bulkCopy.DestinationTableName = destinationTableName;
        await bulkCopy.WriteToServerAsync(sourceReader);
    }
}

public void CopySqlDataExample()
{
    try
    {
        var constr = "";  //todo: define connection string; ideally pulling from config
        var constr2 = ""; //todo: define connection string #2; ideally pulling from config
        var batchSize = 1000; //todo: replace hardcoded batch size with a value from config

        var task = CopySqlDataAsync(constr, "select * from XXXX", constr2, "destinationTable", batchSize);
        task.Wait(); // waits for the task to complete; any exceptions surface as an AggregateException
    }
    catch (AggregateException es)
    {
        var e = es.InnerExceptions[0]; // get the wrapped exception
        Console.WriteLine(e.Message);
        //throw; // to rethrow the AggregateException
        ExceptionDispatchInfo.Capture(e).Throw(); // to rethrow the wrapped exception
    }
}
Something went horribly wrong in your design if you even try to process 700k rows in C#. That you fail at this is to be expected.
If this is data retrieval for display: there is no way the user will be able to process that amount of data, and filtering down from 700k rows in the GUI is just a waste of time and bandwidth. 25-100 fields at once is about the limit. Do filtering or pagination on the query side so you do not end up retrieving orders of magnitude more than you can actually process.
If this is some form of bulk insert or bulk modification: do that kind of operation in SQL Server, not in your code. Retrieving, processing in C#, and then posting back just adds layers of overhead. If you add the two-way network transfer, you will easily triple the time this takes.
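For illustration, here is a hedged sketch of the "do it in SQL Server" approach (my own example, not from the original answer; it assumes both tables are reachable from a single connection and reuses the placeholder names constr, XXXX, and destinationTable from the question). One set-based statement moves the rows server-side, so nothing flows through the client:

// Hypothetical example: copy the rows with a single set-based statement.
using (var connection = new SqlConnection(constr))
using (var command = new SqlCommand(
    "INSERT INTO destinationTable SELECT * FROM XXXX;", connection))
{
    command.CommandTimeout = 600; // a long-running set-based copy may need a higher timeout
    connection.Open();
    int rows = command.ExecuteNonQuery(); // rows inserted by the statement
    Console.WriteLine($"{rows:N0} rows copied server-side.");
}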

"An operation is already in progress" on DataAdaper.Fill

I am trying Postgres Plus 9.5 with .NET 4.5 and the Npgsql 3.1.6 NuGet package.
I have read what is written here about this error, but I do not understand why I get it. Everything is disposed. Here is the code:
public override DataTable getActListData(int FunkNr)
{
    using (var cmd = new NpgsqlCommand())
    {
        cmd.CommandText = npgsqlCommand3.CommandText;
        cmd.Connection = this.npgsqlConnection;
        cmd.Parameters.Add(new Npgsql.NpgsqlParameter("ANWENDUNG", NpgsqlTypes.NpgsqlDbType.Numeric));
        cmd.Parameters.Add(new Npgsql.NpgsqlParameter("XFUNKNR", NpgsqlTypes.NpgsqlDbType.Numeric));

        using (var da = new NpgsqlDataAdapter(npgsqlCommand3))
        {
            var tab = new DataTable();
            da.SelectCommand.Parameters["ANWENDUNG"].Value = getAnwendung();
            da.SelectCommand.Parameters["XFUNKNR"].Value = FunkNr;
            da.Fill(tab); // Here is the error on the 5th call
            return tab;
        }
    }
}
Does this problem come from Npgsql or from Postgres?
Some other questions:
I have read here that lazy loading is impossible, but I didn't understand whether that is because of Npgsql or because of Postgres.
Is it possible in Postgres to open several cursors and read on demand over the same connection?
Edit: Changed the code:
using (var npgsqlConnection = new NpgsqlConnection())
{
    ConnectionString = string.Format(DataClientFactory.DataBaseConnectString, DB, User, PW);
    npgsqlConnection.ConnectionString = ConnectionString;
    npgsqlConnection.Open();
    ....
    the code above here
    ....
}
The same error in the same call. The error:
System.InvalidOperationException occurred
  HResult=-2146233079
  Message=An operation is already in progress.
  Source=Npgsql
  StackTrace:
       at Npgsql.NpgsqlConnector.StartUserAction(ConnectorState newState)
  InnerException:
If the piece of code you posted runs concurrently on the same connection, then that's your problem - Npgsql connections aren't thread-safe, and it's not possible to have multiple readers opened at the same time (MARS).
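A minimal sketch of the non-shared pattern this answer implies (my own illustration; getActListData, npgsqlCommand3, getAnwendung, and ConnectionString are assumed to exist as in the question): give each call its own connection instead of reusing this.npgsqlConnection, so two overlapping calls can never run on the same connector:

public override DataTable getActListData(int FunkNr)
{
    // One connection per call: no shared connector, so overlapping calls
    // cannot trigger "An operation is already in progress".
    using (var con = new NpgsqlConnection(ConnectionString))
    using (var cmd = new NpgsqlCommand(npgsqlCommand3.CommandText, con))
    {
        cmd.Parameters.Add(new NpgsqlParameter("ANWENDUNG", NpgsqlTypes.NpgsqlDbType.Numeric));
        cmd.Parameters.Add(new NpgsqlParameter("XFUNKNR", NpgsqlTypes.NpgsqlDbType.Numeric));
        cmd.Parameters["ANWENDUNG"].Value = getAnwendung();
        cmd.Parameters["XFUNKNR"].Value = FunkNr;

        using (var da = new NpgsqlDataAdapter(cmd))
        {
            var tab = new DataTable();
            con.Open();
            da.Fill(tab);
            return tab;
        }
    }
}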

Fatal error encountered attempting to read the resultset

I'm attempting to read a file into my database, but I'm receiving the above-mentioned error. After checking that all the information was correct (i.e. nothing being put into the database that doesn't have a column, and so on), I still receive the error. Next I ran the statement in MySQL Workbench and it works fine. Now I'm not sure where to look next; I can't dig into the error any deeper, since it is thrown on ExecuteNonQuery. The file that I'm importing is 3000 lines; when I shorten the file it succeeds. I guess the first question to ask is: is there some sort of maximum connection time, and if so, do I need to shorten my file or lengthen the time it can stay connected? Any advice would be great, thanks. Also, I use ',,' to terminate the fields because some of the data can contain a ','.
public void sendQuery(string filePath, int month)
{
    conn.Open();

    string monthName;
    monthName = CultureInfo.CurrentCulture.DateTimeFormat.GetMonthName(month);
    monthName = monthName.ToLower();
    monthName = monthName.Substring(0, 3);

    try
    {
        command.CommandText = "SET SQL_SAFE_UPDATES = 0;" +
            "load data local infile 'C:/Users/mem-joshuad/Desktop/temp.txt' into table " + monthName + " fields terminated by ',,' lines terminated by '\r\n'";
        command.ExecuteNonQuery();
    }
    catch (MySqlException ex)
    {
        ScreenNavigation n = new ScreenNavigation();
        n.homeScreen(ex.Message);
    }

    conn.Close();
}
All I needed to do was lengthen the default timeout. See the comments for the process.
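As an illustration (a hedged sketch; the exact value and whether it was set in code or in the connection string are not stated in the thread), the timeout for a long-running LOAD DATA can be raised on the command itself, and Connector/NET also accepts a default command timeout in the connection string:

// Hypothetical example: give the LOAD DATA statement more time to finish.
command.CommandTimeout = 600; // seconds; the default for MySqlCommand is much lower

// Alternatively (assumed Connector/NET connection-string option):
// "server=...;database=...;uid=...;pwd=...;default command timeout=600"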

Sqlite database locked

I'm using ASP.NET (C#) to upload an SQLite database to a server, and then I do some inserting and updating. The problem is that sometimes (I think when something goes wrong with the updating) the database gets locked. So the next time I try to upload a file it's still locked, and I get an error saying "The process cannot access the file because it is being used by another process". Maybe the database file isn't disposed if something goes wrong during the transaction? The only thing that solves this problem is restarting the server.
How can I solve it in my code so I can be sure it's always unlocked, even if something goes wrong?
This is my code:
try
{
    string filepath = Server.MapPath("~/files/db.sql");

    // Gets the file and saves it on the server
    ((HttpPostedFile)HttpContext.Current.Request.Files["sqlitedb"]).SaveAs(filepath);

    // Open the database
    SQLiteConnection conn = new SQLiteConnection("Data Source=" + filepath + ";Version=3;");
    conn.Open();
    SQLiteCommand cmd = new SQLiteCommand(conn);

    using (SQLiteTransaction transaction = conn.BeginTransaction())
    {
        using (cmd)
        {
            // Here I do some stuff to the database: update, insert, etc.
        }
        transaction.Commit();
    }

    conn.Close();
    cmd.Dispose();
}
catch (Exception exp)
{
    // Error
}
You could try placing the Connection in a using block as well, or calling Dispose on it:
//Open the database
// Open the database
using (SQLiteConnection conn = new SQLiteConnection("Data Source=" + filepath + ";Version=3;"))
{
    conn.Open();
    using (SQLiteCommand cmd = new SQLiteCommand(conn))
    {
        using (SQLiteTransaction transaction = conn.BeginTransaction())
        {
            // Here I do some stuff to the database: update, insert, etc.
            transaction.Commit();
        }
    }
}
This will ensure that you're disposing of the connection objects correctly (at the moment you're not; you're only closing the connection).
Wrapping them in using blocks ensures that Dispose is called even if an exception happens - it's effectively the same as writing:
// Create connection, command, etc. objects.
SQLiteConnection conn = null;
try
{
    conn = new SQLiteConnection("Data Source=" + filepath + ";Version=3;");
    // Do stuff here...
}
catch (Exception e)
{
    // Although there are arguments to say don't catch generic exceptions,
    // but instead catch each explicit exception you can handle.
}
finally
{
    // Check for null, and if not, close and dispose.
    if (null != conn)
        conn.Dispose();
}
The code in the finally block is going to be called regardless of the exception, and helps you clean up.
An ASP.NET application is multithreaded on the server.
You can't do simultaneous writing (insert, select, update...) because the whole database is locked. Simultaneous selecting is allowed when no writing is happening.
You should use the .NET ReaderWriterLock class: http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlock.aspx
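A minimal sketch of that suggestion (my own illustration, not from the original answer; requires using System.Threading): share one ReaderWriterLock across requests and take the writer lock around the upload/update work, so only one request touches the SQLite file at a time:

// Hypothetical example: serialize writes to the SQLite file across requests.
private static readonly ReaderWriterLock dbLock = new ReaderWriterLock();

private void WriteToDatabase(string filepath)
{
    dbLock.AcquireWriterLock(TimeSpan.FromSeconds(30)); // wait up to 30s for exclusive access
    try
    {
        using (var conn = new SQLiteConnection("Data Source=" + filepath + ";Version=3;"))
        {
            conn.Open();
            using (var transaction = conn.BeginTransaction())
            using (var cmd = new SQLiteCommand(conn))
            {
                // insert / update work goes here
                transaction.Commit();
            }
        }
    }
    finally
    {
        dbLock.ReleaseWriterLock(); // always released, even if the transaction fails
    }
}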
Shouldn't you do cmd.Dispose() before conn.Close()? I don't know if it makes any difference, but you generally want to clean things up in the opposite of initialization order.
In short, SQLite handles unmanaged resources slightly differently than other providers. You'll have to explicitly dispose the command (which seems to work even if you are working with the reader outside of the using() block).
Read this thread for more flavor:
http://sqlite.phxsoftware.com/forums/p/909/4164.aspx
