Lock table in PostgreSQL - C#

I have a table "Links" with some download links.
My .NET application reads this table, takes the link, creates a web client and downloads the associated file.
I want to create several threads that do this, but each one should read a different record, otherwise two threads are trying to download the same file.
How can I do this?
I have tried this but it doesn't work:
public static Boolean Get_NextProcessingVideo(ref Int32 idVideo, ref String youtubeId, ref String title)
{
    Boolean result = false;
    using (NpgsqlConnection conn = new NpgsqlConnection(ConfigurationDB.GetInstance().ConnectionString))
    {
        conn.Open();
        NpgsqlTransaction transaction = conn.BeginTransaction();
        String query = "BEGIN WORK; LOCK TABLE links IN ACCESS EXCLUSIVE MODE; SELECT v.idlink, v.title " +
            " FROM video v WHERE v.schedulingflag IS FALSE AND v.errorflag IS FALSE ORDER BY v.idvideo LIMIT 1; " +
            " COMMIT WORK;";
        NpgsqlCommand cmd = new NpgsqlCommand(query, conn, transaction);
        NpgsqlDataReader dr = cmd.ExecuteReader();
        if (dr.HasRows)
        {
            dr.Read();
            idVideo = Convert.ToInt32(dr["idvideo"]);
            title = dr["title"].ToString();
            Validate_Scheduling(idVideo); //
            result = true;
        }
        transaction.Commit();
        conn.Close();
    }
    return result;
}

You have a few options here. The one thing you don't want to be doing, as you note, is locking the table.
Advisory locks. The advantage is that these are extra-transactional. The disadvantage is that they are not released at the end of the transaction and must be released explicitly, and that leaked locks can eventually cause problems (essentially a shared memory leak on the back end). Generally speaking I do not like extra-transactional locks like this, and while advisory locks are cleared when the db session ends, there are still possible issues with stale locks.
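If you do choose the advisory-lock option despite those caveats, the pattern looks roughly like this (a sketch with Npgsql, assuming an already-open NpgsqlConnection conn and the idvideo value from the question as the lock key; the lock and unlock must happen on the same connection/session):
// pg_try_advisory_lock returns immediately with true/false instead of blocking,
// and the lock is held until pg_advisory_unlock is called or the session ends.
bool gotLock;
using (var cmd = new NpgsqlCommand("SELECT pg_try_advisory_lock(@key)", conn))
{
    cmd.Parameters.AddWithValue("key", (long)idVideo);
    gotLock = (bool)cmd.ExecuteScalar();
}
if (!gotLock)
    return false; // another worker already owns this record
try
{
    // ... download the file and update the row here ...
}
finally
{
    using (var unlock = new NpgsqlCommand("SELECT pg_advisory_unlock(@key)", conn))
    {
        unlock.Parameters.AddWithValue("key", (long)idVideo);
        unlock.ExecuteScalar();
    }
}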
You can have a dedicated thread pull the pending files first, and then delegate specific retrievals to child threads. This is probably the best approach both in terms of db round-trips and simplicity of operation. I would expect that this would perform best of any of the solutions.
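A rough sketch of that dispatcher idea, assuming a hypothetical GetPendingLinks() helper that reads all unscheduled rows in one query and a hypothetical DownloadAndMarkDone() helper for the per-file work; only the dispatcher queries the table, so two workers can never pick the same record:
// Requires System.Collections.Concurrent, System.Linq and System.Threading.Tasks.
var queue = new BlockingCollection<PendingLink>(); // PendingLink: hypothetical (id, url) pair
// Dispatcher: a single task fills the queue from one SELECT.
Task.Factory.StartNew(() =>
{
    foreach (var link in GetPendingLinks())
        queue.Add(link);
    queue.CompleteAdding();
});
// Workers: each consumer takes distinct items off the queue and downloads them.
var workers = Enumerable.Range(0, 4).Select(_ => Task.Factory.StartNew(() =>
{
    foreach (var link in queue.GetConsumingEnumerable())
        DownloadAndMarkDone(link);
})).ToArray();
Task.WaitAll(workers);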
You can use SELECT ... FOR UPDATE NOWAIT in a stored procedure, which can handle the locking exception for you. See Select unlocked row in Postgresql for an example.
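For illustration, here is a minimal sketch of the row-claiming approach with Npgsql, assuming the table/column names from the question (video, idvideo, title, schedulingflag, errorflag). FOR UPDATE SKIP LOCKED needs PostgreSQL 9.5 or newer; on older servers use FOR UPDATE NOWAIT and catch the lock failure instead.
public static bool TryClaimNextVideo(out int idVideo, out string title)
{
    idVideo = 0;
    title = null;
    using (var conn = new NpgsqlConnection(ConfigurationDB.GetInstance().ConnectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            const string sql =
                "SELECT idvideo, title FROM video " +
                "WHERE schedulingflag IS FALSE AND errorflag IS FALSE " +
                "ORDER BY idvideo LIMIT 1 FOR UPDATE SKIP LOCKED";
            using (var cmd = new NpgsqlCommand(sql, conn, tx))
            using (var dr = cmd.ExecuteReader())
            {
                if (!dr.Read())
                    return false; // nothing left to claim
                idVideo = dr.GetInt32(0);
                title = dr.GetString(1);
            }
            // Mark the row as scheduled inside the same transaction, so the row
            // lock is held until the flag becomes visible to other workers.
            using (var upd = new NpgsqlCommand(
                "UPDATE video SET schedulingflag = TRUE WHERE idvideo = @id", conn, tx))
            {
                upd.Parameters.AddWithValue("id", idVideo);
                upd.ExecuteNonQuery();
            }
            tx.Commit();
            return true;
        }
    }
}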

Related

Cancelling SQL Query in background in .NET

I'm designing a small desktop app that fetches data from SQL Server. I used a BackgroundWorker to make the query execute in the background. The code that fetches data generally comes down to this:
public static DataTable GetData(string sqlQuery)
{
    DataTable t = new DataTable();
    using (SqlConnection c = new SqlConnection(GetConnectionString()))
    {
        c.Open();
        using (SqlCommand cmd = new SqlCommand(sqlQuery))
        {
            cmd.Connection = c;
            using (SqlDataReader r = cmd.ExecuteReader())
            {
                t.Load(r);
            }
        }
    }
    return t;
}
Since the query can take 10-15 minutes I want to implement a cancellation request and pass it from the GUI layer to the DAL. The cancellation mechanism of BackgroundWorker won't let me cancel SqlCommand.ExecuteReader(), because it only returns when the data has been fetched from the server or an exception is thrown by the data provider.
I tried to use Task and async/await with SqlCommand.ExecuteReaderAsync(CancellationToken) but I am confused about where to use it in a multi-layer app (GUI -> BLL -> DAL).
Have you tried using the SqlCommand.Cancel() method?
Approach: encapsulate that GetData method in a Thread/Worker, and when you cancel/stop that thread, call the Cancel() method on the SqlCommand that is being executed.
Here is an example of how to use it from another thread:
using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading;

class Program
{
    private static SqlCommand m_rCommand;

    public static SqlCommand Command
    {
        get { return m_rCommand; }
        set { m_rCommand = value; }
    }

    public static void Thread_Cancel()
    {
        Command.Cancel();
    }

    static void Main()
    {
        string connectionString = GetConnectionString();
        try
        {
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                connection.Open();
                Command = connection.CreateCommand();
                Command.CommandText = "DROP TABLE TestCancel";
                try
                {
                    Command.ExecuteNonQuery();
                }
                catch { }
                Command.CommandText = "CREATE TABLE TestCancel(co1 int, co2 char(10))";
                Command.ExecuteNonQuery();
                Command.CommandText = "INSERT INTO TestCancel VALUES (1, '1')";
                Command.ExecuteNonQuery();
                Command.CommandText = "SELECT * FROM TestCancel";
                SqlDataReader reader = Command.ExecuteReader();
                Thread rThread2 = new Thread(new ThreadStart(Thread_Cancel));
                rThread2.Start();
                rThread2.Join();
                reader.Read();
                System.Console.WriteLine(reader.FieldCount);
                reader.Close();
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }

    static private string GetConnectionString()
    {
        // To avoid storing the connection string in your code,
        // you can retrieve it from a configuration file.
        return "Data Source=(local);Initial Catalog=AdventureWorks;"
            + "Integrated Security=SSPI";
    }
}
You can only do cancellation checking and progress reporting between distinct lines of code. Usually both require that you dissect the code down to the lowest loop level, so you can do both of these things between/in the loop iterations. When I took my first steps with BGW, I had the advantage that I needed the loop anyway, so it was no extra work. You have one of the worse cases: pre-existing code that you can only replicate or use as is.
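For illustration, a sketch of what that dissection could look like for the GetData method from the question, assuming the BackgroundWorker and its DoWorkEventArgs are passed down from the GUI layer; note this only helps while rows are being streamed, not while ExecuteReader() itself is blocked:
// Requires System.ComponentModel in addition to the usings from the question.
public static DataTable GetData(string sqlQuery, BackgroundWorker worker, DoWorkEventArgs e)
{
    DataTable t = new DataTable();
    using (SqlConnection c = new SqlConnection(GetConnectionString()))
    using (SqlCommand cmd = new SqlCommand(sqlQuery, c))
    {
        c.Open();
        using (SqlDataReader r = cmd.ExecuteReader())
        {
            // Build the column schema, then copy rows one at a time instead of
            // DataTable.Load(), so cancellation can be polled between rows.
            for (int i = 0; i < r.FieldCount; i++)
                t.Columns.Add(r.GetName(i), r.GetFieldType(i));
            while (r.Read())
            {
                if (worker != null && worker.CancellationPending)
                {
                    e.Cancel = true; // report the cancellation back to the BGW
                    break;
                }
                DataRow row = t.NewRow();
                for (int i = 0; i < r.FieldCount; i++)
                    row[i] = r.GetValue(i);
                t.Rows.Add(row);
            }
        }
    }
    return t;
}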
Ideal case:
This operation should not take nearly as long as it does. 5-10 minutes indicates that there is something rather wrong with your design.
If the bulk of the time is transmission of data, then you are probably retrieving way too much data. Retrieving everything to do filtering in the GUI is a very common mistake. Do as much filtering in the query as possible. Using a distributed database might also help with transmission performance.
If the bulk of the time is processing as part of the query operation (complex conditions), something in your general approach might have to change. There are various ways to trade off complex calculation against a bit of memory on the DBMS side. Views, afaik, can cache the results of operations while still maintaining transactional consistency.
But it really depends on what your back-end DB/DBMS and use case are. A lot of them just use SQL as the query language, so that alone does not let us predict which options you have.
Second best case:
The second best thing, if you cannot cut the time down, would be to get the actual DB access code down to the lowest loop level and do progress reporting/cancellation checking there (as in the sketch above). That way you could actually use the existing cancellation system inherent in BGW.
Everything else
Using any other approach to cancellation is really a fallback. I wrote a lot on why it is bad, but felt that this might work better if I focus on the core issue: there is likely something wrong in the design of the DB and/or the query, and fixing that might well eliminate the issue altogether.

What pattern to employ for locking SQL Server record while editing its data

I'm having trouble developing the right strategy for opening connections, beginning transactions, and committing/rolling back/closing connections. The context is an ASP.NET WebForms application. A client can bring up a record to make edits, and I'd like other clients to be locked out of updating that record during the edit operation.
Other clients should be able to read the last committed version of that record. The approach I'm using right now for an edit operation is to open a connection and begin a transaction at IsolationLevel.RepeatableRead, which is doing what I want in terms of locking.
However, I'm not immediately closing the connection...instead, I keep the connection and transaction open while the client is actively editing the values in the record so that I hold the lock. Once the edits are done and the client either commits or rolls back the changes, then I close the connection.
Here's the bones of the class that represents a record in the database:
public class DBRecord : IDisposable
{
private OleDbTransaction tran; // holds the open transaction
private Dictionary<string, object> values = new Dictionary<string, object>();
private bool disposedValue = false;
public DBRecord (bool forUpdate) {
OleDbConnection conn = new OleDbConnection(<connection string>);
try {
conn.Open();
tran = conn.BeginTransaction (forUpdate ? IsolationLevel.RepeatableRead : IsolationLevel.ReadCommitted);
OleDbCommand comm = new OleDbCommand("SET XACT_ABORT ON", conn, tran);
comm.ExecuteNonQuery();
comm = new OleDbCommand("SELECT * FROM " + DBTable + " WHERE KEY = ?", conn, tran);
comm.Parameters.Add(new OleDbParameter("@Key", Key));
using (OleDbDataReader rdr = comm.ExecuteReader())
{
while (rdr.Read())
{
for (int i = 0; i < rdr.FieldCount; i++)
{
values.Add(rdr.GetName(i), rdr.GetValue(i));
}
}
}
} catch {
conn.Close();
throw;
}
if (!forUpdate) {
// don't need to keep the record locked
tran.Commit();
conn.Close();
}
}
public void UpdateField(string field, object newValue) {
// this is only called if the object was instantiated with forUpdate true
OleDbCommand comm = new OleDbCommand("UPDATE " + DBTable + " SET " + field + " = ? WHERE " + KeyField + " = ?", tran.Connection, tran);
comm.Parameters.Add(new OleDbParameter("@Value", newValue));
comm.Parameters.Add(new OleDbParameter("@Key", Key));
try {
comm.ExecuteNonQuery();
} catch {
OleDbConnection conn = tran.Connection;
tran.Rollback();
conn.Close();
}
}
public void Commit()
{
OleDbConnection conn = tran.Connection;
tran.Commit();
conn.Close();
}
public void Rollback()
{
OleDbConnection conn = tran.Connection;
tran.Rollback();
conn.Close();
}
protected virtual void Dispose(bool disposing)
{
if (!disposedValue)
{
if (disposing)
{
if ((tran != null) && (tran.Connection != null))
{
try
{
OleDbConnection conn = tran.Connection;
/// release rowlocks we acquired at instantiation
tran.Rollback();
conn.Close();
}
catch (Exception)
{
// since we're disposing of the object at this point, there's not much
// we can do about a rollback failure, so silently ignore exceptions
}
}
}
disposedValue = true;
}
}
}
The database has "Allow Snapshot Isolation" and "Is Read Committed Snapshot On" set to true, and the connection string specifies MARS Enabled is true.
It's pretty clear to me that this isn't the right approach:
It goes against everything I've read about connections--open as late as possible and close as quickly as possible. However, I don't know of another way to keep the record locked from updates while a client is making edits.
The bigger problem I'm having is when I terminate the browser while in the middle of an edit. Right now I'm in development, so I'm using VS 2015 and IIS Express. When I close the browser, the IIS Express process closes also. It appears that my Dispose() is never being called, because the processes remain open in my SQL Server Express instance. Clearly I need to somehow guarantee that the connections will close regardless of whether the server stops.
In a previous question related to databases I received a suggestion to cache a copy of the record data in local storage, track updates on that, then execute a new transaction when ready to commit the changes. That would definitely make my connections shorter and the update operation more atomic, but the record would not be locked. The solution there might be to create a "I'm being updated" field that a client could test-and-set at the beginning of an edit operation, but that seems like a hack, especially given that the database engine provides mechanisms for locking.
Intuitively what I'm trying to do here seems like a common enough use case that there should be a pattern already, but I'm apparently not asking Google the right questions because all I'm getting in my searches is how to employ a using block. Can anyone point me to the right way to do this? Thanks.
The correct pattern name for what you're looking for is pessimistic offline locking, https://dzone.com/articles/practical-php-patterns/practical-php-patterns-13 , which is what you were referring to with the "I'm being updated" column.
You can review other solutions for locking, https://petermeinl.wordpress.com/2011/03/14/db-concurrency-control-patterns-for-applications/ .
It's generally recommended that you use optimistic locking instead of pessimistic locking, for several reasons: it does not require you to update the row prior to editing, it does not require you to leave a connection open like some of the pessimistic locking solutions, and it does not require implementing lock timeouts. It does, however, have the side effect that when the user goes to save their changes they may be prompted to either overwrite or merge with changes that have happened since they began their edit.
These are not really the same as the locking built into SQL Server, although some of SQL Server's features can be used to help implement them.
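A minimal sketch of what optimistic locking could look like here, assuming the table gets a SQL Server rowversion column (named RowVer below, a hypothetical name) that is read along with the record and checked on save; DBTable and KeyField are the same members used in the class above:
public bool TrySaveField(string field, object newValue, object key, byte[] rowVerWhenRead)
{
    using (OleDbConnection conn = new OleDbConnection(connectionString)) // your existing connection string
    using (OleDbCommand comm = new OleDbCommand(
        "UPDATE " + DBTable + " SET " + field + " = ? WHERE " + KeyField + " = ? AND RowVer = ?", conn))
    {
        comm.Parameters.Add(new OleDbParameter("@Value", newValue));
        comm.Parameters.Add(new OleDbParameter("@Key", key));
        comm.Parameters.Add(new OleDbParameter("@RowVer", rowVerWhenRead));
        conn.Open();
        // 0 rows affected means someone else saved first: reload the record and
        // let the user decide whether to overwrite or merge their changes.
        return comm.ExecuteNonQuery() == 1;
    }
}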

ADO.NET hurting performance: how can I prevent DataReader from taking so long?

An Oracle Stored Procedure runs in the database and returns some rows of data, taking 30 sec.
Now, calling this procedure and filling the DataAdapter to then populate the DataSet takes 1m40s in a C# .NET application.
While testing, I noticed that using a DataReader and reading sequentially with the Read() function after calling the stored procedure again takes a total time of approximately 1m40s.
Any idea what could be causing these bottlenecks and how to get rid of them?
Thanks in advance!
Edit: Added the code
OracleConnection oracleConnection = Connect();
OracleCommand oracleCommand = CreateCommand(2);
OracleDataReader oracleDataReader = null;
if (oracleConnection != null)
{
    try
    {
        if (oracleConnection.State == ConnectionState.Open)
        {
            oracleCommand.Connection = oracleConnection;
            oracleDataReader = oracleCommand.ExecuteReader();
            DateTime dtstart = DateTime.Now;
            if (oracleDataReader.HasRows)
            {
                while (oracleDataReader.Read())
                {
                    /* big Bottleneck here ... */
                    // Parse the fields
                }
            }
            DateTime dtEnd = DateTime.Now;
            TimeSpan ts = new TimeSpan(dtEnd.Ticks - dtstart.Ticks);
            lblDuration2.Text = "Duration: " + ts.ToString();
            Disconnect(oracleConnection);
        }
This might help, though without more detail on how you're actually using the reader it's hard to say for sure.
using (var cnx = Connect())
using (var cmd = CreateCommand(2)) {
try {
if (cnx.State == ConnectionState.Closed) cnx.Open();
// The following line allows for more time to be allowed to
// the command execution. The smaller the amount, the sooner the
// command times out. So be sure to let enough room for the
// command to execute successfully
cmd.CommandTimeout = 600;
// The below-specified CommandBehavior allows for a sequential
// access against the underlying database. This means rows are
// streamed through your reader instance and meanwhile the
// program reads from the reader, the reader continues to read
// from the database instead of waiting until the full result
// set is returned by the database to continue working on the
// information data.
using (var reader = cmd.ExecuteReader(
CommandBehavior.SequentialAccess)) {
if (reader.HasRows)
while (reader.Read()) {
// Perhaps bottleneck will disappear here...
// Without proper code usage of your reader
// no one can help.
}
}
} catch(OracleException ex) {
// Log exception or whatever,
// otherwise might be best to let go and rethrow
} finally {
if (cnx.State == ConnectionState.Open) cnx.Close();
}
}
For more detailed information on command behaviours: Command Behavior Enumeration.
Directly from MSDN:
Sequential Access
Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned.
When you specify SequentialAccess, you are required to read from the columns in the order they are returned, although you are not required to read each column. Once you have read past a location in the returned stream of data, data at or before that location can no longer be read from the DataReader. When using the OleDbDataReader, you can reread the current column value until reading past it. When using the SqlDataReader, you can read a column value only once.
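As a small illustration of the GetBytes pattern that quote describes, here is a sketch of streaming a large binary column in chunks under SequentialAccess; the column ordinal 0 and the 8 KB buffer size are arbitrary choices:
using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
{
    byte[] buffer = new byte[8192];
    while (reader.Read())
    {
        long offset = 0;
        long bytesRead;
        // Read the BLOB in 8 KB chunks instead of materializing it all at once.
        while ((bytesRead = reader.GetBytes(0, offset, buffer, 0, buffer.Length)) > 0)
        {
            // ... write the chunk to a file or stream here ...
            offset += bytesRead;
        }
    }
}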
As for increasing the CommandTimeout property, have a look at this post:
Increasing the Command Timeout for SQL command
When you expect the command to take a certain amount of time, set a timeout long enough for the command to return before it expires. When a timeout does occur it takes a few seconds to recover from it, and all of this can be avoided. Measure the time the command really needs and set the timeout as close to that as possible, because a timeout that is far too long can hide other underlying problems that a tighter timeout would have surfaced. When a command timeout occurs, ask yourself how you could deal with a smaller result set, or how you could improve your query to run faster.

Using SQL Server application locks to solve locking requirements

I have a large application based on Dynamics CRM 2011 that in various places has code that must query for a record based upon some criteria and create it if it doesn't exist else update it.
An example of the kind of thing I am talking about would be similar to this:
stk_balance record = context.stk_balanceSet.FirstOrDefault(x => x.stk_key == id);
if (record == null)
{
    record = new stk_balance();
    record.Id = Guid.NewGuid();
    record.stk_value = 100;
    context.AddObject(record);
}
else
{
    record.stk_value += 100;
    context.UpdateObject(record);
}
context.SaveChanges();
In terms of CRM 2011 implementation (although not strictly relevant to this question) the code could be triggered from synchronous or asynchronous plugins. The issue is that the code is not thread safe, between checking if the record exists and creating it if it doesn't, another thread could come in and do the same thing first resulting in duplicate records.
Normal locking methods are not reliable due to the architecture of the system, various services using multiple threads could all be using the same code, and these multiple services are also load balanced across multiple machines.
In trying to find a solution to this problem that doesn't add massive amounts of extra complexity and doesn't compromise the idea of not having a single point of failure or a single point where a bottleneck could occur I came across the idea of using SQL Server application locks.
I came up with the following class:
public class SQLLock : IDisposable
{
//Lock constants
private const string _lockMode = "Exclusive";
private const string _lockOwner = "Transaction";
private const string _lockDbPrincipal = "public";
//Variable for storing the connection passed to the constructor
private SqlConnection _connection;
//Variable for storing the name of the Application Lock created in SQL
private string _lockName;
//Variable for storing the timeout value of the lock
private int _lockTimeout;
//Variable for storing the SQL Transaction containing the lock
private SqlTransaction _transaction;
//Variable for storing if the lock was created ok
private bool _lockCreated = false;
public SQLLock (string lockName, int lockTimeout = 180000)
{
_connection = Connection.GetMasterDbConnection();
_lockName = lockName;
_lockTimeout = lockTimeout;
//Create the Application Lock
CreateLock();
}
public void Dispose()
{
//Release the Application Lock if it was created
if (_lockCreated)
{
ReleaseLock();
}
_connection.Close();
_connection.Dispose();
}
private void CreateLock()
{
_transaction = _connection.BeginTransaction();
using (SqlCommand createCmd = _connection.CreateCommand())
{
createCmd.Transaction = _transaction;
createCmd.CommandType = System.Data.CommandType.Text;
StringBuilder sbCreateCommand = new StringBuilder();
sbCreateCommand.AppendLine("DECLARE @res INT");
sbCreateCommand.AppendLine("EXEC @res = sp_getapplock");
sbCreateCommand.Append("@Resource = '").Append(_lockName).AppendLine("',");
sbCreateCommand.Append("@LockMode = '").Append(_lockMode).AppendLine("',");
sbCreateCommand.Append("@LockOwner = '").Append(_lockOwner).AppendLine("',");
sbCreateCommand.Append("@LockTimeout = ").Append(_lockTimeout).AppendLine(",");
sbCreateCommand.Append("@DbPrincipal = '").Append(_lockDbPrincipal).AppendLine("'");
sbCreateCommand.AppendLine("IF @res NOT IN (0, 1)");
sbCreateCommand.AppendLine("BEGIN");
sbCreateCommand.AppendLine("RAISERROR ( 'Unable to acquire Lock', 16, 1 )");
sbCreateCommand.AppendLine("END");
createCmd.CommandText = sbCreateCommand.ToString();
try
{
createCmd.ExecuteNonQuery();
_lockCreated = true;
}
catch (Exception ex)
{
_transaction.Rollback();
throw new Exception(string.Format("Unable to get SQL Application Lock on '{0}'", _lockName), ex);
}
}
}
private void ReleaseLock()
{
using (SqlCommand releaseCmd = _connection.CreateCommand())
{
releaseCmd.Transaction = _transaction;
releaseCmd.CommandType = System.Data.CommandType.StoredProcedure;
releaseCmd.CommandText = "sp_releaseapplock";
releaseCmd.Parameters.AddWithValue("@Resource", _lockName);
releaseCmd.Parameters.AddWithValue("@LockOwner", _lockOwner);
releaseCmd.Parameters.AddWithValue("@DbPrincipal", _lockDbPrincipal);
try
{
releaseCmd.ExecuteNonQuery();
}
catch {}
}
_transaction.Commit();
}
}
I would use this in my code to create a SQL Server application lock, using the unique key I am querying for as the lock name, like this:
using (var sqlLock = new SQLLock(id))
{
//Code to check for and create or update record here
}
Now this approach seems to work, however I am by no means any kind of SQL Server expert and am wary about putting this anywhere near production code.
My question really has 3 parts
1. Is this a really bad idea because of something I haven't considered?
Are SQL Server application locks completely unsuitable for this purpose?
Is there a maximum number of application locks (with different names) you can have at a time?
Are there performance considerations if a potentially large number of locks are created?
What else could be an issue with the general approach?
2. Is the solution actually implemented above any good?
If SQL Server application locks are usable like this, have I actually used them properly?
Is there a better way of using SQL Server to achieve the same result?
In the code above I am getting a connection to the Master database and creating the locks in there. Does that potentially cause other issues? Should I create the locks in a different database?
3. Is there a completely alternative approach that could be used that doesn't use SQL Server application locks?
I can't use stored procedures to create and update the record (unsupported in CRM 2011).
I don't want to add a single point of failure.
You can do this much more easily.
//make sure your plugin runs within a transaction; this is the case for stage 20 and 40
//you can check this with IExecutionContext.IsInTransaction
//it does not work with offline plugins, but it works within CRM Online (cloud) and is fully supported
//it also works on transaction rollback
var lockUpdateEntity = new dummy_lock_entity(); //simple technical entity with as many rows as different lock barriers you need
lockUpdateEntity.Id = Guid.Parse("well known guid"); //well-known guid for this barrier
lockUpdateEntity.dummy_field = Guid.NewGuid(); //just update/change a field to create a lock, no matter what its content is

//--------------- this is untested by me, I use the next one
context.UpdateObject(lockUpdateEntity);
context.SaveChanges();
//---------------
//OR
//--------------- I use this one, but you need a reference to your OrganizationService
OrganizationService.Update(lockUpdateEntity);
//---------------

//threads wait here if they do not hold the lock for the dummy_lock_entity with the "well known guid"
stk_balance record = context.stk_balanceSet.FirstOrDefault(x => x.stk_key == id);
if (record == null)
{
    record = new stk_balance();
    //record.Id = Guid.NewGuid(); //not needed
    record.stk_value = 100;
    context.AddObject(record);
}
else
{
    record.stk_value += 100;
    context.UpdateObject(record);
}
context.SaveChanges();
//let the pipeline flow and the transaction complete ...
For more background info refer to http://www.crmsoftwareblog.com/2012/01/implementing-robust-microsoft-dynamics-crm-2011-auto-numbering-using-transactions/

use c# threadpool or task to call a function and get return value

I am a newbie to c# threads and need help in implementing a basic task.
I am currently using the below code (without threads), which runs fine.
The concept is to loop through the records of a table, pass some table values as arguments to a function, expect a return value, and then update the table with that return value.
cmd = new OleDbCommand { Connection = con, CommandText = "Select recid,col_A,col_B from tblData" };
dr = cmd.ExecuteReader();
if (dr.HasRows)
{
    cmdRec = new OleDbCommand { Connection = con };
    while (dr.Read())
    {
        sReqResult = DoProcessing(dr["col_A"].ToString(), dr["col_B"].ToString(), dr["PARAM2"].ToString());
        sSql = "update tblData set STATUS='" + sReqResult + "' where recid = '" + dr["recid"] + "'";
        cmdRec.CommandText = sSql;
        cmdRec.ExecuteNonQuery();
    }
}
dr.Close();
I want to implement the above functionality using threads to speed up the process, so that instead of processing the records sequentially I can run a maximum of 25 threads in parallel. The requirement is still to get the return value from the function and update it in the table.
I have read about threadpool and Tasks (in .net 4.0) but I am not sure how to implement the same. Please guide me with some sample code.
With this answer, I'm assuming you want to create the async implementation yourself, and not use an existing tool/library.
In general, you won't be able to simply "return" a value from an asynchronous context. Instead, you can have a callback that takes certain "return" parameters (i.e. the result).
Concept example with threadpool:
if (dr.HasRows)
{
    object someDataToWorkWith = "data";

    Action<object> resultCallback = (theResults) =>
    {
        // Executed once the workItem is finished.
        // Work with and/or present the results here.
    };

    WaitCallback workItem = (dataOrSomeDetails) =>
    {
        // This is the main async-part. Work with or fetch data here.
        // You can also access any variables from the containing method.
        // When finished working, execute callback:
        resultCallback("someResults");
    };

    ThreadPool.QueueUserWorkItem(workItem, someDataToWorkWith);
}
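If you prefer Tasks over the raw thread pool, here is a sketch of the same idea that also returns values and caps the parallelism at 25 with a SemaphoreSlim. It assumes the rows from tblData have first been read into a list of simple objects (records, with hypothetical RecId, ColA, ColB, Param2 properties), so the DataReader and connection are not shared between threads, and DoProcessing is the function from the question:
// Requires System.Collections.Generic, System.Threading and System.Threading.Tasks.
var throttle = new SemaphoreSlim(25);
var tasks = new List<Task<KeyValuePair<string, string>>>(); // recid -> status

foreach (var rec in records) // records: rows materialized from tblData beforehand
{
    var r = rec; // copy the loop variable so each task captures its own record
    throttle.Wait(); // at most 25 work items run at once
    tasks.Add(Task.Factory.StartNew(() =>
    {
        try
        {
            string status = DoProcessing(r.ColA, r.ColB, r.Param2);
            return new KeyValuePair<string, string>(r.RecId, status);
        }
        finally
        {
            throttle.Release();
        }
    }));
}

Task.WaitAll(tasks.ToArray());

// Apply the results on the original thread, reusing the single connection/command.
foreach (var t in tasks)
{
    cmdRec.CommandText = "update tblData set STATUS='" + t.Result.Value +
        "' where recid = '" + t.Result.Key + "'";
    cmdRec.ExecuteNonQuery();
}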
Why not use asynchronous ado.net features?
