I have a large application based on Dynamics CRM 2011 that in various places has code that must query for a record based upon some criteria and create it if it doesn't exist else update it.
An example of the kind of thing I am talking about would be similar to this:
stk_balance record = context.stk_balanceSet.FirstOrDefault(x => x.stk_key == id);
if(record == null)
{
record = new stk_balance();
record.Id = Guid.NewGuid();
record.stk_value = 100;
context.AddObject(record);
}
else
{
record.stk_value += 100;
context.UpdateObject(record);
}
context.SaveChanges();
In terms of CRM 2011 implementation (although not strictly relevant to this question) the code could be triggered from synchronous or asynchronous plugins. The issue is that the code is not thread safe, between checking if the record exists and creating it if it doesn't, another thread could come in and do the same thing first resulting in duplicate records.
Normal locking methods are not reliable due to the architecture of the system, various services using multiple threads could all be using the same code, and these multiple services are also load balanced across multiple machines.
In trying to find a solution to this problem that doesn't add massive amounts of extra complexity and doesn't compromise the idea of not having a single point of failure or a single point where a bottleneck could occur I came across the idea of using SQL Server application locks.
I came up with the following class:
public class SQLLock : IDisposable
{
//Lock constants
private const string _lockMode = "Exclusive";
private const string _lockOwner = "Transaction";
private const string _lockDbPrincipal = "public";
//Variable for storing the connection passed to the constructor
private SqlConnection _connection;
//Variable for storing the name of the Application Lock created in SQL
private string _lockName;
//Variable for storing the timeout value of the lock
private int _lockTimeout;
//Variable for storing the SQL Transaction containing the lock
private SqlTransaction _transaction;
//Variable for storing if the lock was created ok
private bool _lockCreated = false;
public SQLLock (string lockName, int lockTimeout = 180000)
{
_connection = Connection.GetMasterDbConnection();
_lockName = lockName;
_lockTimeout = lockTimeout;
//Create the Application Lock
CreateLock();
}
public void Dispose()
{
//Release the Application Lock if it was created
if (_lockCreated)
{
ReleaseLock();
}
_connection.Close();
_connection.Dispose();
}
private void CreateLock()
{
_transaction = _connection.BeginTransaction();
using (SqlCommand createCmd = _connection.CreateCommand())
{
createCmd.Transaction = _transaction;
createCmd.CommandType = System.Data.CommandType.Text;
StringBuilder sbCreateCommand = new StringBuilder();
sbCreateCommand.AppendLine("DECLARE #res INT");
sbCreateCommand.AppendLine("EXEC #res = sp_getapplock");
sbCreateCommand.Append("#Resource = '").Append(_lockName).AppendLine("',");
sbCreateCommand.Append("#LockMode = '").Append(_lockMode).AppendLine("',");
sbCreateCommand.Append("#LockOwner = '").Append(_lockOwner).AppendLine("',");
sbCreateCommand.Append("#LockTimeout = ").Append(_lockTimeout).AppendLine(",");
sbCreateCommand.Append("#DbPrincipal = '").Append(_lockDbPrincipal).AppendLine("'");
sbCreateCommand.AppendLine("IF #res NOT IN (0, 1)");
sbCreateCommand.AppendLine("BEGIN");
sbCreateCommand.AppendLine("RAISERROR ( 'Unable to acquire Lock', 16, 1 )");
sbCreateCommand.AppendLine("END");
createCmd.CommandText = sbCreateCommand.ToString();
try
{
createCmd.ExecuteNonQuery();
_lockCreated = true;
}
catch (Exception ex)
{
_transaction.Rollback();
throw new Exception(string.Format("Unable to get SQL Application Lock on '{0}'", _lockName), ex);
}
}
}
private void ReleaseLock()
{
using (SqlCommand releaseCmd = _connection.CreateCommand())
{
releaseCmd.Transaction = _transaction;
releaseCmd.CommandType = System.Data.CommandType.StoredProcedure;
releaseCmd.CommandText = "sp_releaseapplock";
releaseCmd.Parameters.AddWithValue("#Resource", _lockName);
releaseCmd.Parameters.AddWithValue("#LockOwner", _lockOwner);
releaseCmd.Parameters.AddWithValue("#DbPrincipal", _lockDbPrincipal);
try
{
releaseCmd.ExecuteNonQuery();
}
catch {}
}
_transaction.Commit();
}
}
I would use this in my code to create a SQL Server application lock using the unique key I am querying for as the lock name like this
using (var sqlLock = new SQLLock(id))
{
//Code to check for and create or update record here
}
Now this approach seems to work, however I am by no means any kind of SQL Server expert and am wary about putting this anywhere near production code.
My question really has 3 parts
1. Is this a really bad idea because of something I haven't considered?
Are SQL Server application locks completely unsuitable for this purpose?
Is there a maximum number of application locks (with different names) you can have at a time?
Are there performance considerations if a potentially large number of locks are created?
What else could be an issue with the general approach?
2. Is the solution actually implemented above any good?
If SQL Server application locks are usable like this, have I actually used them properly?
Is there a better way of using SQL Server to achieve the same result?
In the code above I am getting a connection to the Master database and creating the locks in there. Does that potentially cause other issues? Should I create the locks in a different database?
3. Is there a completely alternative approach that could be used that doesn't use SQL Server application locks?
I can't use stored procedures to create and update the record (unsupported in CRM 2011).
I don't want to add a single point of failure.
You can do this much easier.
//make sure your plugin runs within a transaction, this is the case for stage 20 and 40
//you can check this with IExecutionContext.IsInTransaction
//works not with offline plugins but works within CRM Online (Cloud) and its fully supported
//also works on transaction rollback
var lockUpdateEntity = new dummy_lock_entity(); //simple technical entity with as many rows as different lock barriers you need
lockUpdateEntity.Id = Guid.parse("well known guid"); //well known guid for this barrier
lockUpdateEntity.dummy_field=Guid.NewGuid(); //just update/change a field to create a lock, no matter of its content
//--------------- this is untested by me, i use the next one
context.UpdateObject(lockUpdateEntity);
context.SaveChanges();
//---------------
//OR
//--------------- i use this one, but you need a reference to your OrganizationService
OrganizationService.Update(lockUpdateEntity);
//---------------
//threads wait here if they have no lock for dummy_lock_entity with "well known guid"
stk_balance record = context.stk_balanceSet.FirstOrDefault(x => x.stk_key == id);
if(record == null)
{
record = new stk_balance();
//record.Id = Guid.NewGuid(); //not needed
record.stk_value = 100;
context.AddObject(record);
}
else
{
record.stk_value += 100;
context.UpdateObject(record);
}
context.SaveChanges();
//let the pipeline flow and the transaction complete ...
For more background info refer to http://www.crmsoftwareblog.com/2012/01/implementing-robust-microsoft-dynamics-crm-2011-auto-numbering-using-transactions/
Related
We had performance issues in our app due to one by one updates of entities in a DB table where the number of rows was high (more than a million). We kept getting deadlock victim errors so it was obvious that the table was locking rows longer than it should have.
Currently, we implemented a manual batching of configurable limit/time threshold of stored procedure calls to update entities in the DB.
The simplified class would look something like this:
public class EntityBatchUpdater
{
private readonly IRepository _repository;
private List<Entity> _batch = new List<Entity>();
private readonly Timer _batchPostingTimer;
private readonly int _batchSize;
private static readonly object _batchPostingLock = new object();
public EntityBatchUpdater(IRepository repository)
{
_repository = repository;
_batchSize = 1000; // configurable
string batchPostingPeriod = "00:01:00.00"; // configurable
_batchPostingTimer = new Timer
{
Interval = TimeSpan.Parse(batchPostingPeriod).TotalMilliseconds,
Enabled = true,
AutoReset = true,
};
_batchPostingTimer.Elapsed += OnTimedEvent;
}
public void Update(Entity entity)
{
try
{
lock (_batchPostingLock)
{
_batch.Add(entity);
if (_batch.Count == _batchSize)
{
EntityBatchUpdate();
}
}
}
catch (Exception ex)
{
_logger.LogError(ex, $"Failed to insert batch {JsonConvert.SerializeObject(batch)}");
}
}
private void EntityBatchUpdate()
{
if (_batch.Count == 0)
{
return;
}
try
{
var entityBatchXML = SerializePayload();
_repository.BatchUpdate(entityBatchXML);
_batch.Clear();
}
catch (Exception ex)
{
_logger.LogError(ex, $"Failed to insert batch; batchSize:{_batch.Count}");
}
}
private string SerializePayload()
{
using (var sw = new System.IO.StringWriter())
{
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("", "");
var serializer = new XmlSerializer(typeof(List<Entity>),
new XmlRootAttribute("ENTITY_BATCH"));
serializer.Serialize(sw, _batch, ns);
return sw.ToString();
}
}
private void OnTimedEvent(object source, ElapsedEventArgs e)
{
EntityBatchUpdate();
}
}
Currently, we took advantage of SQL Server's fast XML processing and serialize payload into XML when calling the actual procedure to avoid hitting the DB with a lot of calls.
I also thought about creating a Table Valued Param to serialize the data we need to send to the proc, but I don't think that would drastically improve the performance.
My question is: How did you handle great load like this on your DB?
1.) Did you use a nuget package/some other tool to handle batching for you?
2.) Did you solve this using some other practice?
Edit:
To give a bit more insight: We are currently processing a queue manually in our app (hence the huge number of updates), and we want to do it as fast and as reliable as possible.
We will move to a better queueing mechanism in the future (RabbtiMQ or Kafka), but in the meantime, we want to have a standard approach of consuming and processing queues from a DB table.
The most confusing part, is that you are doing this work from C# code in the first place. You work on > 1.000.000 Records. That is not a work you should be doing in code, ever. I do not really see what you are doing with those Records, but as a general just keep that scale of work in the Database. Always try to do as much filtering, inserting, bulk inserting, bulk updating and the like on the DB side.
Never move what could be SQL to a client programm. At best you add the network load of having to move the data over the network once or even twice (for Updates). At worst you add a huge danger for race conditions. And at no point will you have a chance to beat the time the DB server would need for the same operation.
This seems to be just a Bulk Insert. SQL has a seperate command for it and any DBMS worth it's Diskspace has the option to import data from variety of file formats, inlcluding .csv.
As learning exercise and before trying to use any ORM (like EF) I want to build a personal project using ADO.NET and stored procedures.
Because I don't want my code to become a mess over time, I want to use some patterns like the repository and UoW patterns.
I've got almost everything figured it out, except for the transaction handling.
To somehow 'simulate' a UoW, I used this class provided by #jgauffin, but what's stopping me from using that class is that every time you create a new instance of that class (AdoNetUnitOfWork) you're automatically beginning a transaction and there a lot of cases where you only need to read data.
In this regard this is what I found in one of the SQL books I've been reading:
Executing a SELECT statement within a transaction can create locks on the referenced tables, which can in turn block other users or sessions from performing work or reading data
This is the AdoNetUnitOfWork class:
public class AdoNetUnitOfWork : IUnitOfWork
{
public AdoNetUnitOfWork(IDbConnection connection, bool ownsConnection)
{
_connection = connection;
_ownsConnection=ownsConnection;
_transaction = connection.BeginTransaction();
}
public IDbCommand CreateCommand()
{
var command = _connection.CreateCommand();
command.Transaction = _transaction;
return command;
}
public void SaveChanges()
{
if (_transaction == null)
throw new InvalidOperationException("Transaction have already been commited. Check your transaction handling.");
_transaction.Commit();
_transaction = null;
}
public void Dispose()
{
if (_transaction != null)
{
_transaction.Rollback();
_transaction = null;
}
if (_connection != null && _ownsConnection)
{
_connection.Close();
_connection = null;
}
}
}
And this is how I want to use the UoW in my repositories:
public DomainTable Get(int id)
{
DomainTable table;
using (var commandTable = _unitOfWork.CreateCommand())
{
commandTable.CommandType = CommandType.StoredProcedure;
//This stored procedure contains just a simple SELECT statement
commandTable.CommandText = "up_DomainTable_GetById";
commandTable.Parameters.Add(commandTable.CreateParameter("#pId", id));
table = ToList(commandTable).FirstOrDefault();
}
return table;
}
I know I can tweak this code a bit so that the transaction would be optional, but since I trying to make this code as platform independent as possible and as far as I know in other persistence frameworks like EF you don't have to manage transactions manually, the question is, will I be creating some kind of bottleneck by using this class as it is, that is, with transactions always being created?
It all depends on the transaction isolation level. Using the default isolation level (ie. read committed) then your SELECT should occur no performance penalty if is wrapped in a transaction. SQL Server internally wraps statements in a transaction anyway if one is not already started, so your code should behave almost identical.
However, I must ask you why not use the built-in .Net TransactionScope? This way your code will interact much better with other libraries and frameworks, since TransactionScope is universally used. If you do decide to switch to this I must warn you that, by default, TransactionScope uses SERIALIZABLE isolation level and this does result in performance penalties, see using new TransactionScope() Considered Harmful.
I have a fairly simple WPF application that uses Entity Framework. The main page of the application has a list of records that I am getting from a database on startup.
Each record has a picture, so the operation can be a little slow when the wireless signal is poor. I'd like this (and many of my SQL operations) to perform in the background if possible. I have async/await setup and at first it seemed to be working exactly as I wanted but now I'm seeing that my application is becoming unresponsive when accessing the DB.
Eventually I'm thinking I'm going to load up the text in one query and the images in another background operation and load them as they come in. This way I get the important stuff right away and the pictures can come in in the background, but the way things are going it's still looking like it will lock up if I do this.
On top of that, I'm trying to implement something to handle connectivity issues (in case the wifi cuts out momentarily) so that the application notifies the user of a connection issue, automatically retries a few times, etc. I put a try catch for SQL exception which seems to be working for me, but the whole application locks up for about a minute while it is trying to connect to the DB.
I tried testing my async/await using await Task.Delay() and everything is very responsive as expected while awaiting the delay, but everything locks up when awaiting the .ToListAsync(). Is this normal and expected? My understanding of async/await is pretty limited.
My code is kind of messy (I'm new) but it does what I need it to do for the most part. I understand there's probably plenty of improvements I can make and better ways to do things, but one step at a time here. My main goal right now is to keep the application from crashing during database accessing exceptions and to keep the user notified of what the application is doing (searching, trying to access db, unable to reach DB and retrying, etc) as opposed to being frozen, which is what they're going to think when they see it being unresponsive for over a minute.
Some of my code:
In my main view model
DataHelper data = new DataHelper();
private async void GetQualityRegisterQueueAsync()
{
try
{
var task = data.GetQualityRegisterAsync();
IsSearching = true;
await task;
IsSearching = false;
QualityRegisterItems = new ObservableCollection<QualityRegisterQueue>(task.Result);
OrderQualityRegisterItems();
}
catch (M1Exception ex)
{
Debug.WriteLine(ex.Message);
Debug.WriteLine("QualityRegisterLogViewModel.GetQualityRegisterQueue() Operation Failed");
}
}
My Data Helper Class
public class DataHelper
{
private bool debugging = false;
private const int MAX_RETRY = 2;
private const double LONG_WAIT_SECONDS = 5;
private const double SHORT_WAIT_SECONDS = 0.5;
private static readonly TimeSpan longWait = TimeSpan.FromSeconds(LONG_WAIT_SECONDS);
private static readonly TimeSpan shortWait = TimeSpan.FromSeconds(SHORT_WAIT_SECONDS);
private enum RetryableSqlErrors
{
ServerNotFound = -1,
Timeout = -2,
NoLock = 1204,
Deadlock = 1205,
}
public async Task<List<QualityRegisterQueue>> GetQualityRegisterAsync()
{
if(debugging) await Task.Delay(5000);
var retryCount = 0;
using (M1Context m1 = new M1Context())
{
for (; ; )
{
try
{
return await (from a in m1.QualityRegisters
where (a.qanClosed == 0)
//orderby a.qanAssignedDate descending, a.qanOpenedDate
orderby a.qanAssignedDate.HasValue descending, a.qanAssignedDate, a.qanOpenedDate
select new QualityRegisterQueue
{
QualityRegisterID = a.qanQualityRegisterID,
JobID = a.qanJobID.Trim(),
JobAssemblyID = a.qanJobAssemblyID,
JobOperationID = a.qanJobOperationID,
PartID = a.qanPartID.Trim(),
PartRevisionID = a.qanPartRevisionID.Trim(),
PartShortDescription = a.qanPartShortDescription.Trim(),
OpenedByEmployeeID = a.qanOpenedByEmployeeID.Trim(),
OpenedByEmployeeName = a.OpenedEmployee.lmeEmployeeName.Trim(),
OpenedDate = a.qanOpenedDate,
PartImage = a.JobAssembly.ujmaPartImage,
AssignedDate = a.qanAssignedDate,
AssignedToEmployeeID = a.qanAssignedToEmployeeID.Trim(),
AssignedToEmployeeName = a.AssignedEmployee.lmeEmployeeName.Trim()
}).ToListAsync();
}
catch (SqlException ex)
{
Debug.WriteLine("SQL Exception number = " + ex.Number);
if (!Enum.IsDefined(typeof(RetryableSqlErrors), ex.Number))
throw new M1Exception(ex.Message, ex);
retryCount++;
if (retryCount > MAX_RETRY) throw new M1Exception(ex.Message, ex); ;
Debug.WriteLine("Retrying. Count = " + retryCount);
Thread.Sleep(ex.Number == (int)RetryableSqlErrors.Timeout ?
longWait : shortWait);
}
}
}
}
}
Edit: Mostly looking for general guidance here, though a specific example of what to do would be great. For these types of operations where I am downloading data, is it just a given that if I need the application to be responsive I need to be making multiple threads? Is that a common solution to this type of problem? Is this not something I should be expecting async/await to solve?
If you call this method from your UI thread, you will overload the capture of UI thread context and back on itself. Also, your service will not be necessarily "Performant" because it must wait until the UI thread is free before it can continue.
The solution is simple: just call the method passing the ConfigureAwait "false" parameter when you made the call.
.ToListAsync().ConfigureAwaiter(false);
I hope it helps
Background: I'm writing a function putting long lasting operations in a queue, using C#,
and each operation is kind of divided into 3 steps:
1. database operation (update/delete/add data)
2. long time calculation using web service
3. database operation (save the calculation result of step 2) on the same db table in step 1, and check the consistency of the db table, e.g., the items are the same in step 1 (Pls see below for a more detailed example)
In order to avoid dirty data or corruptions, I use a lock object (a static singleton object) to ensure the 3 steps to be done as a whole transaction. Because when multiple users are calling the function to do operations, they may modify the same db table at different steps during their own operations without this lock, e.g., user2 is deleting item A in his step1, while user1 is checking if A still exists in his step 3. (additional info: Meanwhile I'm using TransactionScope from Entity framework to ensure each database operation as a transaction, but as repeat readable.)
However, I need to put this to a cloud computing platform which uses load balancing mechanism, so actually my lock object won't take effect, because the function will be deployed on different servers.
Question: what can I do to make my lock object working under above circumstance?
This is a tricky problem - you need a distributed lock, or some sort of shared state.
Since you already have the database, you could change your implementation from a "static C# lock" and instead the database to manage your lock for you over the whole "transaction".
You don't say what database you are using, but if it's SQL Server, then you can use an application lock to achieve this. This lets you explicitly "lock" an object, and all other clients will wait until that object is unlocked. Check out:
http://technet.microsoft.com/en-us/library/ms189823.aspx
I've coded up an example implementation below. Start two instances to test it out.
using System;
using System.Data;
using System.Data.SqlClient;
using System.Transactions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var locker = new SqlApplicationLock("MyAceApplication",
"Server=xxx;Database=scratch;User Id=xx;Password=xxx;");
Console.WriteLine("Aquiring the lock");
using (locker.TakeLock(TimeSpan.FromMinutes(2)))
{
Console.WriteLine("Lock Aquired, doing work which no one else can do. Press any key to release the lock.");
Console.ReadKey();
}
Console.WriteLine("Lock Released");
}
class SqlApplicationLock : IDisposable
{
private readonly String _uniqueId;
private readonly SqlConnection _sqlConnection;
private Boolean _isLockTaken = false;
public SqlApplicationLock(
String uniqueId,
String connectionString)
{
_uniqueId = uniqueId;
_sqlConnection = new SqlConnection(connectionString);
_sqlConnection.Open();
}
public IDisposable TakeLock(TimeSpan takeLockTimeout)
{
using (TransactionScope transactionScope = new TransactionScope(TransactionScopeOption.Suppress))
{
SqlCommand sqlCommand = new SqlCommand("sp_getapplock", _sqlConnection);
sqlCommand.CommandType = CommandType.StoredProcedure;
sqlCommand.CommandTimeout = (int)takeLockTimeout.TotalSeconds;
sqlCommand.Parameters.AddWithValue("Resource", _uniqueId);
sqlCommand.Parameters.AddWithValue("LockOwner", "Session");
sqlCommand.Parameters.AddWithValue("LockMode", "Exclusive");
sqlCommand.Parameters.AddWithValue("LockTimeout", (Int32)takeLockTimeout.TotalMilliseconds);
SqlParameter returnValue = sqlCommand.Parameters.Add("ReturnValue", SqlDbType.Int);
returnValue.Direction = ParameterDirection.ReturnValue;
sqlCommand.ExecuteNonQuery();
if ((int)returnValue.Value < 0)
{
throw new Exception(String.Format("sp_getapplock failed with errorCode '{0}'",
returnValue.Value));
}
_isLockTaken = true;
transactionScope.Complete();
}
return this;
}
public void ReleaseLock()
{
using (TransactionScope transactionScope = new TransactionScope(TransactionScopeOption.Suppress))
{
SqlCommand sqlCommand = new SqlCommand("sp_releaseapplock", _sqlConnection);
sqlCommand.CommandType = CommandType.StoredProcedure;
sqlCommand.Parameters.AddWithValue("Resource", _uniqueId);
sqlCommand.Parameters.AddWithValue("LockOwner", "Session");
sqlCommand.ExecuteNonQuery();
_isLockTaken = false;
transactionScope.Complete();
}
}
public void Dispose()
{
if (_isLockTaken)
{
ReleaseLock();
}
_sqlConnection.Close();
}
}
}
}
I'm dealing with a courious scenario.
I'm using EntityFramework to save (insert/update) into a SQL database in a multithreaded environment. The problem is i need to access database to see whether a register with a particular key has been already created in order to set a field value (executing) or it's new to set a different value (pending). Those registers are identified by a unique guid.
I've solved this problem by setting a lock since i do know entity will not be present in any other process, in other words, i will not have same guid in different processes and it seems to be working fine. It looks something like that:
static readonly object LockableObject = new object();
static void SaveElement(Entity e)
{
lock(LockableObject)
{
Entity e2 = Repository.FindByKey(e);
if (e2 != null)
{
Repository.Insert(e2);
}
else
{
Repository.Update(e2);
}
}
}
But this implies when i have a huge ammount of requests to be saved, they will be queued.
I wonder if there is something like that (please, take it just as an idea):
static void SaveElement(Entity e)
{
(using ThisWouldBeAClassToProtectBasedOnACondition protector = new ThisWouldBeAClassToProtectBasedOnACondition(e => e.UniqueId)
{
Entity e2 = Repository.FindByKey(e);
if (e2 != null)
{
Repository.Insert(e2);
}
else
{
Repository.Update(e2);
}
}
}
The idea would be having a kind of protection that protected based on a condition so each entity e would have its own lock based on e.UniqueId property.
Any idea?
Don't use application-locks where database transactions or constraints are needed.
The use of a lock to prevent duplicate entries in a database is not a good idea. It limits the scalability of your application be forcing only a single instance to ever exist that can add or update such records. Or worse, someone will eventually try to scale the application to multiple processes or servers and it will cause data corruption (since locks are local to a single process).
What you should consider instead is using a combination of unique constraints in the database and transactions to ensure that no two attempts to add the same entry can both succeed. One will succeed - the other will be forced to rollback.
This might work for you, you can just lock on the instance of e:
lock(e)
{
Entity e2 = Repository.FindByKey(e);
if (e2 != null)
{
Repository.Insert(e2);
}
else
{
Repository.Update(e2);
}
}