How to implement locking across a network

I have a desktop application. In this application there are many records that users can open and work on. If a user clicks on a record, the program will lock the record so no one else can use it. If the record is already locked, the user may still view it, but it will be read-only. Many users on our local network can open and work on records.
My first thought is to use the database to manage locks on records, but I am not sure how, or whether this is the best approach. Are there any programming patterns or ready-made solutions I can use?

I've implemented a similar system for a WPF application accessing a database. I no longer have access to the source code, so I'll try to explain it here. The route I took was somewhat different from using the database. Using a duplex WCF service, you can host a service somewhere (e.g. on the database server) to which clients connect. Key things to understand:
You can make this service generic by having some kind of data type identifier and by making sure each row type has the same type of primary key (e.g. a long). In that case, you could have a signature similar to bool AcquireLock(string dataType, long id), or replace the bool/long with bool[] and long[] if users frequently modify a larger number of rows.
On the server side, you must be able to respond to this request quickly. Consider storing the data in something along the lines of a Dictionary<string, Dictionary<User, HashSet<long>>>, where the outer string key is the data type.
When someone connects, he can receive a list of all locks for a given data type (e.g. when a screen opens that locks that type of records), while also registering to receive updates for a given data type.
The socket connection between the client and the server defines that the user is 'connected'. If the socket closes, the server releases all locks for that user, immediately notifying others that the user has lost his lock, making the record available again for editing. (This covers scenarios such as a user disconnecting or killing a process.)
To avoid concurrency issues, make sure a user has acquired the lock before allowing him to make any changes (e.g. in BeginEdit, check with the server first; implementing IEditableObject on your view model gives you that hook).
When a lock is released, the client tells the server if he made changes to the row, so that other clients can update the respective data. When the socket disconnects, assume no changes.
Nice feature to add: when providing users with a list / update of locks, also provide the user id, so that people can see who is working on what.
This form of 'real time concurrency' provides a much better user experience than providing a way to handle optimistic concurrency problems, and might also be technically easier to implement, depending on your scenario.
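A minimal sketch of the server-side lock table described above, assuming users are identified by a plain string and leaving the WCF plumbing out entirely. The class and member names (LockRegistry, ReleaseAll, LockChanged) are made up for illustration and are not from the original system. It also keys the inner dictionary by record id rather than by user, which is a deliberate simplification of the structure suggested above so that an acquire is a single lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical server-side lock table; all names are illustrative.
public sealed class LockRegistry
{
    private readonly object _gate = new object();

    // dataType -> (record id -> user holding the lock)
    private readonly Dictionary<string, Dictionary<long, string>> _locks =
        new Dictionary<string, Dictionary<long, string>>();

    // Raised after a lock is granted or released: (dataType, id, user, isLocked).
    public event Action<string, long, string, bool> LockChanged;

    // Returns true if the user now holds the lock (or already held it).
    public bool AcquireLock(string dataType, long id, string user)
    {
        lock (_gate)
        {
            if (!_locks.TryGetValue(dataType, out var byId))
            {
                byId = new Dictionary<long, string>();
                _locks[dataType] = byId;
            }
            if (byId.TryGetValue(id, out var owner))
                return owner == user;
            byId[id] = user;
        }
        LockChanged?.Invoke(dataType, id, user, true);
        return true;
    }

    // Called when a user's connection drops: release everything that user held.
    public int ReleaseAll(string user)
    {
        var released = new List<KeyValuePair<string, long>>();
        lock (_gate)
        {
            foreach (var entry in _locks)
                foreach (var pair in entry.Value)
                    if (pair.Value == user)
                        released.Add(new KeyValuePair<string, long>(entry.Key, pair.Key));
            foreach (var r in released)
                _locks[r.Key].Remove(r.Value);
        }
        foreach (var r in released)
            LockChanged?.Invoke(r.Key, r.Value, user, false);
        return released.Count;
    }

    // Snapshot of the current locks for one data type (id -> user),
    // e.g. sent to a client when a screen for that data type opens.
    public Dictionary<long, string> GetLocks(string dataType)
    {
        lock (_gate)
        {
            return _locks.TryGetValue(dataType, out var byId)
                ? new Dictionary<long, string>(byId)
                : new Dictionary<long, string>();
        }
    }
}
```

In a duplex service, the handler for a closed or faulted channel would call ReleaseAll for that user, and the LockChanged event would be forwarded to the callback channels of the clients subscribed to that data type.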

Related

What is the best Method for monitoring a large number of clients reliably with good performance

This is more of a programming strategy and direction question, than the actual code itself.
I am programming in C-Sharp.
I have an application that remotely starts processes on many different clients on the network, could be up to 1000 clients in theory.
It then monitors the status of the remote processes by reading a log file on each client.
I currently do this by running one thread that loops through all of the clients in a list, and reading the log file. It works fine for 10 or 20 machines, but 1000 would probably be untenable.
There are several problems with this approach:
First, if the thread doesn’t finish reading all of the client statuses before it’s called again, the client statuses at the end of the list might not be read and updated.
Secondly, if any client in the list goes offline during this period, the updating hangs, until that client is back online again.
So I require a different approach, and have thought up a few possible ways to resolve this.
1. Spawn a separate thread for each client, to read its log file and update its progress.
   a. However, I’m not sure if having 1000 threads running on my machine is something that would be acceptable.
2. Test the connection to each machine first, before trying to read the file, and if it cannot connect, then just ignore it for that iteration and move on to the next client in the list.
   a. This still has the same problem of not getting through the list before the next call, and it causes more delay because it tests the connection via a port first. With 1000 clients, this would be noticeable.
3. Have each client send the data to the machine running the application whenever there is an update.
   a. This could create a lot of chatter with 1000 machines trying to send data repeatedly.
So I’m trying to figure out whether there is another, more efficient and reliable method that I haven’t considered, or which one of these would be the best.
Right now I’m leaning towards having the clients send updates to the application, instead of having the application pulling the data.
Looking for thoughts, concerns, ideas and recommendations.

In my opinion, you are doing this (monitoring) the wrong way. Instead of keeping all logs in text files, you'd be better off keeping them in a central data repository, which can be of any kind. Given that you are monitoring the performance of those systems, your design and the mechanism behind it must not negatively impact the performance of the target systems, and with the current design the disk and CPU can be involved so heavily in certain cases that monitoring becomes a performance issue in itself.
I recommend creating a log repository server using a fast in-memory database like Redis and sending logged data directly to that server. Keep in mind that this database should run on a different virtual machine. You can then tune Redis to persist received data to disk once a particular number of entries is reached or a particular interval elapses. The in-memory feature is advantageous here, as you may need to query information a lot in a monitoring application like this. Moreover, Redis is fast enough to handle millions of entries efficiently.
The blueprint is:
1- Centralize all log data in a single repository.
2- Configure clients to send monitored information to the centralized repository.
3- Read the data from the centralized repository by the main server (monitoring system) when required.
I'm not trying to advertise a particular tool here; I'm only sharing my own experience. There are many more tools that you can use for this purpose, such as ElasticSearch.
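To make the blueprint a bit more concrete, here is a minimal sketch of the "central repository" idea using a plain in-memory dictionary as a stand-in for Redis or any other store. StatusBoard, Report and FindStale are hypothetical names invented for this illustration. It also shows one way to spot clients that have gone quiet, which addresses the hang-on-offline-client problem from the question without the monitor ever connecting to a client.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the central repository.
public sealed class StatusBoard
{
    private readonly ConcurrentDictionary<string, (string Status, DateTime SeenUtc)> _latest =
        new ConcurrentDictionary<string, (string, DateTime)>();

    // Called whenever a client pushes an update; keeps only the newest status per client.
    public void Report(string clientId, string status, DateTime nowUtc)
    {
        _latest[clientId] = (status, nowUtc);
    }

    // Read side used by the monitoring application.
    public bool TryGetStatus(string clientId, out string status)
    {
        if (_latest.TryGetValue(clientId, out var entry))
        {
            status = entry.Status;
            return true;
        }
        status = null;
        return false;
    }

    // Clients that have not reported within maxAge are treated as possibly offline.
    public List<string> FindStale(DateTime nowUtc, TimeSpan maxAge)
    {
        return _latest
            .Where(kv => nowUtc - kv.Value.SeenUtc > maxAge)
            .Select(kv => kv.Key)
            .OrderBy(id => id)
            .ToList();
    }
}
```

Passing the current time in as a parameter is only there to keep the sketch easy to test; a real service would more likely read the clock itself.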

Listening events in a web service or API over Database changes

I have this scenario, and I don't really know where to start. Suppose there's a web-service-like app (it might be an API, though) hosted on a server. That app receives a request to process some data (through some method we will call processData(data theData)).
On the other side, there's a robot (it might be installed on the same server) that processes the data. So the web service inserts the request into a common database (both programs have access to it), and it's supposed to wait for that row to change and send the results back.
The robot periodically checks the database for new rows, processes the data and sets some sort of flag on that row, indicating that the data was processed.
So the main problem here is: what should the method processData(..) do to check for changes to the data row?
I know one way to do it: I can build an iteration block that checks for the row every x seconds. But I don't want to do that. What I want to do is build some sort of event listener that triggers when the row changes. I know it might involve some asynchronous programming.
I might be dreaming, but is that even possible in a web environment?
I've been reading about the SqlDependency class, async and await, etc.

Depending on how much control you have over design of this distributed system, it might be better for its architecture if you take a step back and try to think outside the domain of solutions you have narrowed the problem down to so far. You have identified the "main problem" to be finding a way for the distributed services to communicate with each other through the common database. Maybe that is a thought you should challenge.
There are many potential ways for these components to communicate, and if your design goal is to reduce latency and thus avoid polling, it might in fact be right for the service that needs to know about the completion of this work item to be informed of it right away. However, if the throughput of this system has to increase in the future, processing work items in bulk and polling for the information instead might become the only feasible option. This is also why I have chosen to word my answer a bit more generically and discuss the design of this distributed system more abstractly.
If after this consideration your answer remains the same and you do want immediate notification, consider having the component that processes a work item notify the component(s) that need to be notified. As a general design principle for distributed systems, it is best to have the component that is most authoritative for a given set of data also be the component that answers requests about that data. In this case, the data you have is the completion status of your work items, so the best component to act on this would be the component completing the work items. It might be better for that component to inform calling clients and components of that completion. Here it's also important to know whether you only write this data to the database for the sake of communication between components or whether those rows have any value beyond the completion of a given work item, such as for reporting purposes or performance indicators (KPIs).
I think there can be valid reasons, though, why you would not want to have such a call, such as reducing coupling between components or lack of access to communicate with the other component in a direct manner. There are many communication primitives that allow such notification, such as MSMQ under Windows, or Queues in Windows Azure. There are also reasons against it, such as dependency on a third component for communication within your system, which could reduce the availability of your system and lead to outages. The questions you might want to ask yourself here are: "How much work can my component do when everything around it goes down?" and "What are my design priorities for this system in terms of reliability and availability?"
So I think the main problem you might really want to solve first is a bit more abstract: what should the interface through which the components of this distributed system communicate look like?
If after all of this you remain set on having the interface of communication between those components be the SQL database, you could explore using INSERT and UPDATE triggers in SQL. You can easily look up the syntax of those commands and specify stored procedures that then get executed. In those stored procedures you would want to check the completion flag of any new rows and possibly restrict the number of rows you check by date, or keep an ID for the last processed work item. To then notify the other component, you could go as far as using the built-in stored procedure xp_cmdshell to execute command lines under Windows. The command you execute could be a simple tool that pings your service for completion of the task.
I'm sorry to have initially overlooked your suggestion to use SQL Query Notifications. That is also a feasible way and works through the Service Broker component. You would define a SqlCommand, as if normally querying your database, pass this to an instance of SqlDependency and then subscribe to the event called OnChange. Once you execute the SqlCommand, you should get calls to the event handler you added to OnChange.
I am not sure, however, how to get the exact changes to the database out of the SqlNotificationEventArgs object that will be passed to your event handler, so your query might need to be specific enough for the application to tell that the work item has completed whenever the query changes, or you might have to do another round-trip to the database from your application every time you are notified to be able to tell what exactly has changed.

Are you referring to a message queue? The .NET Framework already provides this facility. I would say let the web service manage an application-level queue. The robot will ask the same web service for things to do. Assuming that the data needed for the jobs is small, you can keep the whole thing in memory. I would rather not involve a database if you don't already have one.
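A minimal sketch of such an application-level, in-memory queue, assuming both the request data and the result fit in a string. WorkQueue, ProcessDataAsync and TryProcessNext are hypothetical names chosen for this example. The point it tries to show is that the web method can await a task that the robot completes, so nothing on the web side has to loop and check.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical application-level queue shared by the web service and the robot.
public sealed class WorkQueue
{
    private sealed class Job
    {
        public string Data;
        public TaskCompletionSource<string> Completion;
    }

    private readonly ConcurrentQueue<Job> _pending = new ConcurrentQueue<Job>();

    // Web service side: enqueue the request and get back a task that
    // completes when the robot has processed it.
    public Task<string> ProcessDataAsync(string data)
    {
        var job = new Job
        {
            Data = data,
            Completion = new TaskCompletionSource<string>(
                TaskCreationOptions.RunContinuationsAsynchronously)
        };
        _pending.Enqueue(job);
        return job.Completion.Task;
    }

    // Robot side: take one job, process it, and signal the waiting caller.
    // Returns false when there is nothing to do.
    public bool TryProcessNext(Func<string, string> process)
    {
        if (!_pending.TryDequeue(out var job))
            return false;
        try
        {
            job.Completion.SetResult(process(job.Data));
        }
        catch (Exception ex)
        {
            // Surface the failure to the awaiting caller instead of losing it.
            job.Completion.TrySetException(ex);
        }
        return true;
    }
}
```

An async web method would simply do something like `var result = await queue.ProcessDataAsync(input);`. Everything here lives in one process and is lost on restart, so this only fits when that is acceptable.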

Consuming SQL Server data events for messaging purposes

At our organization we have a SQL Server 2005 database and a fair number of database clients: web sites (php, zope, asp.net) and rich clients (legacy fox pro). Now we need to pass certain events from the core database to other systems (MongoDb, LDAP and others). The messaging paradigm seems pretty capable of solving this kind of problem, so we decided to use the RabbitMQ broker as middleware.
The problem of consuming events from the database at first seemed to have only two possible solutions:
Poll the database for outgoing messages and pass them to a message broker.
Use triggers on certain tables to pass messages to a broker on the same machine.
I disliked the first idea due to the latency issues which arise when periodic execution of SQL is involved.
But the event-based trigger approach has a problem which seems unsolvable to me at the moment. Consider this scenario:
A row is inserted into a table.
Trigger fires and sends a message (using a CLR Stored Procedure written in C#)
Everything is OK unless the transaction which writes the data is rolled back. In this case the data will be consistent, but the message has already been sent and cannot be rolled back, because the trigger fires at the moment of writing to the database log, not at the time of transaction commit (which is correct behaviour for an RDBMS).
I realize now that I'm asking too much of triggers and they are not suitable for tasks other than working with data.
So my questions are:
Has anyone managed to extract data events using triggers?
What other methods of consuming data events can you advise?
Is Query Notification (built on top of Service Broker) suitable in my situation?
Thanks in advance!

Let's first cut the obvious misfit out of the equation: Query Notification is not the right technology for this, because it is designed to address cache invalidation of relatively stable data. With QN you'll only know that the table has changed, but you won't be able to know what has changed.
Kudos to you for figuring out why triggers invoking SQLCLR won't work: the consistency is broken on rollback.
So what does work? Consider this: BizTalk Server. In other words, there is an entire business built around this problem space, and solutions are far from trivial (otherwise nobody would buy such products).
You can get quite far though following a few principles:
decoupling. Event-based triggers are OK, but do not send the message from the trigger. Aside from the consistency issue on rollback, you also have the latency issue of having every DML operation now wait for an external API call (the RabbitMQ send) and the availability issue of the external API call failing (if RabbitMQ is unavailable, your DB is unavailable). The solution is to have the trigger use ordinary tables as queues: the trigger enqueues a message in the local DB queue (i.e. inserts into this table), and an external process services this queue by dequeueing the messages (i.e. deleting from the table) and forwarding them to RabbitMQ. This decouples the transaction from the RabbitMQ operation (the external process is able to see the message only if the original transaction commits), but the cost is some obvious added latency (there is an extra hop involved, the local table acting as a queue).
idempotency. Since RabbitMQ cannot enroll in distributed transactions with the database, you cannot guarantee atomicity of the DB operation (the dequeue from the local table acting as a queue) and the RabbitMQ operation (the send). Either one can succeed when the other has failed, and there is simply no way around it without explicit distributed transaction enrollment support. This implies that the application will send duplicate messages every once in a while (usually when things are already going badly for some reason), so the receivers have to cope with duplicates. And a quick heads-up: getting into explicit 'acknowledge' messages and send sequence numbers is a losing battle, as you'll quickly discover that you're reinventing TCP on top of messaging; that road is paved with bodies.
tolerance. For the same reasons as the item above, every now and then a message you believe was sent will never make it. Again, what damage this causes is entirely business specific. The issue is not how to prevent this situation (it is almost impossible...) but how to detect it, and what to do about it. No silver bullet, I'm afraid.
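As an illustration of the idempotency point, here is a minimal sketch of a receiver that tolerates duplicates, assuming each message carries a unique id assigned when the trigger inserts the row into the queue table. OutboxMessage and IdempotentHandler are hypothetical names, and the in-memory HashSet is only a stand-in: a real consumer would need to record processed ids somewhere durable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message shape: the Id is assigned once, when the row is
// enqueued in the local queue table, and travels with every redelivery.
public sealed class OutboxMessage
{
    public OutboxMessage(Guid id, string body) { Id = id; Body = body; }
    public Guid Id { get; }
    public string Body { get; }
}

// Consumer-side guard: a redelivered message is recognized by its Id and skipped.
public sealed class IdempotentHandler
{
    private readonly HashSet<Guid> _seen = new HashSet<Guid>();
    private readonly Action<OutboxMessage> _apply;

    public IdempotentHandler(Action<OutboxMessage> apply) { _apply = apply; }

    // Returns true if the message was applied, false if it was a duplicate.
    public bool Handle(OutboxMessage message)
    {
        lock (_seen)
        {
            if (_seen.Contains(message.Id))
                return false;
            _apply(message);
            _seen.Add(message.Id); // record only after the effect succeeded
            return true;
        }
    }
}
```

With this in place, the forwarding process can safely resend a message whenever it is unsure whether the previous send made it. It does nothing for the "tolerance" problem of messages that never arrive; that still needs separate detection.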
You do mention Service Broker in passing (the fact that it powers Query Notification is the least interesting aspect of it...). As a messaging platform built into SQL Server which offers Exactly Once In Order delivery guarantees and is fully transacted, it would solve all the above pain points (you can SEND from triggers with impunity, you can use Activation to solve the latency issue, you'll never see a duplicate or a missing message, there are clear error semantics) and some other pain points I did not mention before (consistency of backup/restore, as the data and the messages are on the same unit of storage, the database; consistency of HA/DR failover, as SSB supports both database mirroring and clustering; etc). The drawback, though, is that SSB is only capable of talking to another SSB service; in other words, it can only be used to exchange messages between two (or more) SQL Server instances. Any other use requires the parties to use a SQL Server to exchange messages. But if your endpoints are all SQL Server, then consider that there are some large scale deployments using Service Broker. Note that endpoints like php or asp.net can be considered SQL Server endpoints, as they are just programming layers on top of the DB API; a different endpoint would be, say, the need to send messages from handheld devices (phones) directly to the database (and even those go through a web service 99% of the time, which means they can ultimately reach a SQL Server). Another consideration is that SSB is geared toward throughput and reliable delivery, not toward low latency. It is definitely not the technology to use to get back the response in an HTTP web request, for instance. It IS the technology to use to submit for processing something triggered by a web request.

Remus's answer lays out some sound principles for generating and handling events. You can initiate the pushing of events from a trigger to achieve low latency.
You can achieve everything necessary from a trigger. We will still decouple this into two components: a trigger that generates the events and a local reader that reads the events.
The first component is the trigger.
Make a CLR trigger that prepares what needs to be done when the transaction commits.
Create a System.Transactions.IEnlistmentNotification that always agrees to be prepared, and whose void Commit(System.Transactions.Enlistment) method executes the prepared action.
In the trigger, call System.Transactions.Transaction.Current.EnlistVolatile(enlistmentNotification, System.Transactions.EnlistmentOptions.None)
You'll want your action to be short and sweet, like appending the data to a lockless queue in memory or updating some other state in memory. Don't try to communicate with other machines or processes. Don't write to a disk (if you wanted to write to a disk, just make an ordinary trigger that inserts into a queue table). You'll need to be careful to make sure your assembly is loaded only once so that any shared static state will be unique; this is easiest to do if your static state is in a top level assembly that isn't referenced by other assemblies, so no other assemblies will try to load it.
You will also need to either
initialize your state in such a way that it will be correct even if the system was restarted without sending all the previously queued messages (since a short, in-memory queue will not be durable). This means you might be resending messages, so they will need to be idempotent; or
rely on the tolerance of another component to pick up on missed messages
The second component reads the state that is updated by the trigger. Make a separate CLR component that reads from your queue or state and does whatever you need done (like send an idempotent message to a messaging system, record that it was sent, whatever). If this component can fail (hint: it can), you will need some form of tolerance, which may belong in another system. You can achieve low latency by having the trigger signal the second component when new state is available.
One architectural possibility is to have the trigger put the event in memory on commit for another low-latency component to pick up, and to have that second component send a low-latency, low-reliability copy of an idempotent message. You can pair that with a more reliable or durable messaging system, such as SSB, that will reliably and durably, but with greater latency, send the same idempotent message later.
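A minimal sketch of the enlistment described above, assuming only the System.Transactions API. CommitAction and RunOnCommit are hypothetical names. The sketch is exercised through an ordinary TransactionScope rather than inside a CLR trigger, so how it behaves when hosted in SQL Server is an assumption you would have to verify yourself.

```csharp
using System;
using System.Transactions;

// Runs an action only if the ambient transaction commits. Name is illustrative.
public sealed class CommitAction : IEnlistmentNotification
{
    private readonly Action _onCommit;

    public CommitAction(Action onCommit) { _onCommit = onCommit; }

    // Enlist in the current transaction as a volatile resource.
    public static void RunOnCommit(Action onCommit)
    {
        Transaction current = Transaction.Current;
        if (current == null)
            throw new InvalidOperationException("No ambient transaction.");
        current.EnlistVolatile(new CommitAction(onCommit), EnlistmentOptions.None);
    }

    // Phase 1: always agree to be prepared.
    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        preparingEnlistment.Prepared();
    }

    // Phase 2: the transaction committed, so perform the prepared action.
    public void Commit(Enlistment enlistment)
    {
        try { _onCommit(); }
        finally { enlistment.Done(); }
    }

    // On rollback or an in-doubt outcome nothing is executed.
    public void Rollback(Enlistment enlistment) { enlistment.Done(); }
    public void InDoubt(Enlistment enlistment) { enlistment.Done(); }
}
```

In the trigger, the call would look something like `CommitAction.RunOnCommit(() => queue.Enqueue(evt));`, where queue is whatever in-memory structure the second component reads from. As noted above, the action should stay short and should not touch the disk or other machines.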

Threads in asp.net (C#)

When a user visits an .aspx page, I need to start some background calculations in a new thread. The results of the calculations need to be stored in the user's Session, so that on a callback, the results can be retrieved. Additionally, on the callback, I need to be able to see what the status of the background calculation is. (E.g. I need to check if the calculation is finished and completed successfully, or if it is still running) How can I accomplish this?
Questions
How would I check on the status of the thread? Multiple users could have background calculations running at the same time, so I'm unsure how the process of knowing which thread belongs to which user would work.. (though in my scenario, the only thread that matters, is the thread originally started by user A -- and user A does a callback to retrieve/check on the status of that thread).
Am I correct in my assumption that passing an HttpSessionState "Session" variable for the user to the new thread will work as I expect (e.g. I can then add stuff to their Session later)?
Thanks. Also I have to say, I might be confused about something but it seems like the SO login system is different now, so I don't have access to my old account.
Edit
I'm now thinking about using the approach described in this article which basically uses a class and a Singleton to manage a list of threads. Instead of storing my data in the database (and incurring the performance penalty associated with retrieving the data, as well as the extra table, maintenance, etc in the database), I'll probably store the data in my class as well.
Edit 2
The approach mentioned in my first edit worked well. Additionally I had timers to ensure the threads, and their associated data, were both cleaned up after the corresponding timers called their cleanup methods. The Objects containing my data and the threads were stored in the Singleton class. For some applications it might be appropriate to use the database for storage but it seemed like overkill for mine, since my data is tied to a specific instance of a page, and is useless outside of that page context.

I would not expect session state to keep working in this scenario; the worker may have no idea who the user is, and even if it does (or more likely: you capture this data into the worker), nothing will store what it writes: updating the session is a step towards the end of the request pipeline, but if you aren't in the pipeline...?
I suspect you might need to store this data separately using some unique property of the user (their id or cn), or invent a GUID otherwise. On a single machine it may suffice to store this in a synchronised dictionary (or similar), but on a farm/cluster you may need to push the data down a layer to your database or state server. And fetch manually.
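For the single-machine case, a minimal sketch of such a keyed store might look like the following. BackgroundJobs, Start and GetStatus are hypothetical names, and the GUID is the "invented" key that the page would keep (for example in the user's session or a hidden field) and send back on each callback. It uses Task.Run for brevity; in ASP.NET, work started this way can be cut short when the application pool recycles, so treat it as an illustration only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical registry keyed by a GUID that the page hands back on each callback.
public sealed class BackgroundJobs<TResult>
{
    private readonly ConcurrentDictionary<Guid, Task<TResult>> _jobs =
        new ConcurrentDictionary<Guid, Task<TResult>>();

    // Starts the calculation on the thread pool and returns the key to check with.
    public Guid Start(Func<TResult> calculation)
    {
        Guid key = Guid.NewGuid();
        _jobs[key] = Task.Run(calculation);
        return key;
    }

    // Returns "Unknown", "Running", "Failed" or "Done"; result is set only for "Done".
    public string GetStatus(Guid key, out TResult result)
    {
        result = default(TResult);
        if (!_jobs.TryGetValue(key, out var task)) return "Unknown";
        if (!task.IsCompleted) return "Running";
        if (task.IsFaulted || task.IsCanceled) return "Failed";
        result = task.Result;
        return "Done";
    }

    // Drop a finished job once its result has been collected.
    public bool Remove(Guid key)
    {
        return _jobs.TryRemove(key, out _);
    }
}
```

A single static instance of this class would play the role of the synchronised dictionary; on a farm or cluster the same key would instead index rows in a database or state server.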

How to notify a windows service(c#) of a DB Table Change(sql 2005)?

I have a table with a heavy load (many inserts/updates/deletes) in a SQL 2005 database. I'd like to do some post-processing for all these changes in as close to real time as possible (asynchronously, so as not to lock the table in any way). I've looked at a number of possible solutions but just can't seem to find that one neat solution that feels right.
The post-processing is fairly heavy as well, so much so that the Windows listener service is actually going to pass the processing over to a number of machines. However, this part of the application is already up and running, completely asynchronous, and not what I need help with. I mention it only because it affects the design decision, in that we couldn't just load up some CLR object in the DB to complete the processing.
So, the simple problem remains: data changes in a table, and I want to do some processing in C# code on a remote server.
At present we've come up with using a SQL trigger which executes "xp_cmdshell" to launch an exe, which raises an event that the Windows service is listening for. This just feels bad.
However, other solutions I've looked at online feel rather convoluted too. For instance, setting up SqlCacheDependency also involves having to set up Service Broker. Another possible solution is to use a CLR trigger which can call a web service, but there are so many warnings online about this being a bad way to go about it, especially when performance is critical.
Ideally we wouldn't depend on the table changes but would rather intercept the call inside our application and notify the service from there. Unfortunately, we have some legacy applications making changes to the data too, and monitoring the table is the only centralised place at the moment.
Any help would be most appreciated.
Summary:
Need to respond to table data changes in real time
Performance is critical
High volume of traffic is expected
Polling and scheduled tasks are not an option (not real time)
Implementing Service Broker seems too big (but might be the only solution?)
CLR code is not yet ruled out, but needs to be performant if suggested
Listener / monitor may be a remote machine (likely to be on the same physical network)

You really don't have that many ways to detect changes in SQL 2005. You already listed most of them.
Query Notifications. This is the technology that powers SqlDependency and its derivatives; you can read more details in The Mysterious Notification. But QN is designed to invalidate results, not to proactively notify change content. You will only know that the table has changed, without knowing what changed. On a busy system this will not work, as the notifications will come pretty much continuously.
Log reading. This is what transactional replication uses and is the least intrusive way to detect changes. Unfortunately, it is only available to internal components. Even if you manage to understand the log format, the problem is that you need support from the engine to mark the log as 'in use' until you read it, or it may be overwritten. Only transactional replication can do this sort of special marking.
Data compare. Rely on timestamp columns to detect changes. It is also pull based, quite intrusive, and has problems detecting deletes.
Application Layer. This is the best option in theory, unless there are changes occurring to the data outside the scope of the application, in which case it crumbles. In practice there are always changes occurring outside the scope of the application.
Triggers. Ultimately, this is the only viable option. All change mechanisms based on triggers work the same way, they queue up the change notification to a component that monitors the queue.
There are always suggestions to do a tightly coupled, synchronous notification (via xp_cmdshell, xp_olecreate, CLR, notify with WCF, you name it), but all these schemes fail in practice because they are fundamentally flawed:
- they do not account for transaction consistency and rollbacks
- they introduce availability dependencies (the OLTP system cannot proceed unless the notified component is online)
- they perform horribly as each DML operation has to wait for an RPC call of some form to complete
If the triggers do not actively notify the listeners, but only queue up the notifications, there is the problem of monitoring the notifications queue (when I say 'queue', I mean any table that acts as a queue). Monitoring implies polling for new entries in the queue, which means balancing the frequency of checks correctly against the load of changes, and reacting to load spikes. This is not trivial at all; it is actually very difficult. However, there is one statement in SQL Server that has the semantics to block, without polling, until changes become available: WAITFOR(RECEIVE). That means Service Broker. You mentioned SSB several times in your post, but you are, rightfully so, scared of deploying it because of the big unknown. But the reality is that it is, by far, the best fit for the task you described.
You do not have to deploy a full SSB architecture, where the notification is delivered all the way to the remote service (that would require a remote SQL instance anyway, even an Express one). All you need to accomplish is to decouple the moment when the change is detected (the DML trigger) from the moment when the notification is delivered (after the change is committed). For this, all you need is a local SSB queue and service. In the trigger you SEND a change notification to the local service. After the original DML transaction commits, the service procedure activates and delivers the notification, using CLR for instance. You can see an example of something similar to this at Asynchronous T-SQL.
If you go down that path, there are some tricks you'll need to learn to achieve high throughput, and you must understand the concept of ordered delivery of messages in SSB. I recommend you read these links:
Reusing Conversations
Writing Service Broker Procedures
SQL Connections 2007 Demo
About the means to detect changes: SQL 2008 apparently adds new options, Change Data Capture and Change Tracking. I emphasize 'apparently', since they are not really new technologies. CDC uses the log reader and is based on the existing transactional replication mechanisms. CT uses triggers and is very similar to the existing merge replication mechanisms. They are both intended for occasionally connected systems that need to sync up, and hence are not appropriate for real-time change notification. They can populate the change tables, but you are left with the task of monitoring these tables for changes, which is exactly where you started.

This could be done in many ways. The method below is simple, since you don't want to use CLR triggers or sqlcmd options.
Instead of using CLR triggers, you can create a normal insert trigger which updates a dedicated tracking table on each insert.
Then develop a dedicated Windows service which actively polls the tracking table, updates the remote service if there is any change in the data, and sets the status in the tracking table to done (so the row won't be picked up again).
EDIT:
I think Microsoft sync services for ADO.Net can work for you. Check out the below links. It may help you
How to: Use SQL Server Change Tracking - sql server 2008
Use a Custom Change Tracking System - below sql server 2008

In similar circumstances we use a CLR trigger that writes messages to a queue (MSMQ). A service written in C# monitors the queue and does the post-processing.
In our case it is all done on the same server, but you can send those messages directly to the remote queue, on a different machine, totally bypassing "local listener".
The code called from trigger looks like this:
using System;
using System.Messaging;
using System.Threading;

public static void SendMsmqMessage(string queueName, string data)
{
    // Define the queue path based on the input parameter.
    string QueuePath = String.Format(".\\private$\\{0}", queueName);
    try
    {
        if (!MessageQueue.Exists(QueuePath))
            MessageQueue.Create(QueuePath);
        // Open the queue with the Send access mode; dispose it when done.
        using (MessageQueue MSMQueue = new MessageQueue(QueuePath, QueueAccessMode.Send))
        {
            // Define the queue message formatting and create the message.
            BinaryMessageFormatter MessageFormatter = new BinaryMessageFormatter();
            Message MSMQMessage = new Message(data, MessageFormatter);
            MSMQueue.Send(MSMQMessage);
        }
    }
    catch (Exception x)
    {
        // Async logging: gotta return from the trigger ASAP.
        // LogException(object state) is defined elsewhere and not shown here.
        ThreadPool.QueueUserWorkItem(new WaitCallback(LogException), x);
    }
}

Since you said there are many inserts running on that table, batch processing could fit better.
Why not just create a scheduled job that handles new data identified by a flag column and processes it in large chunks?

Use a typical trigger to fire a CLR procedure on the database. This CLR code only starts a program remotely using the Win32_Process class:
http://motevich.blogspot.com/2007/11/execute-program-on-remote-computer.html
