I have a WCF service which has two methods exposed:
Note: The wcf service and sql server is deployed in same machine.
Sql server has one table called employee which maintains employee information.
Read() This method retrieves all employees from sql server.
Write() This method writes (add,update,delete) employee info in employee table into sql server.
Now I have developed a desktop based application through which any client can query, add,update and delete employee information by consuming a web service.
Question:
How can I handle the scenario, if mulitple clients want update the employee information at the same time? Is the sql server itself handle this by using database locks ??
Please suggest me the best approach!!
Generally, in a disconnected environment optimistic concurrency with a rowversion/timestamp is the preferred approach. WCF does support distributed transactions, but that is a great way to introduce lengthy blocking into the system. Most ORM tools will support rowversion/timestamp out-of-the-box.
Of course, at the server you might want to use transactions (either connection-based or TransactionScope) to make individual repository methods "ACID", but I would try to avoid transactions on the wire as far as possible.
Re comments; sorry about that, I honestly didn't see those comments; sometimes stackoverflow doesn't make this easy if you get a lot of comments at once. There are two different concepts here; the waiting is a symptom of blocking, but if you have 100 clients updating the same record it is entirely appropriate to block during each transaction. To keep things simple: unless I can demonstrate a bottleneck (requiring extra work), I would start with a serializable transaction around the update operations (TransactionScope uses this by default). That way yes: you get appropriate blocking (ACID etc) for most scenarios.
However; the second issue is concurrency: if you get 100 updates for the same record, how do you know which to trust? Most systems will let the first update in, and discard the rest as they are operating on stale assumptions about the data. This is where the timestamp/rowversion come in. By enforcing "the timestamp/rowversion must match" on the UPDATE statement, you ensure that people can only update data that hasn't changed since they took their snapshot. For this purpose, it is common to keep the rowversion alongside any interesting data you are updating.
Another alternative is that you could instantiate the WCF service as a singleton (InstanceContext.Single) - which means there is only one instance of it running ever. Then, you could keep a simple object in memory for the purpose of update locking, and lock in your update method based on that object. When update calls come in from other sessions, they will have to wait until the lock is released.
Regards,
Steve
Related
I have a query on bi-directional data sync.
The scenario is, that we have ERP software running on a local network
which is developed in PowerBuilder and the database is SQL Anywhere 16, Also, we have our cloud software which is developed in .net6 and the database is Azure SQL. And also we have a Middleware developed on .net which interacts with our API and local DB. After an operation like Invoice generation, we need to keep the quantity of a product accurate the same as local DB and cloud DB. Whether the operation happened in the cloud or local network. Please share your thoughts.
The approach would depend on whether you are willing to sacrifice consistency or concurrency.
If you are willing to sacrifice consistency, which in your case would be acceptable I believe for some use cases like syncing local invoices to the cloud, your middleware could asynchronously ensure that both the databases are in sync on the side.
If you are willing to sacrifice concurrency, which would be needed to ensure quantity of a product is accurate before checking out, you would essentially use a lock to ensure the product is available and is not being checked out by someone else. This has the down side of slowing down the system since multiple requests would be waiting as previous requests are being processed.
As for the actual implementation itself, you could use a queue for each transaction which the middleware could receive and sync for the first option. And for the second option, you would need to use some kind of distributed lock for each product that your API/Middleware would need to acquire before committing changes.
Depending on the scale and business KPIs of your application, you would have to decide how to approach the second option. Here is another SO thread which has a decent discussion on various approaches based on what you are willing to sacrifice.
we are building a WinForms desktop application which talks to an SQL Server through NHibernate. After extensive research we settled on the Session / Form strategy using Ninject to inject a new ISession into each Form (or the backing controller to be precise). So far it is working decently.
Unfortunately the main Form holds a lot of data (mostly read-only) which gets stale after some time. To prevent this we implemented a background service (really just a seperate class) which polls the DB for changes and issues an event which lets the main form selectively update the changed rows.
This background service also gets a separate session to minimize interference with the other forms. Our understanding was that it is possible to open a transaction per session in parallel as long as they are not nested.
Sadly this doesn't seem to be the case and we either get an ObjectDisposedException in one of the forms or the service (because the service session used an existing transaction from on of the forms and committed it, which fails the commit in the form or the other way round) or we get an InvalidOperationException stating that "Parallel transactions are not supported by SQL Server".
Is there really no way to open more than one transaction in parallel (across separate sessions)?
And alternatively is there a better way to update stale data in a long running form?
Thanks in advance!
I'm pretty sure you have messed something up, and are sharing either session or connection instances in ways you did not intend.
It can depend a bit on which sort of transactions you use:
If you use only NHibernate transactions (session.BeginTransaction()), each session acts independently. Unless you do something special to insert your own underlying database connections (and made an error there), each session will have their own connection and transaction.
If you use TransactionScope from System.Transactions in addition to the NHibernate transactions, you need to be careful about thread handling and the TransactionScopeOption. Otherwise different parts of your code may unexpectedly share the same transaction if a single thread runs through both parts and you haven't used TransactionScopeOption.RequiresNew.
Perhaps you are not properly disposing your transactions (and sessions)?
I have this scenario, and I don't really know where to start. Suppose there's a Web service-like app (might be API tho) hosted on a server. That app receives a request to proccess some data (through some method we will call processData(data theData)).
On the other side, there's a robot (might be installed on the same server) that procceses the data. So, The web-service inserts the request on a common Database (both programms have access to it), and it's supposed to wait for that row to change and send the results back.
The robot periodically check the database for new rows, proccesses the data and set some sort of flag to that row, indicating that the data was processed.
So the main problem here is, what should the method proccessData(..) do to check for the changes of the data row?.
I know one way to do it: I can build an iteration block that checks for the row every x secs. But i don't want to do that. What I want to do is to build some sort of event listener, that triggers when the row changes. I know it might involve some asynchronous programming
I might be dreaming, but is that even possible in a web enviroment.?
I've been reading about a SqlDependency class, Async and AWait classes, etc..
Depending on how much control you have over design of this distributed system, it might be better for its architecture if you take a step back and try to think outside the domain of solutions you have narrowed the problem down to so far. You have identified the "main problem" to be finding a way for the distributed services to communicate with each other through the common database. Maybe that is a thought you should challenge.
There are many potential ways for these components to communicate and if your design goal is to reduce latency and thus avoid polling, it might in fact be the right way for the service that needs to be informed of completion of this work item to be informed of it right away. However, if in the future the throughput of this system has to increase, processing work items in bulk and instead poll for the information might become the only feasible option. This is also why I have chosen to word my answer a bit more generically and discuss the design of this distributed system more abstractly.
If after this consideration your answer remains the same and you do want immediate notification, consider having the component that processes a work item to notify the component(s) that need to be notified. As a general design principle for distributed systems, it is best to have the component that is most authoritative for a given set of data to also be the component to answer requests about that data. In this case, the data you have is the completion status of your work items, so the best component to act on this would be the component completing the work items. It might be better for that component to inform calling clients and components of that completion. Here it's also important to know if you only write this data to the database for the sake of communication between components or if those rows have any value beyond the completion of a given work item, such as for reporting purposes or performance indicators (KPIs).
I think there can be valid reasons, though, why you would not want to have such a call, such as reducing coupling between components or lack of access to communicate with the other component in a direct manner. There are many communication primitives that allow such notification, such as MSMQ under Windows, or Queues in Windows Azure. There are also reasons against it, such as dependency on a third component for communication within your system, which could reduce the availability of your system and lead to outages. The questions you might want to ask yourself here are: "How much work can my component do when everything around it goes down?" and "What are my design priorities for this system in terms of reliability and availability?"
So I think the main problem you might want to really try to solve fist is a bit more abstract: how should the interface through which components of this distributed system communicate look like?
If after all of this you remain set on having the interface of communication between those components be the SQL database, you could explore using INSERT and UPDATE triggers in SQL. You can easily look up the syntax of those commands and specify Stored Procedures that then get executed. In those stored procedures you would want to check the completion flag of any new rows and possibly restrain the number of rows you check by date or have an ID for the last processed work item. To then notify the other component, you could go as far as using the built-in stored procedure XP_cmdshell to execute command lines under Windows. The command you execute could be a simple tool that pings your service for completion of the task.
I'm sorry to have initially overlooked your suggestion to use SQL Query Notifications. That is also a feasible way and works through the Service Broker component. You would define a SqlCommand, as if normally querying your database, pass this to an instance of SqlDependency and then subscribe to the event called OnChange. Once you execute the SqlCommand, you should get calls to the event handler you added to OnChange.
I am not sure, however, how to get the exact changes to the database out of the SqlNotificationEventArgs object that will be passed to your event handler, so your query might need to be specific enough for the application to tell that the work item has completed whenever the query changes, or you might have to do another round-trip to the database from your application every time you are notified to be able to tell what exactly has changed.
Are you referring to a Message Queue? The .Net framework already provides this facility. I would say let the web service manage an application level queue. The robot will request the same web service for things to do. Assuming that the data needed for the jobs are small, you can keep the whole thing in memory. I would rather not involve a database, if you don't already have one.
Scenario: I want to let multiple (2 to 20, probably) server applications use a single database using ADO.NET. I want individual applications to be able to take ownership of sets of records in the database, hold them in memory (for speed) in DataSets, respond to client requests on the data, perform updates, and prevent other applications from updating those records until ownership has been relinquished.
I'm new to ADO.NET, but it seems like this should be possible using transactions with Data Adapters (ADO.NET disconnected layer).
Question part 1: Is that the right way to try and do this?
Question part 2: If that is the right way, can anyone point me at any tutorials or examples of this kind of approach (in C#)?
Question part 3: If I want to be able to take ownership of individual records and release them independently, am I going to need a separate transaction for each record, and by extension a separate DataAdapter and DataSet to hold each record, or is there a better way to do that? Each application will likely hold ownership of thousands of records simultaneously.
How long were you thinking of keeping the transaction open for?
How many concurrent users are you going to support?
These are two of the questions you need to ask yourself. If the answer for the former is a "long time" and the answer to the latter is "many" then the approach will probably run into problems.
So, my answer to question one is: no, it's probably not the right approach.
If you take the transactional lock approach then you are going to limit your scalability and response times. You could also run into database errors. e.g. SQL Server (assuming you are using SQL Server) can be very greedy with locks and could lock more resources than you request/expect. The application could request some row level locks to lock the records that it "owns" however SQL Server could escalate those row locks to a table lock. This would block and could result in timeouts or perhaps deadlocks.
I think the best way to meet the requirements as you've stated them is to write a lock manager/record checkout system. Martin Fowler calls this a Pessimistic Offline Lock.
UPDATE
If you are using SQL Server 2008 you can set the lock escalation behavior on a table level:
ALTER TABLE T1 SET (LOCK_ESCALATION = DISABLE);
This will disable lock escalation in "most" situations and may help you.
You actually need concurrency control,along with Transaction support.
Transaction only come into picture when you perform multiple operations on database. As soon as the connection is released the transaction is no more applicable.
concurrency lets you work with multiple updates on the same data. If two or more clients hold the same set of data and one needs to read/write the data after another client updates it, the concurrency will let you decide which set of updates to keep and which one to ignore. Mentioning the concept of concurrency is beyond the scope of this article. Checkout this article for more information.
I have a table with a heavy load(many inserts/updates/deletes) in a SQL2005 database. I'd like to do some post processing for all these changes in as close to real time as possible(asynchronously so as not to lock the table in any way). I've looked a number of possible solutions but just can't seem to find that one neat solution that feels right.
The kind of post processing is fairly heavy as well, so much so that the windows listener service is actually going to pass the processing over to a number of machines. However this part of the application is already up and running, completetly asynchronous, and not what I need help with - I just wanted to mention this simply because it affects the design decision in that we couldn't just load up some CLR object in the DB to complete the processing.
So, The simple problem remains: data changes in a table, I want to do some processing in c# code on a remote server.
At present we've come up with using a sql trigger, which executes "xp_cmdshell" to lauch an exe which raises an event which the windows service is listening for. This just feels bad.
However, other solutions I've looked at online feel rather convoluted too. For instance setting up SQLCacheDependancy also involves having to setup Service broker. Another possible solution is to use a CLR trigger, which can call a webservice, but this has so many warnings online about it being a bad way to go about it, especially when performance is critical.
Idealy we wouldn't depnd on the table changes but would rather intercept the call inside our application and notify the service from there, unfortunately though we have some legacy applications making changes to the data too, and monitoring the table is the only centralised place at the moment.
Any help would be most appreciated.
Summary:
Need to respond to table data changes in real time
Performance is critical
High volume of traffic is expected
Polling and scheduled tasks are not an option(or real time)
Implementing service broker too big (but might be only solution?)
CLR code is not yet ruled out, but needs to be perfomant if suggested
Listener / monitor may be remote machine(likely to be same phyisical network)
You really don't have that many ways to detect changes in SQL 2005. You already listed most of them.
Query Notifications. This is the technology that powers SqlDependency and its derivatives, you can read more details on The Mysterious Notification. But QN is designed to invalidate results, not to pro-actively notify change content. You will only know that the table has changes, without knowing what changed. On a busy system this will not work, as the notifications will come pretty much continously.
Log reading. This is what transactional replication uses and is the least intrusive way to detect changes. Unfortunately is only available to internal components. Even if you manage to understand the log format, the problem is that you need support from the engine to mark the log as 'in use' until you read it, or it may be overwritten. Only transactional replication can do this sort of special marking.
Data compare. Rely on timestamp columns to detect changes. Is also pull based, quite intrussive and has problems detecting deletes.
Application Layer. This is the best option in theory, unless there are changes occuring to the data outside the scope of the application, in which case it crumbles. In practice there are always changes occuring outside the scope of the application.
Triggers. Ultimately, this is the only viable option. All change mechanisms based on triggers work the same way, they queue up the change notification to a component that monitors the queue.
There are always suggestions to do a tightly coupled, synchronous notification (via xp_cmdshell, xp_olecreate, CLR, notify with WCF, you name it), but all these schemes fail in practice because they are fundamentally flawed:
- they do not account for transaction consistency and rollbacks
- they introduce availability dependencies (the OLTP system cannot proceed unless the notified component is online)
- they perform horribly as each DML operation has to wait for an RPC call of some form to complete
If the triggers do not actually actively notify the listeners, but only queue up the notifications, there is a problem in monitoring the notifications queue (when I say 'queue', I mean any table that acts as a queue). Monitoring implies pulling for new entries in the queue, which means balancing the frequency of checks correctly with the load of changes, and reacting to load spikes. This is not trivial at all, actually is very difficult. However, there is one statement in SQL server that has the semantics to block, without pulling, until changes become available: WAITFOR(RECEIVE). That means Service Broker. You mentioned SSB several times in your post, but you are, rightfuly so, scared of deploying it because of the big unknown. But the reality is that it is, by far, the best fit for the task you described.
You do not have to deploy a full SSB architecture, where the notificaition is delivered all the way to the remote service (that would require a remote SQL instance anyway, even an Express one). All you need to accomplice is to decouple the moment when the change is detected (the DML trigger) from the moment when the notification is delivered (after the change is commited). For this all you need is a local SSB queue and service. In the trigger you SEND a change notification to the local service. After the original DML transaction commits, the service procedure activates and delivers the notification, using CLR for instance. You can see an example of something similar to this at Asynchronous T-SQL.
If you go down that path there are some tricks you'll need to learn to achieve high troughput and you must understant the concept of ordered delivery of messages in SSB. I reommend you read these links:
Reusing Conversations
Writing Service Broker Procedures
SQL Connections 2007 Demo
About the means to detect changes, SQL 2008 apparently adds new options: Change Data Capture and Change Tracking. I emphasizes 'apparently', since they are not really new technologies. CDC uses log reader and is based on the existing Transactional replication mechanisms. CT uses triggers and is very similar to existing Merge replication mechanisms. They are both intended for occasionally connected systems that need to sync up and hence not appropiate for real-time change notification. They can populate the change tables, but you are left with the task to monitor these tables for changes, which is exactly from where you started.
This could be done in many ways. below method is simple since you dont want to use CLR triggers and sqlcmd options.
Instead of using CLR triggers you can create the normal insert trigger which updates the dedicated tracking table on each insert.
And develop dedicated window service which actively polls on the tracking table and update the remote service if there is any change in the data and set the status in tracking table to done (so it wont be picked again)..
EDIT:
I think Microsoft sync services for ADO.Net can work for you. Check out the below links. It may help you
How to: Use SQL Server Change Tracking - sql server 2008
Use a Custom Change Tracking System - below sql server 2008
In similar circumstances we are using CLR trigger that is writing messages to the queue (MSMQ). Service written in C# is monitoring the queue and doing post-processing.
In our case it is all done on the same server, but you can send those messages directly to the remote queue, on a different machine, totally bypassing "local listener".
The code called from trigger looks like this:
public static void SendMsmqMessage(string queueName, string data)
{
//Define the queue path based on the input parameter.
string QueuePath = String.Format(".\\private$\\{0}", queueName);
try
{
if (!MessageQueue.Exists(QueuePath))
MessageQueue.Create(QueuePath);
//Open the queue with the Send access mode
MessageQueue MSMQueue = new MessageQueue(QueuePath, QueueAccessMode.Send);
//Define the queue message formatting and create message
BinaryMessageFormatter MessageFormatter = new BinaryMessageFormatter();
Message MSMQMessage = new Message(data, MessageFormatter);
MSMQueue.Send(MSMQMessage);
}
catch (Exception x)
{
// async logging: gotta return from the trigger ASAP
System.Threading.ThreadPool.QueueUserWorkItem(new WaitCallback(LogException), x);
}
}
Since you said there're many inserts running on that table, a batch processing could fit better.
Why did just create a scheduled job, which handle new data identified by a flag column, and process data in large chunks?
Use the typical trigger to fire a CLR on the database. This CLR will only start a program remotely using the Win32_Process Class:
http://motevich.blogspot.com/2007/11/execute-program-on-remote-computer.html