CQRS: run handlers in a specific order - C#

I need a guaranteed running order for two handlers, one deleting and one reordering pictures, and I would like some advice on the best solution.
On the UI some pictures are deleted; the user clicks the delete button. The whole flow, from the delete command up to an event handler which actually deletes the physical files, is started.
Then immediately the user sorts the remaining pictures. A new flow, from the reorder command up to the reordering event handler for the file system, fires again.
Already there is a concurrency problem: the reordering cannot be applied correctly without the deletion being done. At the moment this problem is handled with a sort of lock. A temp file is created and then deleted at the end of the deletion flow. While that file exists, the other thread (reordering or deletion, depending on the user's actions) waits.
This is not an ideal solution and I would like to change it.
The replacement must also be pretty fast (of course the current one is not), as the UI is updated through a JSON call at the end of the reordering.
In a later implementation we are thinking of using a queue of events, but for the moment we are pretty stuck.
Any idea would be appreciated!
Thank you, mosu'!
Edit:
Other eventual consistency problems that we had were solved by using a JavaScript data manager on the client side. Basically being optimistic and tricking the user! :)
I'm starting to believe this is the way to go here as well. But then how would I know when the data has changed in the file system?

Max's suggestions are very welcome, and normally they apply.
It is sometimes hard to explain all the details of an implementation, but there is one detail that should be mentioned:
The way we store the pictures means that when they are reordered, all picture paths (and thus all links) change.
A colleague had the very good idea of simply removing this part: even if the order changes, the path of each picture remains the same. On the UI side there is a mapping between a picture's index in the display order and its path, which means there is no need to change the file system anymore, except when deleting.
As we want to be as permissive as possible with our users this is the best solution for us.
I think, in general, it is also a good approach when there appears to be a concurrency issue. Can the concurrency be removed?

Here is one thought on this.
What exactly are you reordering? Pictures? Based on, say, date?
Why is there a command for this? Is the result of this command going to be seen by everyone, or just by this particular user?
I can only guess, but it looks like you've got a presentation question here. There is no need to store pictures in some order on the write side; it's just a list of names and links to the file storage. What you should do is store a little field somewhere in the user settings or collection settings: date ascending or name descending, say. So your Reorder command should change only this little field. Then when you are loading the gallery, this field should be read first, and based on it you load one view or another. Since storage is cheap nowadays, you can keep differently sorted collections on the read side for every sort parameter you need.
To sum up: the Delete command changes the collection on the write side, but the Reorder command is just a user or collection setting. Hence, there is no concurrency here.
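For illustration, here is a minimal sketch of that idea, where the Reorder command handler only persists a sort preference; every type and member name below is invented for the example, not taken from the original system:

using System;

public enum SortOrder { DateAscending, DateDescending, NameAscending, NameDescending }

public class ReorderCommand
{
    public Guid UserId;
    public Guid CollectionId;
    public SortOrder SortOrder;
}

public interface IUserSettingsStore // hypothetical settings store
{
    void SaveSortOrder(Guid userId, Guid collectionId, SortOrder order);
}

public class ReorderCommandHandler
{
    private readonly IUserSettingsStore settings;

    public ReorderCommandHandler(IUserSettingsStore settings)
    {
        this.settings = settings;
    }

    public void Handle(ReorderCommand command)
    {
        // Neither the aggregate nor the file system is touched: reordering
        // is just a view preference, so there is nothing to conflict with.
        settings.SaveSortOrder(command.UserId, command.CollectionId, command.SortOrder);
    }
}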
Update
Based on your comments and clarifications.
Of course you can, and probably should, restrict the user to one action at a time, if the time for deletion and reordering is reasonably short. It's always a question of what type of user experience you are asked to achieve. Take the usual example of an ordering system. After an order is placed, the user can almost immediately see it in the UI, with a status of something like InProcess. Most likely you won't let the user change the order in any way, which means you are not going to show any user controls like a Cancel button (of course this is just an example). Hence, you can use this approach here.
If two users can modify the same physical collection, you have no choice here - you are working with shared data and there has to be some kind of synchronization. For instance, if you are using sagas, there can be a couple of sagas - a collection-reordering saga and a deletion saga - and they can cooperate. If the deletion process starts first, the collection aggregate is marked as 'deletion in progress', and when the reordering saga starts right after it and attempts to begin the reordering process, it sees the deletion saga is in process, so it should wait for the DeletedEvent and continue the process afterwards. The same applies if the reordering operation starts first - the deletion saga should wait until the corresponding event arrives and continue after that.
Update
OK, so we agreed not to touch the file system itself, but rather the aggregate which represents the picture collection. The most important concurrency issues can then be solved with an optimistic concurrency approach - in the data storage, a unique constraint based on aggregate id and aggregate version is usually used.
Here is the typical sequence of steps a command handler follows (sketched in code below):
1. Validate the command on its own merits.
2. Load the aggregate.
3. Validate the command on the current state of the aggregate.
4. Create a new event, apply the event to the aggregate in memory.
5. Attempt to persist the aggregate. If there's a concurrency conflict during this step, either give up, or retry from step 2.
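A minimal sketch of these steps, assuming an event-sourced aggregate and a repository whose Save relies on the unique (aggregate id, version) constraint mentioned above; all names, including ConcurrencyException, are invented for the example:

using System;

public class ReorderPicturesCommand
{
    public Guid CollectionId;
    public string[] NewOrder;
}

public class PicturesReordered
{
    public readonly Guid CollectionId;
    public readonly string[] NewOrder;
    public PicturesReordered(Guid id, string[] order) { CollectionId = id; NewOrder = order; }
}

public class ConcurrencyException : Exception { }

public class PictureCollection
{
    public bool IsMarkedForDeletion;
    public void Apply(PicturesReordered e) { /* update in-memory state */ }
}

public interface IPictureCollectionRepository
{
    PictureCollection Load(Guid id);
    void Save(PictureCollection collection, PicturesReordered newEvent); // throws ConcurrencyException
}

public class ReorderPicturesHandler
{
    private readonly IPictureCollectionRepository repository;

    public ReorderPicturesHandler(IPictureCollectionRepository repository)
    {
        this.repository = repository;
    }

    public void Handle(ReorderPicturesCommand command)
    {
        // Step 1: validate the command on its own merits.
        if (command.NewOrder == null || command.NewOrder.Length == 0)
            throw new ArgumentException("No ordering supplied.");

        for (int attempt = 0; attempt < 3; attempt++)
        {
            // Step 2: load the aggregate.
            PictureCollection collection = repository.Load(command.CollectionId);

            // Step 3: validate the command on the current state of the aggregate.
            if (collection.IsMarkedForDeletion)
                throw new InvalidOperationException("Collection is being deleted.");

            // Step 4: create a new event and apply it to the aggregate in memory.
            var reordered = new PicturesReordered(command.CollectionId, command.NewOrder);
            collection.Apply(reordered);

            try
            {
                // Step 5: persist; the unique (id, version) constraint turns
                // a lost update into a ConcurrencyException.
                repository.Save(collection, reordered);
                return;
            }
            catch (ConcurrencyException)
            {
                // Another handler won the race; retry from step 2.
            }
        }
        throw new ConcurrencyException(); // gave up after 3 attempts
    }
}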
Here is the link which helped me a lot some time ago: http://www.cqrs.nu/

Related

Listening for events in a web service or API on database changes

I have this scenario, and I don't really know where to start. Suppose there's a web-service-like app (it might be an API, though) hosted on a server. That app receives a request to process some data (through some method we will call processData(data theData)).
On the other side, there's a robot (it might be installed on the same server) that processes the data. So the web service inserts the request into a common database (both programs have access to it), and it's supposed to wait for that row to change and then send the results back.
The robot periodically checks the database for new rows, processes the data and sets some sort of flag on the row, indicating that the data was processed.
So the main problem here is: what should the method processData(..) do to check for changes to the data row?
I know one way to do it: I can build an iteration block that checks for the row every x seconds. But I don't want to do that. What I want to do is build some sort of event listener that triggers when the row changes. I know it might involve some asynchronous programming.
I might be dreaming, but is that even possible in a web environment?
I've been reading about the SqlDependency class, async and await, etc.
Depending on how much control you have over design of this distributed system, it might be better for its architecture if you take a step back and try to think outside the domain of solutions you have narrowed the problem down to so far. You have identified the "main problem" to be finding a way for the distributed services to communicate with each other through the common database. Maybe that is a thought you should challenge.
There are many potential ways for these components to communicate, and if your design goal is to reduce latency and thus avoid polling, it might in fact be right for the service that needs to be informed of completion of this work item to be informed of it right away. However, if in the future the throughput of this system has to increase, processing work items in bulk and polling for the information instead might become the only feasible option. This is also why I have chosen to word my answer a bit more generically and discuss the design of this distributed system more abstractly.
If after this consideration your answer remains the same and you do want immediate notification, consider having the component that processes a work item to notify the component(s) that need to be notified. As a general design principle for distributed systems, it is best to have the component that is most authoritative for a given set of data to also be the component to answer requests about that data. In this case, the data you have is the completion status of your work items, so the best component to act on this would be the component completing the work items. It might be better for that component to inform calling clients and components of that completion. Here it's also important to know if you only write this data to the database for the sake of communication between components or if those rows have any value beyond the completion of a given work item, such as for reporting purposes or performance indicators (KPIs).
I think there can be valid reasons, though, why you would not want to have such a call, such as reducing coupling between components or lack of access to communicate with the other component in a direct manner. There are many communication primitives that allow such notification, such as MSMQ under Windows, or Queues in Windows Azure. There are also reasons against it, such as dependency on a third component for communication within your system, which could reduce the availability of your system and lead to outages. The questions you might want to ask yourself here are: "How much work can my component do when everything around it goes down?" and "What are my design priorities for this system in terms of reliability and availability?"
So I think the main problem you might want to really try to solve first is a bit more abstract: what should the interface through which components of this distributed system communicate look like?
If after all of this you remain set on having the interface of communication between those components be the SQL database, you could explore using INSERT and UPDATE triggers in SQL. You can easily look up the syntax of those commands and specify stored procedures that then get executed. In those stored procedures you would want to check the completion flag of any new rows, and possibly restrict the number of rows you check by date or by keeping an ID of the last processed work item. To then notify the other component, you could go as far as using the built-in stored procedure xp_cmdshell to execute command lines under Windows. The command you execute could be a simple tool that pings your service about completion of the task.
I'm sorry to have initially overlooked your suggestion to use SQL Query Notifications. That is also a feasible way and works through the Service Broker component. You would define a SqlCommand, as if normally querying your database, pass this to an instance of SqlDependency and then subscribe to the event called OnChange. Once you execute the SqlCommand, you should get calls to the event handler you added to OnChange.
I am not sure, however, how to get the exact changes to the database out of the SqlNotificationEventArgs object that will be passed to your event handler, so your query might need to be specific enough for the application to tell that the work item has completed whenever the query changes, or you might have to do another round-trip to the database from your application every time you are notified to be able to tell what exactly has changed.
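For what it's worth, here is a minimal sketch of the SqlDependency wiring described above. The connection string, table and column names are placeholders, and the target database must have Service Broker enabled for query notifications to work:

using System;
using System.Data.SqlClient;

public static class WorkItemWatcher
{
    private const string ConnectionString = "..."; // placeholder

    public static void Start()
    {
        SqlDependency.Start(ConnectionString); // sets up the notification listener
        Subscribe();
    }

    private static void Subscribe()
    {
        using (var connection = new SqlConnection(ConnectionString))
        // Notification queries need two-part table names and an explicit
        // column list (no SELECT *).
        using (var command = new SqlCommand(
            "SELECT Id, Processed FROM dbo.WorkItems WHERE Processed = 1", connection))
        {
            var dependency = new SqlDependency(command);
            dependency.OnChange += OnChange;

            connection.Open();
            using (command.ExecuteReader())
            {
                // Executing the command registers the subscription.
            }
        }
    }

    private static void OnChange(object sender, SqlNotificationEventArgs e)
    {
        // e.Info describes the kind of change, not the changed rows, so
        // re-query here to find the completed work items. A notification
        // fires only once, so re-subscribe for the next change.
        Subscribe();
    }
}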
Are you referring to a message queue? The .NET Framework already provides this facility. I would say let the web service manage an application-level queue. The robot will ask the same web service for things to do. Assuming that the data needed for the jobs is small, you can keep the whole thing in memory. I would rather not involve a database if you don't already have one.
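As a rough illustration of such an application-level queue (the Job type and all names are invented, and being in memory, the queue does not survive an app-pool recycle):

using System.Collections.Concurrent;

public class Job
{
    public string Data;
}

public static class JobQueue
{
    private static readonly BlockingCollection<Job> jobs =
        new BlockingCollection<Job>();

    // Called by the web service when a processData request arrives.
    public static void Enqueue(Job job)
    {
        jobs.Add(job);
    }

    // Called when the robot asks the web service for work;
    // blocks until a job is available.
    public static Job TakeNext()
    {
        return jobs.Take();
    }
}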

How to know which plug-in/workflow/entity updates data

In my plugin I have code that checks the execution context Depth to avoid an infinite loop once the plugin updates its own entity, but there are cases where the entity is being updated from another plugin or workflow with depth 2, 3 or 4, and for those specific calls, from that specific plugin, I want to process the call and not stop even if the Depth is bigger than 1.
Perhaps a different approach might be better? I've never needed to consider Depth in my plug-ins. I've heard of other people doing the same as you (checking the depth to stop code from running twice or more), but I usually avoid this by making any changes to the underlying entity in the pre-operation stage.
If, for example, I have code that changes the name of an opportunity whenever the opportunity is updated, then by putting my code in the post-operation stage of the Update, my code would react to the user changing a value by sending a separate Update request back to the platform to apply the change. This new Update itself causes my plug-in to fire again - infinite loop.
If I put my logic in the pre-operation stage, I do it differently: the user's change fires the plugin, and before the user's change is committed to the platform, my code is invoked. This time I can look at the Target that was sent in the InputParameters to the Update message. If the name attribute does not exist in the Target (i.e. it wasn't updated), I can append an attribute called name with the desired value to the Target, and this value will get carried through to the platform. In other words, I am injecting my value into the record before it is committed, thereby avoiding the need to issue another Update request. Consequently, my change causes no further plug-ins to fire.
Obviously I presume that your scenario is more complex but I'd be very surprised if it couldn't fit the same pattern.
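To illustrate the opportunity example, here is a sketch of such a pre-operation plug-in; the attribute name and the computed value are placeholders, and registration on the pre-operation stage of the Update message is assumed:

using System;
using Microsoft.Xrm.Sdk;

public class RenameOpportunityPlugin : IPlugin
{
    public void Execute(IServiceProvider serviceProvider)
    {
        var context = (IPluginExecutionContext)
            serviceProvider.GetService(typeof(IPluginExecutionContext));

        if (!context.InputParameters.Contains("Target"))
            return;

        var target = (Entity)context.InputParameters["Target"];

        // If the user didn't change the name themselves, inject our value
        // into the Target before it is committed - no second Update request,
        // so the plug-in does not fire again.
        if (!target.Attributes.Contains("name"))
            target["name"] = "Value computed by plug-in"; // placeholder logic
    }
}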
I'll start by agreeing with everything that Greg said above - if possible refactor the design to avoid this situation.
If that is not possible you will need to use the IPluginExecutionContext.SharedVariables to communicate between the plug-ins.
Check for a SharedVariable at the start of your plug-in and then set/update it as appropriate. The specific design you'll use will vary based on the complexity you need to manage. I always use a string with the message and entity ID - easy enough to serialize and deserialize. That way I always know whether I'm already executing against a certain message for a specific record or not.
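A minimal sketch of that guard; the key format follows the message-plus-entity-ID idea above, and everything else is illustrative:

using System;
using Microsoft.Xrm.Sdk;

public class GuardedPlugin : IPlugin
{
    public void Execute(IServiceProvider serviceProvider)
    {
        var context = (IPluginExecutionContext)
            serviceProvider.GetService(typeof(IPluginExecutionContext));

        // One entry per message/record combination, e.g. "Update:6f7d...".
        string key = context.MessageName + ":" + context.PrimaryEntityId;

        if (context.SharedVariables.Contains(key))
            return; // this message is already being handled for this record

        context.SharedVariables.Add(key, true);

        // ... actual plug-in logic goes here ...
    }
}

Depending on how the pipelines nest, you may also need to walk context.ParentContext to see variables set in a parent pipeline.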

Iterating through Event Log Entry Collection, IndexOutOfBoundsException

In a service application I am iterating through the Windows Application event log to parse events in order to react depending on the entry message.
In the case that the event log is full (Windows usually makes sure there is enough space by deleting old entries - this is configurable in the eventvwr.exe settings), the service always runs into an IndexOutOfBoundsException while iterating through the EventLog.Entries collection. No matter how I iterate (for loop, using the collection's enumerator, copying the collection into an array, ...), I can't seem to get rid of this 'bug'.
Currently, I ensure that the log is not full in order to keep the service running, by regularly parsing the event log file and deleting the last few nodes (don't beat me up, I couldn't find a better alternative...).
How can I iterate through the collection without trying to access already deleted entries?
Is there perhaps a more elegant method? I am only trying to access the logs written during the last x seconds (even LINQ failed to select those when the log is full - same exception). Could this help?
Thanks for any advice and hints
Frank
Edit: I forgot to mention that my assumption is that the loops are accessing entries which are being deleted by Windows during iteration. Basically, that is why I tried to clone the collection. Is there perhaps a way to lock the collection for a small amount of time, just for my application?
I have hit this as well, mostly on 2008 R2 domain controllers. The problem is that the logs are wrapping, so the index changes between when you started iterating the events and when you got to this point.
There doesn't seem to be a cure other than retrying.
Looking at it from a practical point of view, why is there a problem at all?
If you want to iterate over all the entries, and sometimes when you try to read an entry that doesn't really exist you get an IndexOutOfBoundsException, then just catch this exception and ignore it.
If you know what this exception means, and you know what you want to do, just handle the exception and continue working. That's what exceptions are for, after all...
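A sketch of that catch-and-skip loop; note that the exact exception type thrown when an entry has been overwritten can vary (the question reports an index-out-of-bounds error, so both likely types are caught here):

using System;
using System.Diagnostics;

public static class LogScanner
{
    public static void Scan()
    {
        var log = new EventLog("Application");
        EventLogEntryCollection entries = log.Entries;
        int count = entries.Count; // snapshot; the log may wrap while we read

        for (int i = 0; i < count; i++)
        {
            try
            {
                EventLogEntry entry = entries[i];
                // ... react depending on entry.Message here ...
            }
            catch (IndexOutOfRangeException)
            {
                // The entry was overwritten while we iterated; skip it.
            }
            catch (ArgumentException)
            {
                // Same cause: the index is no longer valid.
            }
        }
    }
}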
In case anyone finds this thread:
Avoiding this behavior doesn't seem to be possible. Even copying the collection fails and locking the file is not possible (due to system restrictions).
Instead, I implemented a periodic checking algorithm which backs up the event log and clears it at a defined usage percentage (e.g. 95%), such that an overflow or deletion should not happen.

Tracking open pages with ASP.Net

I'm sure that this question has already been asked, but I don't really see it.
Using asp.net and C#, how does one track the pages that are open/closed?
I have tried all sorts of things, including:
modifying the global.asax file application/session start/end operations
setting a page's destructor to report back to the application
static variables (which persist globally rather than on a session by session basis)
Javascript window.onload and window.onbeforeunload event handlers
It's been educational, but so far no real solution has emerged.
The reason I want to do this is to prevent multiple users from modifying the same table at the same time. That is, I have a list of links to tables, and when a user clicks to modify a table, I would like to set that link to be locked so that NO USER can then modify that table. If the user closes the table modification page, I have no way to unlock the link to that table.
You should not worry about tracking pages open or closed. Once a webpage is rendered by IIS it's as good as "closed".
What you need to do is protect against two users updating your table at the same time by using locks... For example:
using (Mutex m = new Mutex(false, "Global\\TheNameOfTheMutex"))
{
    // To wait up to 5 seconds for the other page to finish, use
    // m.WaitOne(TimeSpan.FromSeconds(5), false) instead.
    if (!m.WaitOne(TimeSpan.Zero, false))
    {
        Response.Write("Another page is updating the database.");
    }
    else
    {
        try { UpdateDatabase(); }
        finally { m.ReleaseMutex(); } // avoid leaving an abandoned mutex behind
    }
}
What the above snippet does is ensure that no other web page can call the UpdateDatabase method while another page is already running the UpdateDatabase call. So no two pages can call UpdateDatabase at the exact same time.
But this does not protect the second user from running UpdateDatabase AFTER the first call has finished, so you need to make sure your UpdateDatabase method has proper checks in place, i.e. it does not allow stale data to be updated.
I think you're going about this the wrong way...
You really should be handling your concurrency via your business layer / DB and not relying on the interface, because people can and will find a way around whatever you implement.
I would recommend storing a 'key' in your served-up page every time you serve a page that can modify the table. The key is like a versioning stamp of the last time the table was updated. Send this key along with your update and validate that they match before doing the update. If they don't, then you know someone else came along and modified that table, and you should inform the user that there was a concurrency conflict, that the data has changed, and ask whether they want to see the new data.
You should not use page requests to lock database tables. That won't work well for many reasons. Each request creates a new page object, and there are multiple application contexts, which may be on multiple threads/processes, etc. Any of which may drop off the face of the earth at any point in time.
The highest-level way of tackling this issue is to find out why you need to lock the tables in the first place. One common way to avoid locking is to accept all user table modifications and let the users resolve their conflicts.
If locking is absolutely necessary, you may do well with a lock table that is modified before and after changes. This table should have a way of expiring locks when users walk away without releasing them.
E.g. see http://www.webcheatsheet.com/php/record_locking_in_web_applications.php - it's for PHP but the concept is the same.

Preferred database/webapp concurrency design when multiple users can edit the same data

I have an ASP.NET C# business web app that is used internally. One issue we are running into as we've grown is that the original design did not account for concurrency checking - so now multiple users are accessing the same data and overwriting other users' changes. So my question is: for web apps, do people usually use a pessimistic or an optimistic concurrency system? What drives the preference for one over the other, and what are some of the design considerations to take into account?
I'm currently leaning towards an optimistic concurrency check since it seems more forgiving, but I'm concerned about the potential for multiple changes being made that would be in contradiction to each other.
Thanks!
Optimistic locking.
Pessimistic locking is harder to implement and will give problems in a web environment. What action will release the lock: closing the browser? Letting the session time out? What happens if they then do save their changes?
You don't specify which database you are using. MS SQL Server has a timestamp datatype. It has nothing to do with time, though; it is merely a number that gets changed each time the row is updated. You don't have to do anything to make sure it gets changed, you just need to check it. You can achieve something similar by using a last-modified date/time as #KM suggests, but this means you have to remember to change it each time you update the row. If you use datetime, you need a data type with sufficient precision to ensure that the value can't end up unchanged when it should have changed - for example, someone saves a row, then someone reads it, then another save happens but leaves the modified date/time unchanged. I would use timestamp unless there was a requirement to track the last-modified date on records.
To check it, you can do as #KM suggests and include it in the UPDATE statement's WHERE clause. Or you can begin a transaction, check the timestamp, and if all is well do the update and commit the transaction; if not, return a failure code or error.
Holding transactions open (as suggested by #le dorfier) is similar to pessimistic locking, but the amount of data locked may be more than a row. Most RDBMSs lock at the page level by default. You will also run into the same issues as with pessimistic locking.
You mention in your question that you are worried about conflicting updates. That is exactly what the locking will prevent: both optimistic and pessimistic locking, when properly implemented, prevent exactly that.
I agree with the first answer above: we try to use optimistic locking when the chance of collisions is fairly low. This can be easily implemented with a LastModifiedDate column or an incrementing Version column. If you are unsure about the frequency of collisions, log occurrences somewhere so you can keep an eye on them. If your records are always in 'edit' mode, having separate 'view' and 'edit' modes could help reduce collisions (assuming you reload the data when entering edit mode).
If collisions are still high, pessimistic locking is more difficult to implement in web apps, but definitely possible. We have had good success with 'leasing' records (locking with a timeout)... similar to that two-minute warning you get when you buy tickets on Ticketmaster. When a user goes into edit mode, we put a record into the 'lock' table with a timeout of N minutes. Other users will see a message if they try to edit a record with an active lock. You could also implement a keep-alive for long forms by renewing the lease on any postback of the page, or even with an AJAX timer. There is also no reason why you couldn't back this up with the standard optimistic lock mentioned above.
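A rough sketch of that leasing idea against a lock table; all table, column and method names are invented, and a real implementation would also want a unique constraint on RecordId to close the race between the existence check and the insert:

using System;
using System.Data.SqlClient;

public static class RecordLease
{
    // Returns true if the lease was acquired, false if someone else holds it.
    public static bool TryAcquire(string connStr, Guid recordId, string user, int minutes)
    {
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(
            "INSERT INTO dbo.RecordLocks (RecordId, LockedBy, ExpiresAt) " +
            "SELECT @Id, @User, DATEADD(MINUTE, @Min, GETUTCDATE()) " +
            "WHERE NOT EXISTS (SELECT 1 FROM dbo.RecordLocks " +
            "  WHERE RecordId = @Id AND ExpiresAt > GETUTCDATE())", conn))
        {
            cmd.Parameters.AddWithValue("@Id", recordId);
            cmd.Parameters.AddWithValue("@User", user);
            cmd.Parameters.AddWithValue("@Min", minutes);
            conn.Open();
            return cmd.ExecuteNonQuery() == 1; // 0 rows: an active lease exists
        }
    }

    // Call from a postback or AJAX timer to implement the keep-alive.
    public static void Renew(string connStr, Guid recordId, string user, int minutes)
    {
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(
            "UPDATE dbo.RecordLocks " +
            "SET ExpiresAt = DATEADD(MINUTE, @Min, GETUTCDATE()) " +
            "WHERE RecordId = @Id AND LockedBy = @User", conn))
        {
            cmd.Parameters.AddWithValue("@Id", recordId);
            cmd.Parameters.AddWithValue("@User", user);
            cmd.Parameters.AddWithValue("@Min", minutes);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}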
Many apps will need a combination of both approaches.
Here's a simple solution to many people working on the same records (a code sketch follows below).
When you load the data, get the last-changed date; we use a LastChgDate column on our tables.
When you save (update) the data, add "AND LastChgDate = @previouslyLoadedLastChgDate" to the WHERE clause. If the row count is 0 on the update, raise an error saying "someone else has already saved this data" and roll back everything; otherwise the data is saved.
I generally apply the above logic to header tables only and not to the detail tables, since they are all in one transaction.
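A minimal sketch of that check in ADO.NET; apart from the LastChgDate column, the table and column names are invented:

using System;
using System.Data;
using System.Data.SqlClient;

public static class OrderWriter
{
    public static void Save(string connStr, int orderId, decimal total,
                            DateTime previouslyLoadedLastChgDate)
    {
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(
            "UPDATE dbo.OrderHeader " +
            "SET Total = @Total, LastChgDate = GETDATE() " +
            "WHERE OrderId = @OrderId " +
            "  AND LastChgDate = @PreviouslyLoadedLastChgDate", conn))
        {
            cmd.Parameters.AddWithValue("@Total", total);
            cmd.Parameters.AddWithValue("@OrderId", orderId);
            cmd.Parameters.AddWithValue("@PreviouslyLoadedLastChgDate",
                                        previouslyLoadedLastChgDate);

            conn.Open();
            if (cmd.ExecuteNonQuery() == 0)
            {
                // Row count 0: the row changed since it was loaded. The caller
                // should roll back everything in the enclosing transaction.
                throw new DBConcurrencyException(
                    "Someone else has already saved this data.");
            }
        }
    }
}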
I assume you're experiencing the 'lost update' problem.
To counter this, as a rule of thumb I use pessimistic locking when the chances of a collision are high (or transactions are short-lived), and optimistic locking when the chances of a collision are low (or transactions are long-lived, or your business rules encompass multiple transactions).
You really need to see what applies to your situation and make a judgment call.
