I have a network server which has a share that exposes a set of files. These files are consumed by processes that are running on multiple servers, sometimes several processes on the same machine.
The set of files is updated a couple of times a day, and the fileset is fairly large.
We are attempting to reduce the bandwidth these processes use to retrieve the fileset by having processes on the same machine share a single copy.
To do this, we want each process on a machine to coordinate with the other processes that need the same files, so that only one of them downloads the files and all of them share the fileset once the download is complete.
Additionally, we need to prevent the server from performing an update on the fileset while a download is in progress.
To facilitate this requirement, I created a file lock class. This class opens a file called .lock in the specified location. The file is opened with read/write access so that no other process can do the same, regardless of which machine that process is running on. The open is wrapped in a try/catch so that if the file is already locked, the exception is caught and the lock is not acquired. This part already works correctly.
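Roughly, the lock class looks like this (simplified sketch; names are illustrative):

using System;
using System.IO;

// Holding the FileStream open with FileShare.None is what keeps every other
// process out, including processes on other machines accessing the share.
public sealed class FileLock : IDisposable
{
    private FileStream _stream;

    public bool TryAcquire(string directory)
    {
        try
        {
            _stream = new FileStream(
                Path.Combine(directory, ".lock"),
                FileMode.OpenOrCreate,
                FileAccess.ReadWrite,
                FileShare.None);
            return true;
        }
        catch (IOException)
        {
            // Another process already holds the lock.
            return false;
        }
    }

    public void Dispose() => _stream?.Dispose();
}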
The problem I am trying to solve is that if a process hangs while holding the lock, all the other processes will fail to sync these files indefinitely because they can never acquire the lock.
One solution we explored today was a multi-lock setup: each lock file would have a GUID in its name, and instead of fighting over a single hard lock, as many locks as requested could be acquired. Each process would then be responsible for making sure only one lock exists before it begins a download. That way, if a process holding a lock hangs, its lock can be considered expired after a certain time limit, and nothing prevents a new process from requesting a lock alongside the hung one.
The problem here is that the creation of these multi-locks needs to be synchronized between processes, or else there is a race condition between creating a lock and checking the lock count.
I don't see a way to synchronize this without reintroducing a hard locking mechanism like the first solution, but then we are right back where we started: a hung process blocks the others from downloading.
Any suggestions?
A common way to tackle this is to use some sort of shareable lock file, with the real locking logic performed via its content. For example, consider an SQLite database file with a single table as the lock file, something like:
CREATE TABLE lock (
id INTEGER PRIMARY KEY AUTOINCREMENT,
host TEXT,
pid INTEGER,
expires INTEGER
)
A consumer (or the producer, for an update to the fileset) requests a lock by INSERTing a row into the table.
A process heartbeats by UPDATEing its own row, pushing the expiry forward so its lock never expires while the process is alive.
Expired rows are discarded: crashed processes stop updating, so their locks are eventually discarded.
The lowest id holds the lock.
Processes on the same host can evaluate the host field to find out whether another process on the same host already intends to copy, making a second copy request unnecessary.
Of course this can be done via a database server (or in fact a locking server) instead of a database file if that is feasible, but the SQLite method has the advantage of requiring nothing more than file access.
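A rough sketch of the statements involved (the values and the 60-second expiry window are illustrative):

-- Request a lock: expires is a Unix timestamp a short interval in the future
INSERT INTO lock (host, pid, expires) VALUES ('host-a', 1234, strftime('%s','now') + 60);

-- Heartbeat: push the expiry forward while the process is still alive
UPDATE lock SET expires = strftime('%s','now') + 60 WHERE host = 'host-a' AND pid = 1234;

-- Discard expired rows left behind by crashed processes
DELETE FROM lock WHERE expires < strftime('%s','now');

-- The lowest surviving id holds the lock
SELECT id, host, pid FROM lock ORDER BY id LIMIT 1;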
The trick here is good use of caching.
The designated "download" process that updates the fileset should first grab the fileset from the remote location and store it in a temp file. Then it simply keeps attempting to acquire a read/write lock on the local file(s) you want to replace. When it succeeds, it does the swap and drops the lock. This part should go very quickly.
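A sketch of the swap step (the remote fetch and the paths are placeholders for however you pull the fileset):

using System.IO;
using System.Threading;

class FileSwapper
{
    public static void UpdateLocalFile(string remotePath, string localPath)
    {
        string temp = Path.GetTempFileName();
        File.Copy(remotePath, temp, true);          // the slow part: pull the remote file first

        while (true)
        {
            try
            {
                // Exclusive open means no reader has the old file; overwrite while holding the handle.
                using (var target = new FileStream(localPath, FileMode.Create,
                                                   FileAccess.ReadWrite, FileShare.None))
                using (var source = File.OpenRead(temp))
                {
                    source.CopyTo(target);
                }
                File.Delete(temp);
                return;
            }
            catch (IOException)
            {
                Thread.Sleep(100);                  // a reader still has it open; retry shortly
            }
        }
    }
}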
Also, a simple file copy on a local drive is quite unlikely to "hang", meaning the other dependent processes will be able to continue functioning regardless of what happens with this one.
To make sure the downloading process is functioning correctly, you'll need a monitoring program that pings the download process every so often to ensure it's responsive. If it's not, alert someone.
We have a process that needs to run every so often against a DB used by a web app, and we need to prevent all other updates during this process's execution. Is there any global way to do this, perhaps through NHibernate, .NET, or directly in Oracle?
The original idea was to have a one-record DB table to indicate whether the process is running, but with that we would need to go back to every single save/update method and add a check for this record before the save/update call.
My reaction to that kind of requirement is to review the design, as it is highly unusual outside of application upgrades. Other than that, there are a couple of options:
Shut down the DB, open it in exclusive mode, make the changes, and then open it up for everyone again.
Attempt to lock all the required tables with LOCK TABLE, as sketched below. That might generate deadlock exceptions depending on the order in which the locks are taken.
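A minimal sketch of the second option (the table names are hypothetical; NOWAIT makes the statement fail immediately instead of blocking):

LOCK TABLE orders IN EXCLUSIVE MODE NOWAIT;
LOCK TABLE order_items IN EXCLUSIVE MODE NOWAIT;
-- perform the maintenance work, then COMMIT (or ROLLBACK) to release the locks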
I'm still relatively new to TPL Dataflow, and not 100% sure whether I am using it correctly or if I'm even supposed to use it here.
I'm trying to employ this library to help out with file copying and file uploading.
Basically the structure/process of handling files in our application is as follows:
1) The end user will select a bunch of files from their disk and choose to import them into our system.
2) Some files have higher priority, while the others can complete at their own pace.
3) When a bunch of files is imported, here is the process:
Queue these import requests; one request maps to one file.
These requests are stored in a local SQLite DB.
Each request also explicitly indicates whether it demands higher priority or not.
We currently have two active threads running (one to manage higher priority and one for lower).
They sit in a waiting state until signalled.
When new requests come in, the threads get signalled to dig into the local DB and process the requests.
Both threads are responsible for copying the file to a separate cached location, so it's just a simple File.Copy call. The difference is that one thread does the File.Copy call immediately, while the other enqueues the copies onto the ThreadPool to run.
Once the files are copied, the request gets updated; the request has a Status enum property with states like Copying, Copied, etc.
The request also needs a ServerTimestamp to be set. The ServerTimestamp is important because a user may save changes to what is essentially the same file in different versions, so the order matters.
Another separate thread gets signalled to fetch requests from the local DB whose status is Copied. It then pings an endpoint to ask for a ServerTimestamp and updates the request with it.
Lastly, once the file copy is complete and the server timestamp is set, we can upload the physical file to the server.
So I'm toying around with using TransformBlocks:
1- File Copy TransformBlock
I'm thinking there could be two File Copy TransformBlocks, one for higher priority and one for lower priority.
My understanding is that a block uses TaskScheduler.Current, which uses the ThreadPool behind the scenes. I was thinking maybe a custom TaskScheduler that spawns a new thread on the fly; this scheduler could be used for the higher-priority file copy block.
2- ServerTimestamp TransformBlock
This one will be linked to the first block, take in all the copied files, get the server timestamp, and set it in the request.
3- UploadFile TransformBlock
This will upload the file.
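A rough sketch of the pipeline I have in mind (FileRequest and the stubbed calls are placeholders for our real types and services):

using System;
using System.IO;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Placeholder standing in for our real import request.
class FileRequest
{
    public string SourcePath { get; set; }
    public string CachePath { get; set; }
    public string Status { get; set; }
    public long ServerTimestamp { get; set; }
}

class ImportPipeline
{
    // Stubs so the sketch compiles; the real calls hit our endpoint and upload service.
    static Task<long> GetServerTimestampAsync(FileRequest r) => Task.FromResult(DateTime.UtcNow.Ticks);
    static Task UploadAsync(FileRequest r) => Task.CompletedTask;

    public static ITargetBlock<FileRequest> Build()
    {
        var copyBlock = new TransformBlock<FileRequest, FileRequest>(req =>
        {
            File.Copy(req.SourcePath, req.CachePath, true);  // copy into the cache location
            req.Status = "Copied";
            return req;
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });

        var timestampBlock = new TransformBlock<FileRequest, FileRequest>(async req =>
        {
            req.ServerTimestamp = await GetServerTimestampAsync(req);  // ask the server for a timestamp
            return req;
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });  // serialize to keep version order

        var uploadBlock = new ActionBlock<FileRequest>(req => UploadAsync(req));

        copyBlock.LinkTo(timestampBlock, new DataflowLinkOptions { PropagateCompletion = true });
        timestampBlock.LinkTo(uploadBlock, new DataflowLinkOptions { PropagateCompletion = true });
        return copyBlock;  // Post/SendAsync requests into this block
    }
}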
Problems I'm facing:
Say for example we have 5 file requests enqueued in the local db.
File1
File2
File3-v1
File3-v2
File3-v3
We Post/SendAsync all 5 requests to the first block.
If File1, File2, File3-v1, and File3-v3 succeed but File3-v2 fails, I kind of want the block not to flow on to the next ServerTimestamp block, because it's important that the File3 versions are completely copied before proceeding, or else they will go out of order.
But that leads to the question of how it would retry correctly while still letting the other four files that were already copied move on with it to the next block.
I'm not sure if I am structuring this correctly or whether TPL Dataflow even supports my use case.
I want to rename a database file, and even though I use using with the connection, every time I have to call:
FirebirdSql.Data.FirebirdClient.FbConnection.ClearAllPools();
The problem is that this method doesn't block the thread, and I don't know how to check whether all connections are cleared, because if I read the value of:
FirebirdSql.Data.FirebirdClient.FbConnection.ConnectionPoolsCount
It is zero immediately after the method call, but I am still not able to rename the database file. If I add some timeout after the method (I tried 1 s), the file is not locked and I can rename it. The problem is that this timeout will almost certainly differ between machines.
AFAIK the only other way to check whether the file is unlocked is to try the rename in a loop with some timeout, but then I cannot be sure whether the lock is held by connections from my application or by something else.
So is there a better way to wait until this method has cleared the connections?
Making it an answer for the sake of formatting lists.
@Artholl, you cannot safely rely upon your own disconnection, for a bunch of reasons.
There may be other programs connected, not only your own running program. And unless you connect as SYSDBA, the database creator, or with the RDB$ADMIN role, you cannot query whether there are other connections right now. You can, however, query MON$ATTACHMENTS for the connections made with the same user as your CURRENT_CONNECTION, which might help you check the state of your application's own pool; there is just little practical value in it.
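For example, a query along these lines lists the attachments made by the same user (the MON$ tables exist from Firebird 2.1 onwards):

SELECT a.MON$ATTACHMENT_ID, a.MON$REMOTE_PROCESS, a.MON$TIMESTAMP
FROM MON$ATTACHMENTS a
WHERE a.MON$USER = (SELECT b.MON$USER
                    FROM MON$ATTACHMENTS b
                    WHERE b.MON$ATTACHMENT_ID = CURRENT_CONNECTION);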
In Firebird 3, in SuperServer mode, there is the LINGER parameter: the server keeps the database open for some time after the last client disconnects, expecting that if some new client decides to connect again, the page cache for the DB file is already in place. Think of moderately loaded web servers.
Even in Firebird 2, every open database has some caches, and how large they are is installation-specific (firebird.conf) and database-specific (gfix/gstat). After the engine sees that all clients have disconnected and decides the database should be closed, it starts by flushing its caches and asking the OS to flush its caches too (there is no general, hardware-independent way to make RAID controllers and the disks themselves flush their caches, or Firebird would try that as well). By default the Firebird caches are small, and pushing them down to the hardware layer should be fast, but it is still not instant.
Even if you checked that all other clients have disconnected, then disconnected yourself, and then correctly guessed how long to wait for linger and the caches, you would still not be safe: you are subject to race conditions. At the very moment you start doing something that requires exclusive ownership of the DB, some new client may concurrently open a new connection.
So the correct approach is not merely to prove there is no database connection right now, but to ensure there cannot be any new connection in the future, until you re-enable them.
So, as Mark said above, you have to use the shutdown methods to bring the database into a no-connections-allowed state, and after you are done with the file renaming and other manipulations, switch it back to normal mode.
https://www.firebirdsql.org/file/documentation/reference_manuals/user_manuals/html/gfix-dbstartstop.html
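With the gfix tool described there, the shutdown/online cycle looks roughly like this (path and credentials are just examples):

gfix -shut full -force 0 localhost:C:\data\mydb.fdb -user SYSDBA -password masterkey
(rename or otherwise manipulate the file while no attachments are allowed)
gfix -online normal localhost:C:\data\mydb.fdb -user SYSDBA -password masterkey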
If I was responsible for maintaining the firebird provider, I wouldn't want users to rely on such functionality.
Other applications could have the file open (you're only in control of connection pools in the current AppDomain), and the server might be running some kind of maintenance on the database.
So even if you can wait for the pools to be cleared, I'd argue that if you really really have to mess with these files, a more robust solution is to stop the firebird service instead (and wait for it to have fully stopped).
I currently have a C# console app where multiple instances run at the same time. The app accesses values in a database and processes them. While a row is being processed it is flagged so that no other instance attempts to process it at the same time. My question is: what is an efficient and graceful way to unflag those values in the event an instance of the program crashes? If an instance crashed, I would only want to unflag the values that were being processed by that instance of the program.
Thanks
The potential solution will depend heavily on how you start the console applications.
In our case, the applications are started based on configuration records in the database. When one of these applications performs a lock, it uses the primary key from the database configuration record to perform the lock.
When the application starts up, the first thing it does is release all locks on the records that it previously locked.
To control all of the child processes, we have a service that uses the information from the configuration tables to start the processes and then keeps an eye on them, restarting them when they fail.
Each of the processes is also responsible for updating a status table in the database with the last time it was available with a maximum allowed delay of 2 minutes (for heavy processing). This status table is used by sysadmins to watch for problems, but it could also be used to manually release locks in case of a repeating failure in a given process.
If you don't have a structured approach like this, it can be very difficult to automatically unlock records unless you have a solid profile of your application's performance, one that lets you say, for example, that any lock over 5 minutes old is invalid because processing a record should only take about 15 seconds on average, with a maximum of 2 minutes.
To be able to handle any kind of crash, even a power failure, I would suggest additionally timestamping records, and after some reasonable timeout treating records as unlocked even if they are still flagged.
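A sketch of that claim step (the table, column names, and the 5-minute cutoff are illustrative; the date arithmetic depends on your database):

UPDATE work_items
SET    locked_by = :instance_id,
       locked_at = CURRENT_TIMESTAMP
WHERE  id = :row_id
  AND (locked_by IS NULL
       OR locked_at < CURRENT_TIMESTAMP - INTERVAL '5' MINUTE);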
So I am creating a customer indexing program for my company, and I have basically everything coded and working, except that I want the indexing program to watch the user-specified indexed directories and update the underlying data store on the fly, to help eliminate the need for frequent full indexing.
I coded everything in WPF/C# with an underlying SQLite database, and I am sure the folder watchers would work well under non-heavy loads. The problem is that we use TortoiseSVN, and when the user does an SVN Update or similar that creates a heavy file load, the FileSystemWatcher and SQLite updates just can't keep up (even with the maximum buffer size). Basically, I am doing a database insert every time a watcher event fires.
So my main question is: does anyone have suggestions on how to implement this file watcher so it can handle such heavy loads?
Some thoughts I had were: (1) create a staging collection for all the queries and use a timer and thread to insert the data later, or (2) write the queries to a file and use a timer thread later for the insert.
Help....
You want to buffer the data received from your file watcher events in memory. When receiving the events from your registered file watchers, accumulate them in memory as fast as possible during a burst of activity. Then, on a separate process or thread, read them from the in-memory buffer and do whatever persistent storage or other time-intensive processing is needed.
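A minimal sketch of that pattern (PersistToDatabase stands in for your SQLite insert):

using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class WatchBuffer
{
    private readonly BlockingCollection<FileSystemEventArgs> _queue =
        new BlockingCollection<FileSystemEventArgs>();

    public void Start(FileSystemWatcher watcher)
    {
        // The event handlers only enqueue, so they return immediately even in a burst.
        watcher.Created += (s, e) => _queue.Add(e);
        watcher.Changed += (s, e) => _queue.Add(e);
        watcher.Deleted += (s, e) => _queue.Add(e);

        // A single consumer drains the queue and does the slow database work.
        Task.Run(() =>
        {
            foreach (var e in _queue.GetConsumingEnumerable())
                PersistToDatabase(e);
        });
    }

    private void PersistToDatabase(FileSystemEventArgs e)
    {
        // SQLite insert goes here.
    }
}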
You can use a queue to buffer all the requests. I have had good experience with the MS MessageQueue, which comes out of the box and is quite easy to use.
see http://www.c-sharpcorner.com/UploadFile/rajkpt/101262007012217AM/1.aspx
Then have a separate worker thread that grabs a predefined number of elements from the queue and inserts them into the database. Here I'd suggest merging the single inserts into a bulk insert.
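A sketch of the bulk insert against SQLite (the table and column are illustrative; assumes the System.Data.SQLite provider). Wrapping the whole batch in one transaction avoids paying for a disk sync per row:

using System.Collections.Generic;
using System.Data;
using System.Data.SQLite;

static class Indexer
{
    public static void BulkInsert(SQLiteConnection conn, IEnumerable<string> paths)
    {
        using (var tx = conn.BeginTransaction())
        using (var cmd = new SQLiteCommand(
            "INSERT INTO indexed_files (path) VALUES (@path)", conn, tx))
        {
            var p = cmd.Parameters.Add("@path", DbType.String);
            foreach (var path in paths)
            {
                p.Value = path;
                cmd.ExecuteNonQuery();   // all rows commit together below
            }
            tx.Commit();
        }
    }
}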
If you want to be 100% sure, you can check CPU and I/O load before making the inserts.
Here is a code snippet for reading the CPU time used by the current process (per processor):
Process.GetCurrentProcess().TotalProcessorTime.TotalMilliseconds / Environment.ProcessorCount
The easiest approach is to have each update kick off a timer (say, one minute). If another update comes in in the meantime, you queue the change and restart the timer. Only when a minute has gone by without activity do you start processing.
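A small sketch of that debounce (the one-minute window is just the example value from above):

using System;
using System.Threading;

class Debouncer
{
    private readonly Timer _timer;

    public Debouncer(Action process)
    {
        // Created disarmed; Signal() arms it.
        _timer = new Timer(_ => process(), null, Timeout.Infinite, Timeout.Infinite);
    }

    public void Signal()
    {
        // Restart the countdown: fire once, one minute after the most recent event.
        _timer.Change(TimeSpan.FromMinutes(1), Timeout.InfiniteTimeSpan);
    }
}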