How can I make sure that a file uploaded through SFTP (on a Linux-based system) stays locked during the transfer so an automated system will not read it?
Is there an option on the client side? Or server side?
The SFTP protocol supports locking since version 5; see the specification.
You didn't specify what SFTP server you are using, so I'm assuming the most widespread one, OpenSSH. OpenSSH supports SFTP version 3 only, so it does not support locking.
Anyway, even if your server supported file locking, most SFTP clients/libraries don't support SFTP version 5, and even those that do often don't support the locking feature. Note that the lock is explicit; the client has to request it.
There are some common workarounds for the problem:
As suggested by @user1717259, you can have the client upload a "done" file once an upload finishes. Make your automated system wait for the "done" file to appear.
You can have a dedicated "upload" folder and have the client (atomically) move the uploaded file to a "done" folder. Make your automated system look at the "done" folder only.
Have a naming convention for files that are still being uploaded (e.g. ".filepart") and have the client (atomically) rename the file to its final name once the upload finishes. Make your automated system ignore the ".filepart" files (see the sketch after this list).
See (my) article Locking files while uploading / Upload to temporary file name for an example of implementing this approach.
Also, some SFTP servers have this functionality built in. For example, ProFTPD with its HiddenStores directive (courtesy of @fakedad).
A gross hack is to periodically check the file attributes (size and timestamp) and consider the upload finished if the attributes have not changed for some time interval.
You can also make use of the fact that some file formats have a clear end-of-file marker (like XML or ZIP), so you can tell when you have downloaded an incomplete file.
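For illustration, here is a minimal sketch of the ".filepart" rename approach on the client side, assuming the third-party SSH.NET library (Renci.SshNet); the host, credentials and paths are placeholders:

using System.IO;
using Renci.SshNet;

class FilepartUpload
{
    static void Main()
    {
        using var client = new SftpClient("example.com", "user", "password");
        client.Connect();

        // Upload under a temporary name that the automated system ignores.
        using (var local = File.OpenRead(@"C:\outgoing\data.csv"))
            client.UploadFile(local, "/incoming/data.csv.filepart");

        // An atomic server-side rename to the final name marks the upload as complete.
        client.RenameFile("/incoming/data.csv.filepart", "/incoming/data.csv");

        client.Disconnect();
    }
}

The same pattern works for the "upload folder" variant; only the rename target changes.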
A typical way of solving this problem is to upload your real file, and then to upload an empty 'done.txt' file.
The automated system should wait for the appearance of the 'done' file before trying to read the real file.
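On the automated-system side, the wait can be as simple as polling for the marker file. A rough sketch (the paths and the interval are arbitrary):

using System;
using System.IO;
using System.Threading;

class WaitForDoneFile
{
    static void Main()
    {
        const string dataFile = @"C:\incoming\data.csv"; // the real file
        const string doneFile = @"C:\incoming\done.txt"; // uploaded last

        // Block until the uploader has created the marker file.
        while (!File.Exists(doneFile))
            Thread.Sleep(TimeSpan.FromSeconds(5));

        string contents = File.ReadAllText(dataFile); // safe: the upload is finished
        File.Delete(doneFile);                        // reset for the next upload
        Console.WriteLine($"Read {contents.Length} characters.");
    }
}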
A simple file locking mechanism for SFTP is to first upload the file to a directory (folder) where the reading process isn't looking. You can create an alternate folder using the sftp> mkdir command. Upload the file to that alternate directory instead of the ultimate destination directory. Once the sftp> put command completes, move the file with a server-side rename:
sftp> rename alternate_path/filename destination_path/filename
Since the rename only changes the directory entry rather than copying the file data, it is atomic (as long as both directories are on the same filesystem), so it works as an effective lock.
Related
An ASP.NET application (running on a Windows server with IIS 7) has to transfer large files, uploaded by the current user, to an external SFTP server. Because of the file size, the idea is to do this asynchronously.
The idea is that the ASP.NET application stores the uploaded file in a local directory on the Windows server, and the current user can continue working. A Windows service or a Quartz job (other tools(*)/ideas?) is then responsible for transferring the file to the external SFTP server.
(*) Are there existing tools that listen for changes in a Windows directory and then move the files to an SFTP server (including handling of communication errors/retries)?
If there is no existing solution, have you had similar requirements? What do we have to consider? Because the connection to the SFTP server is not very stable, we need robust error handling with automatic retries.
To watch for changes in a local directory in .NET, use the FileSystemWatcher class.
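A bare-bones sketch of the watcher side (the folder path is hypothetical); the handler is where you would queue the file for the SFTP transfer:

using System;
using System.IO;

class FolderWatcher
{
    static void Main()
    {
        var watcher = new FileSystemWatcher(@"C:\local_folder_to_watch")
        {
            IncludeSubdirectories = false,
            EnableRaisingEvents = true
        };

        // Fires as soon as the file is created; the writer may still be writing
        // at this point, so wait or retry before transferring it.
        watcher.Created += (sender, e) =>
            Console.WriteLine($"New file detected: {e.FullPath}");

        Console.WriteLine("Watching. Press Enter to exit.");
        Console.ReadLine();
    }
}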
If you are looking for an out of the box solution, use the keepuptodate command in WinSCP scripting.
A simple example of WinSCP script (e.g. watch.txt):
open sftp://username:password@host/
keepuptodate c:\local_folder_to_watch /remote_folder
exit
Run the script like:
winscp.com /script=watch.txt
Though this works only if the uploaded files are preserved in the remote folder.
(I'm the author of WinSCP)
There are similar questions on Stack Overflow, but none of them fulfills my requirements.
I want that, when a user moves a file to a folder on their desktop, that file is uploaded to a web server (i.e. a one-way Dropbox-like feature).
Technically, I want a listener that can detect when a file is dropped into a folder and trigger an upload function.
P.S. I would prefer code or resources in .NET.
You can use a FileSystemWatcher to watch the folder.
You should use a FileSystemWatcher to monitor the folder, and then upload the changed files using whatever upload mechanism your web server provides.
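As a rough sketch, assuming the web server exposes an HTTP endpoint that accepts file uploads (the folder and URL below are placeholders), the watcher and the upload can be wired together like this:

using System;
using System.IO;
using System.Net;

class DropFolderUploader
{
    static void Main()
    {
        var watcher = new FileSystemWatcher(@"C:\Users\me\Desktop\Upload")
        {
            EnableRaisingEvents = true
        };

        watcher.Created += (sender, e) =>
        {
            // The file may still be locked by the copy; real code should wait or retry here.
            using var client = new WebClient();
            client.UploadFile("https://example.com/upload", e.FullPath);
            Console.WriteLine($"Uploaded {e.Name}");
        };

        Console.WriteLine("Drop files into the folder. Press Enter to exit.");
        Console.ReadLine();
    }
}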
I'm writing a multi-threaded console application which downloads PDF files from the web and copies them locally onto our content server (a Windows server). This is also the location from which the files are served to our website.
I am skeptical about this approach because of concurrency issues: if a user on the web site requests a PDF file from the content server at the same time as the file is being written or updated by the console application, there might be an IOException. (The application also updates the PDF files if the original contents change over time.)
Is there a way to control the concurrency issue?
You probably want your operations on creating and updating the files where they are served to be atomic, so that any other processes dealing with those files get the correct version, not the one that is still open for writing.
Instead of actually writing the files to where they will be served, you could write them to a temporary directory and then move them into the directory where they will be served from.
Similarly, when your application updates those PDFs, make sure the served files are not touched until writing has finished. You could test this by making your application sleep after it has started writing to the file, for example.
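One way to make the publish step effectively atomic from the readers' point of view is to write to a temporary file on the same volume and then swap it into place. A sketch (paths are hypothetical; on Windows the swap can still fail if a reader holds the destination open without sharing, so retry logic may be needed):

using System.IO;

class AtomicPublish
{
    // Writes the new PDF next to the served file and then swaps it into place,
    // so readers only ever see a complete file (the old version or the new one).
    public static void Publish(byte[] pdfBytes, string servedPath)
    {
        string tempPath = servedPath + ".tmp";
        File.WriteAllBytes(tempPath, pdfBytes);

        if (File.Exists(servedPath))
            File.Replace(tempPath, servedPath, destinationBackupFileName: null);
        else
            File.Move(tempPath, servedPath);
    }
}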
The details depend on which web server software you are using, but the key to this problem is to give each version of the file a different name. The same URL, mind you, but a different name on the underlying file system.
Once a newer version of the file is ready, change the web server's configuration so that the URL points to the new file. In any reasonably functional web server this should be an atomic operation.
If the web server doesn't have built-in support for this, you could serve the files via a custom server-side script.
Mark the files hidden until the copy or update is complete.
I need to write a utility in C#. The utility has to invoke a web service once a file has been uploaded via FTP. The files are text files (so they don't have an end-of-file marker, and they can be pretty big).
The FTP server is the built-in FTP server in Windows.
My question is: how do I determine whether the file upload has completed (so that I can call the web service and tell it about the file)? If I don't wait to find out that the file has been fully uploaded, I might end up notifying the web service prematurely (especially for really large files).
Have your process upload the file to a temporary directory and execute a move command to the destination directory.
This way you know any files in your destination directory are complete.
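A variant of the same idea in C#, assuming the uploading process is your own code talking to the Windows FTP server (server, credentials and paths are placeholders): upload under a temporary name (or into a temporary directory), then issue an FTP rename so the watcher only ever sees complete files.

using System;
using System.Net;

class FtpUploadThenRename
{
    static void Main()
    {
        var credentials = new NetworkCredential("user", "password");

        // 1. Upload under a temporary name the watcher ignores.
        using (var client = new WebClient { Credentials = credentials })
            client.UploadFile("ftp://server/incoming/report.txt.part",
                              @"C:\outgoing\report.txt");

        // 2. Rename to the final name; the watcher treats the appearance of the
        //    final name as "upload complete" and can then notify the web service.
        var rename = (FtpWebRequest)WebRequest.Create("ftp://server/incoming/report.txt.part");
        rename.Method = WebRequestMethods.Ftp.Rename;
        rename.Credentials = credentials;
        rename.RenameTo = "report.txt";
        using var response = (FtpWebResponse)rename.GetResponse();
        Console.WriteLine(response.StatusDescription);
    }
}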
I want to write a small C# FTP client class library which basically needs to transfer files to an FTP location.
What I want is 100% foolproof code, where I get some sort of acknowledgement that the FTP file transfer has either been 100% successful or has failed.
No resume support is required.
Good to have (but secondary):
Some sort of distributed transaction where, only if the file transfer is successful for a file, I update my DB for that particular file with 1 (true); if it fails, the DB is updated with 0 (false).
But suppose the FTP file transfer was successful and, for whatever reason, the DB could not be updated; then the file on the FTP server should be deleted. I can easily do this with dirty C# code (where I manually try to delete the file if the DB update fails).
But what I am really looking for is a file-system-based transaction over FTP, so that neither the file transfer nor the DB update is committed until both succeed (hence no need for a manual delete).
Any clues?
Having had the "joy" of writing an FTP library myself, here is my advice:
1) It's NOT going to be easy, because different FTP servers return different responses to the same command (directory listings, regular FTP commands, and pretty much everything else).
2) This is going to take more time than you think.
3) The dream of a 100% foolproof transfer is not going to happen unless you control the FTP server and add a new FTP command so that you can compare file hashes.
If I were doing this again, and my goal was to transfer files (and not to learn from writing the library), I would buy an already finished library.
.NET has an FTP client you can use. I don't know how robust it is in the face of FTP server quirks; you'll have to test it against your customer's FTP server. As for verifying that the upload was successful, the only tools you have are (1) making sure there was no transport error during the upload, (2) validating the file size when you're done.
The FTP server isn't going to support transactions, so you're going to have to manage that yourself, but this isn't really a complicated scenario. Use a transaction for the DB update; backing out the FTP upload is one call.
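A rough sketch of that compensation pattern (the connection string, table, server and paths are all hypothetical, and error handling is kept minimal):

using System;
using System.Data.SqlClient;
using System.Net;

class UploadWithDbCompensation
{
    static void Main()
    {
        var credentials = new NetworkCredential("user", "password");
        const string remoteUri = "ftp://server/files/data.txt";

        // FTP upload first.
        using (var ftp = new WebClient { Credentials = credentials })
            ftp.UploadFile(remoteUri, @"C:\outgoing\data.txt");

        try
        {
            // DB update inside a transaction.
            using var connection = new SqlConnection("connection-string-here");
            connection.Open();
            using var transaction = connection.BeginTransaction();
            using var command = new SqlCommand(
                "UPDATE Files SET Transferred = 1 WHERE Name = @name", connection, transaction);
            command.Parameters.AddWithValue("@name", "data.txt");
            command.ExecuteNonQuery();
            transaction.Commit();
        }
        catch
        {
            // The DB update failed: back out the FTP upload so the two stay consistent.
            var delete = (FtpWebRequest)WebRequest.Create(remoteUri);
            delete.Method = WebRequestMethods.Ftp.DeleteFile;
            delete.Credentials = credentials;
            using var _ = delete.GetResponse();
            throw;
        }
    }
}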
Try using FTP with WCF.