C# find when file was uploaded on FTP

I have a job that periodically "looks" at an FTP server to see whether any new files have been uploaded. Once it finds any, it downloads them.
The question is how, using C#, to extract the time when a file was actually uploaded to the FTP server.
Thank you. I still can't figure out how to extract the time when the file was uploaded to FTP, as opposed to modified, since the following returns the file's modification time:
fileInfo = session.GetFileInfo(FileFullPath);
dateUploaded = fileInfo.LastWriteTime;
Please advise on some sample code that could be integrated into my current solution:
using (Session session = new Session())
{
    string FileFullPath =
        Dts.Variables["User::FTP_FileFullPath"].Value.ToString();
    session.Open(sessionOptions);
    DateTime dateTime = DateTime.Now;
    session.MoveFile(FileFullPath, newFTPFullPath);

    // Download the renamed file; keep the remote copy (remove: false)
    TransferOperationResult transferResult = session.GetFiles(
        newFTPFullPath,
        Dts.Variables["User::Local_DownloadFolder"].Value.ToString(),
        false);
    transferResult.Check(); // Throw on any transfer error

    Dts.Variables["User::FTP_FileProcessDate"].Value = dateTime;
}

You might not be able to, unless you know the FTP server reliably sets the file create/modified date to the date it was uploaded. Do some test uploads and see. If it works out for you on this particular server, great: keep a note of when you last visited and retrieve files with a greater date. By way of an example, a test upload to an Azure FTP server just now (probably derived from Microsoft IIS) did indeed set the file's time to the datetime it was uploaded. Beware that the file time listed by the server might not be in the same timezone as you, nor will it carry any timezone information; it could simply be some number of hours out relative to your current time.
To get the date itself you'll need to parse the response the server gives you when you list the remote directory. If you're using an FTP library for C# (edit: you're using WinSCP), that might already be handled for you (edit: it is, see https://winscp.net/eng/docs/library_session_listdirectory and https://winscp.net/eng/docs/library_remotefileinfo). Unless things have improved recently, the default FTP provision in .NET isn't great; it's intended more for basic file retrieval than for complex syncing, so I'd definitely look at using a capable library (and we don't do software recommendations here, sorry, so I can't recommend one) if you're scrutinizing the date info offered.
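As a rough sketch of reading remote timestamps with the WinSCP .NET assembly (the remote path here is a placeholder, and LastWriteTime is simply whatever timestamp the server reports):
RemoteDirectoryInfo directory = session.ListDirectory("/upload");
foreach (RemoteFileInfo fileInfo in directory.Files)
{
    if (!fileInfo.IsDirectory)
    {
        Console.WriteLine("{0} - {1}", fileInfo.Name, fileInfo.LastWriteTime);
    }
}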
That said, there's another way to carry out this sync process, one that falls out of what you want to do anyway and doesn't rely on parsing a non-standard list output:
Keep a memory of every file you saw last time and reference it when looking at every file that is there now. This is actually quite easy to do:
Download all the files.
Disconnect.
Go back some time later and download any files that you don't already have.
Keep track of which files you downloaded and do something with them?
You say you want to download them anyway, so just treat any file you don't already have (or maybe one that has a newer date, a different file size, etc.) as one that is new or changed since you last looked (see the sketch below).
It's potentially a big job, depending on how many different servers you want to support.
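By way of illustration, a minimal sketch of that bookkeeping with WinSCP, assuming an open Session like the asker's (the remote path, local folder, and seen.txt manifest are all hypothetical choices; requires the WinSCP, System.IO, and System.Collections.Generic namespaces):
var seen = new HashSet<string>(
    File.Exists("seen.txt") ? File.ReadAllLines("seen.txt") : new string[0]);

RemoteDirectoryInfo listing = session.ListDirectory("/upload");
foreach (RemoteFileInfo file in listing.Files)
{
    if (file.IsDirectory || seen.Contains(file.Name))
        continue; // Already downloaded on an earlier visit

    // New file: download it and remember it for next time
    session.GetFiles(session.EscapeFileMask("/upload/" + file.Name),
                     @"C:\Downloads\").Check();
    seen.Add(file.Name);
}

File.WriteAllLines("seen.txt", seen);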

Related

.NET WinSCP, only download new files

I've created a program that's supposed to run once each night. What it does is that it downloads images from my FTP, compresses them and uploads them back to the FTP. I'm using WinSCP for downloading and uploading files.
Right now I have a filemask applied that makes sure that only images are downloaded, that subdirectories are excluded and most importantly that only files that are modified the last 24 hours are downloaded. Code snippet for this filemask:
DateTime currentDate = DateTime.Now;
string date = currentDate.AddHours(-24).ToString("yyyy-MM-dd");
transferOptions.FileMask = "*.jpg>=" + date + "; *.png>=" + date + "|*/";
Thing is, as I'm about to publish this, I realize that if I run it once per night and it checks whether files were modified in the last 24 hours, it will just keep downloading and compressing the same files, since the modified timestamp keeps increasing with each compression.
To fix this I need to edit the FileMask to only download NEW files, i.e. files that weren't in the folder the last time the program was run. I don't know if you can check a created timestamp in some way, or if I have to do some comparisons. I've been looking through the docs but I haven't found any solution to my specific use case.
Is there anyone experienced in WinSCP that can point me in the right direction?
It doesn't look like WinSCP can access the Created Date of the files.
Unless you can do something to make the files 'different' when you re-upload them (e.g. put them in a different folder), your best option might be:
Forget about using FileMask
Use the WinSCP method EnumerateRemoteFiles to get a list of the files
Loop through them yourself (it's a collection of RemoteFileInfo objects)
You'll probably need to keep a list of 'files already processed' somewhere and compare against that list
Call GetFiles for the specific files that you actually want (a sketch follows below)
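A rough sketch of that loop (the remote folder, work folder, and processed.txt list are placeholders; EnumerateRemoteFiles, RemoteFileInfo, and GetFiles are the WinSCP .NET assembly calls mentioned above, and the asker's session is assumed to be open):
var processed = new HashSet<string>(
    File.Exists("processed.txt") ? File.ReadAllLines("processed.txt") : new string[0]);

// Null mask = all files; EnumerationOptions.None stays out of subdirectories
foreach (RemoteFileInfo file in session.EnumerateRemoteFiles(
             "/images", null, WinSCP.EnumerationOptions.None))
{
    string ext = Path.GetExtension(file.Name).ToLowerInvariant();
    if (ext != ".jpg" && ext != ".png")
        continue; // Images only, as with the original file mask

    if (processed.Contains(file.FullName))
        continue; // Compressed on a previous run

    session.GetFiles(session.EscapeFileMask(file.FullName), @"C:\work\").Check();
    processed.Add(file.FullName);
}

File.WriteAllLines("processed.txt", processed);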
There's a whole article on the WinSCP site: How do I transfer new/modified files only?
To summarize the article:
If you keep the past files locally, just run a synchronization to download only the modified/new ones.
Then iterate the list returned by Session.SynchronizeDirectories to find out what the new files are.
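For example, something along these lines (the local and remote paths are placeholders; SynchronizationResult.Downloads is WinSCP's collection of the transfers it performed):
SynchronizationResult result = session.SynchronizeDirectories(
    SynchronizationMode.Local, @"C:\local\images", "/images",
    removeFiles: false);
result.Check();

// Each entry is a TransferEventArgs describing one downloaded file
foreach (TransferEventArgs download in result.Downloads)
{
    Console.WriteLine("New or updated: " + download.FileName);
}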
Otherwise you have to use a time threshold. Just remember the last time you ran your application and use a time constraint that includes a time of day, not just a date:
string date = lastRun.ToString("yyyy-MM-dd HH:mm:ss");
transferOptions.FileMask = "*.jpg>=" + date + "; *.png>=" + date + "|*/";

Amazon S3, Syncing, Modified date vs. Uploaded Date

We're using the AWS SDK for .NET and I'm trying to pinpoint where we seem to be having a sync problem with our consumer applications. Basically we have a push-service that generates changeset files that get uploaded to S3, and our consumer applications are supposed to download these files and apply them in order to sync up to the correct state, which is not happening.
There are some conflicting views on what and where the correct datestamps are represented. Our consumers were written to look at the S3 file's "LastModified" field to sort the downloaded files for processing, and I no longer know what this field represents. At first I thought it represented the modified/created date of the file we uploaded; then (as seen here) it appeared to actually represent a fresh datestamp of when the file was uploaded; and the same link seems to imply that when a file is downloaded it reverts back to the old datestamp (but I cannot confirm this).
We're using this snippet of code to pull files:
// Get a list of the latest changesets since the last successful full update.
Amazon.S3.AmazonS3Client client = ...;
List<Amazon.S3.Model.S3Object> listObjects = client.GetFullObjectList(
    this.Settings.GetS3ListObjectsRequest(this.Settings.S3ChangesetSubBucket),
    Amazon.S3.AmazonS3Client.DateComparisonType.GreaterThan,
    lastModifiedDate,
    Amazon.S3.AmazonS3Client.StringTokenComparisonType.MustContainAll,
    this.Settings.RequiredChangesetPathTokens);
And then sort by the S3Object's LastModified (which I think is where our assumption is wrong):
foreach (Amazon.S3.Model.S3Object obj in listObjects)
{
    if (DateTime.Parse(obj.LastModified) > lastModifiedDate)
    {
        // It's a new file, so we use insertion sort to put this file
        // in an ordered list based on LastModified.
    }
}
Am I correct in assuming that we should be doing something more to preserve the datestamps we need, such as using custom header/metadata objects to attach the correct datestamps to the files, or even putting them in the filename itself?
EDIT
Perhaps this question can answer my problem: if my service has 2 files to upload to S3 and goes through the process of doing that, am I guaranteed that these files show up in S3 in the order they were uploaded (via LastModified), or does S3 do some amount of asynchronous processing that could lead to my files showing up in a list of S3 objects out of order? I'm worried about a case where, for example, my service uploads file A then file B, B shows up first in S3, my consumers get and process B, then A shows up, and my consumers may or may not get A and incorrectly process it thinking it's newer when it's not.
EDIT 2
It was as I and the person below suspected: we had race conditions when trying to apply changesets in order while blindly relying on S3's datestamps. As an addendum, we ended up making 2 fixes to address the problem, which might be useful for others as well:
Firstly, to address the race condition between when our uploads finish and the modified dates reported by S3, we decided to make all our queries look into the past by 1 second from the last modified date we read from a pulled file in S3. In examining this fix we saw another problem in S3 that wasn't apparent before, namely that S3 does not preserve milliseconds on timestamps, but rather rounds them up to the next second. Looking back in time by 1 second circumvented this.
Secondly, since we were looking back in time, we would have the problem of downloading the same file multiple times if there weren't any new changeset files to download, so we added a filename buffer for files we saw in our last request, skipped any files we had already seen, and refreshed the buffer when we saw new files (a sketch of this pattern follows below).
Hope this helps.
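For anyone wanting a concrete picture, here is a minimal sketch of that look-back-and-skip pattern written against the current AWS SDK for .NET, inside an async method (the bucket name, prefix, and the seenKeys HashSet are placeholders; the question's original code used an older SDK):
// Look one second into the past to tolerate S3's whole-second
// timestamps and the upload-completion race described above
DateTime threshold = lastModifiedDate.AddSeconds(-1);

var request = new Amazon.S3.Model.ListObjectsV2Request
{
    BucketName = "my-bucket",
    Prefix = "changesets/"
};
var response = await client.ListObjectsV2Async(request);

foreach (Amazon.S3.Model.S3Object obj in response.S3Objects)
{
    if (obj.LastModified <= threshold)
        continue; // Older than our window
    if (!seenKeys.Add(obj.Key))
        continue; // Seen in the previous request; skip the duplicate

    // ... download and apply the changeset ...
}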
When listing objects in an S3 bucket, the API response received from S3 will always return them in alphabetical order.
The S3 API does not allow you to filter or sort objects based on the LastModified value. Any such filtering or sorting is done exclusively in the client libraries that you use to connect to S3.
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
As for the accuracy of the LastModified value and its possible use for sorting the list of objects by the time they were uploaded: to my knowledge, the LastModified value is set to the time the upload finishes (when the server returns a 200 OK response), not the time the upload was started.
This means that if you start upload A, which is 100 MB in size, and a second later you start upload B, which is only 1 KB, the last modified timestamp for A will end up after the last modified timestamp for B.
If you need to preserve the time your upload was started, it's best to use a custom metadata header with your original PUT request.
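For instance, with the AWS SDK for .NET, user-defined metadata can be attached on upload and read back later (the bucket, key, and metadata key name are arbitrary choices here):
var put = new Amazon.S3.Model.PutObjectRequest
{
    BucketName = "my-bucket",
    Key = "changesets/0001.bin",
    FilePath = @"C:\out\0001.bin"
};
// Stored by S3 as x-amz-meta-upload-started and preserved with the object
put.Metadata.Add("upload-started", DateTime.UtcNow.ToString("o"));
await client.PutObjectAsync(put);

// A consumer later reads it back:
var meta = await client.GetObjectMetadataAsync("my-bucket", "changesets/0001.bin");
DateTime started = DateTime.Parse(
    meta.Metadata["upload-started"], null,
    System.Globalization.DateTimeStyles.RoundtripKind);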

game client updater

I'm an MMORPG private server dev and I'm looking to create an all-in-one updater for users' clients, because it is very annoying and clumsy to use patches that must be manually downloaded.
I'm new to C#, but I have already succeeded in making my own launcher with my own interface, basic game start/options buttons, and a notice that is read from my webserver.
Now I want to make an integrated update function for it, and I'm pretty lost; I have no idea where to start. This is what it would look like; it's just a concept.
It will have the main button which is used to start the game AND update it. Basically, when you open the program the button would read "UPDATE" and be disabled (while it searches for new updates); if any are found, it would turn into a clickable button, and after the updates are downloaded it would change itself into "start game".
A progress bar for the overall update and another one to show the progress of only the file currently downloading, all with basic info like the percentage and how many files remain to be downloaded.
I need to find a way for the launcher to check the files on the webserver over HTTP and see whether they are the same as the client's or newer, so it doesn't always redownload files that are already the same version, and also a method for the updater to download the update as a compressed archive and automatically extract and overwrite existing files once the download finishes.
NOTE: The files being updated are not .exe, they mostly are textures/config files/maps/images/etc...
I'll sketch a possible architecture for this system. It's incomplete, you should consider it a form of detailed pseudo-C#-code for the first half and a set of hints and suggestions for the second.
I believe you may need two applications for this:
A C# WinForms client.
A C# server-side application, maybe a web service.
I'll not focus on security issues in this answer, but they are obviously very important. I expect that security can be implemented at a higher level, maybe using SSL. The web service would run within IIS, and implementing some form of security should be mainly a matter of configuration.
The server-side part is not strictly required, especially if you don't want compression; there is probably a way to configure your server so that it returns an easily parsable list of files when an HTTP request is made to website.com/updater. However, it is more flexible to have a web service, and probably even easier to implement. You can start by looking at this MSDN article. If you do want compression, you can probably configure the server to transparently compress individual files. I'll try to sketch all the possible variants.
In the case of a single update ZIP file, the updater web service should be able to answer two different requests. First, it can return a list of all game files, relative to the server directory website.com/updater, together with their last write timestamps (the GetUpdateInfo method in the web service). The client compares this list with the local files: some files may no longer exist on the server (so the client should probably delete its local copy), some may not exist on the client (they are entirely new content), and some may exist on both, in which case the client checks the last write time to determine whether it needs the updated version. The client then builds a list of the paths of these files, relative to the game content directory, which should mirror the server's website.com/updater directory.
Second, the client sends this list to the server (GetUpdateURL in the web service). The server would create a ZIP containing the update and reply with its URL.
[ServiceContract]
public interface IUpdater
{
    [OperationContract]
    FileModified[] GetUpdateInfo();

    [OperationContract]
    string GetUpdateURL(string[] files);
}

[DataContract]
public class FileModified
{
    [DataMember]
    public string Path;

    [DataMember]
    public DateTime Modified;
}
public class Updater : IUpdater
{
    public FileModified[] GetUpdateInfo()
    {
        // Get the physical directory
        string updateDir = HostingEnvironment.MapPath("website.com/updater");
        IList<FileModified> updateInfo = new List<FileModified>();
        foreach (string path in Directory.GetFiles(updateDir))
        {
            FileModified fm = new FileModified();
            // Make the path local with respect to updateDir
            fm.Path = System.IO.Path.GetFileName(path);
            fm.Modified = new FileInfo(path).LastWriteTime;
            updateInfo.Add(fm);
        }
        return updateInfo.ToArray();
    }

    public string GetUpdateURL(string[] files)
    {
        // You could use System.IO.Compression.ZipArchive and its
        // method CreateEntryFromFile. You create a ZipArchive by
        // calling ZipFile.Open. The name of the file should probably
        // be unique for the update session, to avoid two concurrent
        // updates from different clients conflicting. You could also
        // cache the ZIP packages you create, so that if a future
        // update requires the exact same files you can return the
        // same ZIP.
        // You have to return the URL of the ZIP, not its local path
        // on the server. There may be several ways to do this, and
        // they tend to depend on the server configuration.
        string urlOfTheUpdate = null; // TODO: build the ZIP and compute its public URL
        return urlOfTheUpdate;
    }
}
The client would download the ZIP file using HttpWebRequest and HttpWebResponse objects. To update the progress bar (you would have only one progress bar in this setup; check my comment on your question) you need to create a BackgroundWorker. This article and this other article cover the relevant aspects (unfortunately the example is written in VB.NET, but it looks very similar to what it would be in C#). To advance the progress bar you need to keep track of how many bytes you have received:
int nTotalRead = 0;
HttpWebRequest theRequest;
HttpWebResponse theResponse;
...
byte[] readBytes = new byte[4096];
int bytesRead = theResponse.GetResponseStream().Read(readBytes, 0, readBytes.Length);
nTotalRead += bytesRead;
// length is the total size, e.g. from theResponse.ContentLength
int percent = (int)((nTotalRead * 100.0) / length);
Once you have received the file, you can use ZipFile.ExtractToDirectory from System.IO.Compression to update your game.
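For example (the paths are placeholders; the overwrite flag needs .NET Core 2.0 or later, otherwise extract to a temporary folder and copy the files over):
// Unpack the downloaded archive over the game content directory
System.IO.Compression.ZipFile.ExtractToDirectory(
    @"C:\game\update.zip", @"C:\game\content", overwriteFiles: true);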
If you don't want to explicitly compress the files with .NET, you can still use the first method of the web service to obtain the list of updated files, and copy the ones you need to the client using an HttpWebRequest/HttpWebResponse pair for each. This way you can actually have two progress bars. The one that counts files will simply be set to a percentage like:
int filesPercent = (int)((nCurrentFile * 100.0) / nTotalFiles);
If you have another way to obtain the list, you don't even need the web service.
If you want to individually compress your files, but you can't have this feature automatically implemented by the server, you should define a web service with this interface:
[ServiceContract]
public interface IUpdater
{
    [OperationContract]
    FileModified[] GetUpdateInfo();

    [OperationContract]
    string CompressFileAndGetURL(string path);
}
With this, the client can ask the server to compress a specific file and return the URL of the compressed single-file archive.
Edit - Important
Especially in the case that your updates are very frequent, you need to pay special attention to time zones.
Edit - An Alternative
I should restate that one of the main issues here is obtaining, from the server, the list of files in the current release; this list should include the last write time of each file. A server like Apache can provide such a list for free; it is usually intended for human consumption, but it is nevertheless easily parsable by a program. I'm sure there must be some script or extension to have that list formatted in an even more machine-friendly way.
There is another way to obtain that list: you could have a text file on the server that, for every game content file, stores its last write time or, maybe even better, a progressive release number. You would compare release numbers instead of dates to check which files you need. This would protect you from time zone issues. In this case, however, you need to maintain a local copy of this list, because files have no such thing as a release number, only a name and a set of dates.
This is a wide and varied question, with several answers that could be called 'right' depending on various implementation requirements. Here are a few ideas...
My approach would be to use System.Security.Cryptography.SHA1 to generate a list of hash codes for each game asset. The updater can then download the list, compare it to the local file system (caching the locally-generated hashes for efficiency) and build a list of new/changed files to be downloaded.
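A minimal sketch of building such a hash list (the tab-separated manifest format and the directory layout are arbitrary choices):
using System;
using System.IO;
using System.Security.Cryptography;

// Writes "relative-path<TAB>hex-sha1" for every file under contentDir
// (contentDir is assumed to have no trailing separator)
static void WriteManifest(string contentDir, string manifestPath)
{
    using (SHA1 sha1 = SHA1.Create())
    using (StreamWriter writer = new StreamWriter(manifestPath))
    {
        foreach (string file in Directory.GetFiles(
                     contentDir, "*", SearchOption.AllDirectories))
        {
            byte[] hash;
            using (FileStream fs = File.OpenRead(file))
                hash = sha1.ComputeHash(fs);

            string relative = file.Substring(contentDir.Length + 1);
            writer.WriteLine("{0}\t{1}", relative,
                BitConverter.ToString(hash).Replace("-", ""));
        }
    }
}
The updater downloads the server's manifest, runs the same hashing locally, and the set difference is the download list.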
If the game data uses archives, the process gets a bit more involved, since you don't want to download a huge archive when only a single small file inside may have been changed. In this case you'd want to hash each file within the archive and provide a method for downloading those contained files, then update the archive using the files you download from the server.
Finally, give some thought to using a Binary Diff/Patch algorithm to reduce the bandwidth requirements by downloading smaller patch files when possible. In this case the client would request a patch that updates from the current version to the latest version of a file, sending the hash of the local file so the server knows which patch to send. This requires you to maintain a stack of patches on the server for each previous version you want to be able to patch from, which might be more than you're interested in.
Here are some links that might be relevant:
SHA1 Class - Microsoft documentation for SHA1 hashing class
SevenZipSharp - using 7Zip in C#
bsdiff.net - a .NET library implementing bsdiff
Oh, and consider using a multi-part downloader to better saturate the available bandwidth at the client end. This results in higher load on the server(s), but can greatly improve the client-side experience.

Periodically finding changed files in a directory using .NET

I'm trying to find the most reliable way of finding new and modified files in a directory using C# and .NET. I'm not looking for a real time solution, I want to check for changes at given times. It could be every 5 minutes or every hour etc.
We have CreationTime and LastWriteTime on the FileInfo object, and this seems to be enough to get new and modified files. But if a file is renamed none of the available dates are changed and the file will be missed if we just look at CreationTime and LastWriteTime.
At the moment I'm maintaining a "snapshot" of the files in the directory, including the time of the last check for changes. This enables me to compare all the files in the directory with the files in the snapshot; if the snapshot is missing a file, it is either new or renamed.
Is this the only way? Or am I missing something? I'm not going to use FileSystemWatcher, as it seems pretty "buggy" and would have to run all the time.
Any suggestions are very welcome.
Merry Christmas!
Use the FileSystemWatcher class; it's the right way to do this. But maybe you could be more specific about the
as it seems pretty "buggy"
EDIT: FileSystemWatcher does support rename events. For example:
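A minimal illustration (the watched path is a placeholder):
using System;
using System.IO;

var watcher = new FileSystemWatcher(@"C:\watched");
watcher.Created += (s, e) => Console.WriteLine("Created: " + e.FullPath);
watcher.Changed += (s, e) => Console.WriteLine("Changed: " + e.FullPath);
watcher.Renamed += (s, e) =>
    Console.WriteLine("Renamed: " + e.OldFullPath + " -> " + e.FullPath);
watcher.EnableRaisingEvents = true; // Start raising events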
The Microsoft Sync Framework has components for synchronising files.
The framework covers all data types and data storage, and the file system component should be more reliable than the FileSystemWatcher. As it says on MSDN:
It can be used to synchronize files and folders in NTFS, FAT, or SMB file systems. The directories to synchronize can be local or remote; they do not have to be of the same file system. An application can use static filters to exclude or include files either by listing them explicitly or by using wildcard characters (such as *.txt). Or the application can set filters that exclude whole subfolders. An application can also register to receive notification of file synchronization progress
I know you really only want to know when files have changed, but given that you've already dismissed the FileSystemWatcher route, this might be the only reliable route (other than doing what you already do in maintaining a snapshot yourself).
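For completeness, a minimal sketch of the snapshot comparison the question describes (the snapshot.txt location and watched directory are arbitrary; rename detection would additionally require comparing the sets of names):
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Compare the directory's current state against a saved snapshot of
// name -> last write time; anything absent or newer is new or changed
var previous = new Dictionary<string, DateTime>();
if (File.Exists("snapshot.txt"))
{
    foreach (string line in File.ReadAllLines("snapshot.txt"))
    {
        string[] parts = line.Split('\t');
        previous[parts[0]] = DateTime.Parse(parts[1], null, DateTimeStyles.RoundtripKind);
    }
}

var current = new Dictionary<string, DateTime>();
foreach (FileInfo fi in new DirectoryInfo(@"C:\data").GetFiles())
{
    current[fi.Name] = fi.LastWriteTimeUtc;
    DateTime lastSeen;
    if (!previous.TryGetValue(fi.Name, out lastSeen) || fi.LastWriteTimeUtc > lastSeen)
        Console.WriteLine("New or modified: " + fi.Name);
}

File.WriteAllLines("snapshot.txt",
    current.Select(kv => kv.Key + "\t" + kv.Value.ToString("o")));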
Your problem looks very much like a database with no primary key.
If you assign, say, a GUID to each file in that folder and check for that GUID instead of the filename, your application will be much more reliable.
So that's the theory; in practice, we're talking metadata. Depending on your system and the files contained in that folder, you could use Alternate Data Streams.
Here is a SO question about it.
It boils down to having information about a file that is not stored within the file; it is merely linked to it.
You can then look it up from a command prompt:
notepad.exe myfile.txt:MYGUID
It requires the system to use NTFS.
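On .NET Core / .NET 5+, ordinary file APIs accept stream-qualified paths, so a tag can be written and read like this (on .NET Framework you would need P/Invoke; the path and the "id" stream name are arbitrary choices):
using System;
using System.IO;

// Write a GUID into an alternate data stream attached to the file.
// NTFS only; the main file content is untouched.
string tagPath = @"C:\data\myfile.txt:id";
File.WriteAllText(tagPath, Guid.NewGuid().ToString());

// Read it back later
string id = File.ReadAllText(tagPath);
Console.WriteLine(id);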
HTH.
A very primitive approach would be to run the "dir" command and compare its outputs...
Here is some info on its parameters:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/dir.mspx?mfr=true
Along with your snapshot of dates, you can compare the outputs of dir... it's very fast and light on resource consumption.
The FileSystemWatcher class in .NET provides two relevant events:
Changed
Renamed
Set EnableRaisingEvents to true and that's it; everything is simple with .NET!

C#: Programmatically apply merge/patch to file?

I have a program that requires a few large (~4 or 5 MB) files. Once a week, every week, there are new versions of these files with minor changes, mostly just a few lines added or removed.
When the program starts, if there's an Internet connection, I'd like the program to update these files automatically. Instead of downloading the entire new versions of the files, I'd like to download just a patch, based on the client's version of the files, that updates them.
How might I do this?
I have total control over the server.
That is a tough problem to solve if you don't have any prior knowledge of what is in the file, or if the server doesn't have a facility that lets you request differences. Any program you write that has no way to determine the differences without looking at both the old and new files will have to download the new file anyway.
C# doesn't have any built-in facility for this, but it sounds like your requirements aren't complicated. Look at how diff and ed on Unix can be used to patch a text file based on an easy-to-grok delta. Of course, you should check the resulting file against a hash and fall back to a full download if it isn't correct.
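A sketch of that verify-and-fall-back step, under the assumption that ApplyPatch is your own delta routine and that the expected hash and download URL come from your server (all names here are hypothetical):
using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;

static string Sha256Hex(string path)
{
    using (var sha = SHA256.Create())
    using (var fs = File.OpenRead(path))
        return BitConverter.ToString(sha.ComputeHash(fs)).Replace("-", "");
}

// expectedHash is published by the server next to the patch;
// ApplyPatch is whatever delta format you settle on (hypothetical here)
ApplyPatch("data.bin", "data.bin.patch");
if (!Sha256Hex("data.bin").Equals(expectedHash, StringComparison.OrdinalIgnoreCase))
{
    // The patched result is wrong: fall back to a full download
    using (var web = new WebClient())
        web.DownloadFile("https://example.com/files/data.bin", "data.bin");
}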
