I am familiar with the FileSystemWatcher class and have tested it. Alternatively, I have tested a fast loop that lists the files of a given type in a directory. In this particular case they are zip-compressed SDF files that I need to decompress, open, and query.
The problem is that when a large file is put in a directory, sometimes that takes time, such as it being downloaded, or copied from a network location, etc...
When the FileSystemWatcher raises an OnChange event, I have a handle to the ChangeType and on these types of operations the Create is immediate, while the file is still not completely copied to the location.
Likewise using the loop, I see a file is there, before the whole file is there.
The FileSystemWatcher raises several change events, one after create and then one or more during the copy, but nothing that says "this file is now complete."
So if I am expecting files of a type, to be placed in a directory ultimately to read and processed, with no knowledge of their transport mechanism, and no knowledge of their final size...
How do I know when the file is ready to actually be processed other than with using error control as a workflow control (albeit the error control is there anyway as it should be)? This just seems like a bad way to have to handle this, as sometimes the error control may actually be representing a legitimate issue, sometimes it may just be that the file is not completely written, and I do not see any real safe way to differentiate.
I despise anticipated errors, but realize that they have their place, as with sockets: nothing guarantees that the result of a check for open does not change before an attempt to read/write. But I do avoid them at all costs.
This particular one troubles me mostly because of the ambiguity of the message that will be produced. There is a conflict queue for files that legitimately error because they did not come across entirely or are otherwise corrupt, I do not want otherwise good files going there. Getting more granular to detect this specific case will be almost impossible.
edit:
I know I can do this... And I have read the other SO articles concerning others doing the same thing. (And I know this method is both crude and blocking; it is just an example.)
private static void OnChanged(object source, FileSystemEventArgs e)
{
    if (e.ChangeType == WatcherChangeTypes.Created)
    {
        bool ready = false;
        while (!ready)
        {
            try
            {
                using (FileStream fs = new FileStream(e.FullPath, FileMode.Open))
                {
                    Console.WriteLine(String.Format("{0} - {1}", e.FullPath, fs.Length));
                }
                ready = true;
            }
            catch (IOException)
            {
                ready = false;
            }
        }
    }
}
What I am trying to find out is: is this definitively the only way? Is there no other component, or some hook into the file system, that will actually do this with a proper event?
The only way to tell is to open the file with FileShare.Read. That will always fail if the process is still writing to the file and hasn't closed it yet. There is otherwise no mechanism to know anything at all about which particular process is doing the writing: FSW operates at the file system device driver level and doesn't know anything about what process is performing the operation. There could be more than one.
That will very often fail the first time you try, FSW is very efficient. In general you have no idea how much time the process will take, it of course depends on how it is written and might leave the file opened for a while. Could be hours or days, a log file would be an example.
So you need a re-try mechanism, it should have an exponential back-off algorithm to increase the re-try delays between attempts. Start it off at, say, a half second delay and keep increasing that delay when it fails. This needs to be done in a worker thread, not the FSW callback. Use a thread-safe queue to pass the path of the file from the FSW callback to the worker thread. Also in general a good strategy to deal with the multiple FSW notifications you get.
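The retry mechanism described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ReadyFileWorker, its method names, the limit of 10 attempts and the one-minute cap on the delay are all hypothetical and not from the answer. Note that a missing file also surfaces as an IOException here, so it is treated the same as "still being written".

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

public static class ReadyFileWorker
{
    // Thread-safe queue: the FSW callback only adds paths; the worker does the waiting.
    public static readonly BlockingCollection<string> Pending = new BlockingCollection<string>();

    // Returns true if the file can be opened while denying write sharing,
    // i.e. no other handle currently has it open for writing.
    public static bool TryOpenExclusive(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                return true;
            }
        }
        catch (IOException)
        {
            return false;
        }
    }

    // Retries with a doubling delay (capped at maxDelay); gives up after maxAttempts.
    public static bool WaitUntilReady(string path, int maxAttempts, TimeSpan initialDelay, TimeSpan maxDelay)
    {
        TimeSpan delay = initialDelay;
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            if (TryOpenExclusive(path)) return true;
            Thread.Sleep(delay);
            delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
        }
        return false;
    }

    // Worker loop: run this on its own thread, never inside the FSW callback.
    public static void Run(Action<string> process)
    {
        foreach (string path in Pending.GetConsumingEnumerable())
        {
            // Assumed policy: start at half a second, up to 10 attempts.
            if (WaitUntilReady(path, 10, TimeSpan.FromMilliseconds(500), TimeSpan.FromMinutes(1)))
            {
                process(path);
            }
        }
    }
}
```

The FSW callback would then only call ReadyFileWorker.Pending.Add(e.FullPath). De-duplicating paths before queueing them is one way to absorb the multiple notifications per file.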
Watch out for startup effects, you of course missed any notification before you started running so there might be a load of files that are waiting for work. And watch out for Heisenbugs, whatever you do with the file might cause another process to fall over. Much like this process did to yours :)
Consider that a batch-style program that you periodically run with the task scheduler could be an easier alternative.
For the one extreme, you could use a file system mini filter driver which analyzes all activities for a file at the lowest level (and communicates with a user mode application).
I wrote a proof-of-concept mini filter some time ago to detect MS Office file conversions. See below. This way, you can reliably check for every open handle to the file.
But even this would be no universal solution for your problem.
Consider:
A tool (e.g. FTP file transfer) could in theory write part of the file, close it, and re-open it again for appending new data. This may seem unusual, but it means you cannot reliably treat "no more open file handles" as "file is ready now".
Alex K. provided a good link in his comment, and I myself would use a solution similar to the answer from Jon (https://stackoverflow.com/a/4278034/4547223)
If time is not critical (you can waste a few seconds for the decision):
Periodic timer (1 second seems reasonable)
Check file size in every timer tick
If the file size did not increase for e.g. 10 seconds and there are no more FSWatcher change events either, try to open it. If you notice that the size increases unevenly or very slowly, you could adjust the "wait time" on the fly.
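The timer-based steps above could be sketched as a small state tracker. This is an illustrative sketch only: the class name SizeStabilityTracker and its members are hypothetical, and the caller is assumed to supply the current file size (for example from new FileInfo(path).Length) on each timer tick.

```csharp
using System;

public sealed class SizeStabilityTracker
{
    private readonly TimeSpan quietPeriod;
    private long lastSize = -1;
    private DateTime lastActivityUtc;

    public SizeStabilityTracker(TimeSpan quietPeriod, DateTime nowUtc)
    {
        this.quietPeriod = quietPeriod;
        this.lastActivityUtc = nowUtc;
    }

    // Call this when the FileSystemWatcher reports a Changed event for the file.
    public void NoteActivity(DateTime nowUtc)
    {
        lastActivityUtc = nowUtc;
    }

    // Call this once per timer tick with the current file size.
    // Returns true once the size has been unchanged, with no noted activity,
    // for at least the quiet period; only then is it worth trying to open the file.
    public bool Tick(long currentSize, DateTime nowUtc)
    {
        if (currentSize != lastSize)
        {
            lastSize = currentSize;
            lastActivityUtc = nowUtc;
            return false;
        }
        return nowUtc - lastActivityUtc >= quietPeriod;
    }
}
```

Passing the clock in as a parameter keeps the logic independent of any particular timer class; a true result is still only a hint, so the open attempt and its error handling remain necessary.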
Your big advantage is that you are processing ZIP files only, where you have a chance of detecting invalid (incomplete) files due to “checksum not valid”
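As a rough illustration of using the ZIP format itself as a completeness hint: a ZIP archive keeps its central directory at the end of the file, so a partially copied archive usually cannot be opened. The sketch below assumes the System.IO.Compression classes are available (on .NET Framework this needs a reference to System.IO.Compression.FileSystem); the class and method names are hypothetical. It only checks that the central directory is readable and does not verify per-entry checksums.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipCheck
{
    // Returns true if the archive's central directory can be read.
    // A truncated or still-growing ZIP file normally fails this check.
    public static bool HasReadableDirectory(string path)
    {
        try
        {
            using (ZipArchive archive = ZipFile.OpenRead(path))
            {
                // Touching Entries forces the central directory to be parsed.
                int count = archive.Entries.Count;
                return count >= 0;
            }
        }
        catch (InvalidDataException)
        {
            return false; // not (yet) a valid ZIP archive
        }
        catch (IOException)
        {
            return false; // locked, missing, or unreadable
        }
    }
}
```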
I do not expect official ways to detect this, since there is no universal notion of “file written completely”.
File System mini filter
This may be like a sledgehammer solution for the problem.
Some time ago, I had the requirement of working around a weird bug in Office 2010, where it does not copy ADS metadata during Office file conversion (ADS is needed for File Classification). We discussed this with Microsoft engineers (MS was not willing to fix the bug), and they agreed with our filter driver solution. (In the end, this was stopped since the business preferred a manual workaround.)
Nevertheless, if someone really wants to check whether this could be a possible solution:
I have written an explanation of the steps:
https://stackoverflow.com/a/29252665/4547223
I am developing a "dynamic shortcutting" application which creates special shortcut files which point to a registry entry rather than an actual file/executable. The registry entry contains the path of the desired file. I want to have a daemon running which watches the linked-to files and updates their registry entries if they are moved or renamed. Renamed I can handle using System.IO.FileSystemWatcher, but what is the best way to handle moved files?
I know this is beyond the basic functions of FSW (despite being a low-level file-system operation). The question is, what is the best way of doing it?
Most posts/articles I have read suggest ways that feel altogether "hacky", which basically involve looking for a delete followed by a create in a new place of a file, and connecting the two by file size, meta-data, time between the delete/create triggers, hashes, etc. This may well be the method I have to resort to, setting up FSWs on all drives. However, I am hoping there might be a better way.
Is it possible to either:
2.1. Listen in to the shell and "hear" move operations?
2.2 Or (even more radical) replace or add something to the shell move operation that either triggers some sort of event or performs the registry-updating task itself, precluding the need for the daemon?
I have a feeling that everyone is going to tell me that 1. is the only course, but I look forward to your suggestions. (answers in VB.NET preferred, but can translate from C# if necessary).
[I'm not sure if this should be appended as an "update" to my original post or posted as a separate answer]
To sum up (all two of) the answers plus my own experimenting (to try to give a definitive answer to this question):
It seems the only high-level (.NET) solution is to use the FileSystemWatcher, which does not detect "move" out of the box (despite it being a low-level command). The FSW approach is non-trivial, comparatively resource-expensive, sloppy in places (i.e. using timers) and has its limitations and caveats. Nor does it provide a true reflection of "move": it merely infers it from symptoms that are very likely to be a move (and have the same effect on the file system in any case) but could theoretically be produced by non-move actions. Also, it appears you have to know which files you want to watch for moves in advance of the move happening; there's no way of telling as it occurs.
On a lower-level (which would involve C++), one could hook API calls to get a faithful picture of when "moves" are called. This has the advantage that you don't have to decide to watch files in advance, and is also less resource-expensive than listening to "deletes" and "creates" and trying to compare them.
On a systems-programming level (which would involve C++ and could easily break your computer if you didn't know what you were doing) one could build a filesystem filter driver: this would take the concept of detecting moves to a truly anal level, detecting re-allocation of filesystem resources performed even without the kernel.
After some experimenting, here is the general structure of how the FileSystemWatcher approach (or at least the most obvious one to me) works, its quirks and its limitations. [no code atm, it's all pretty integrated into my application and I'm yet to optimise it, but I might add some snippets in here later].
The FileSystemWatcher method (to detect when files are moved or renamed):
.1. FileSystemWatchers.
You will need to create one FSW for each highest-level directory you want to monitor (for example, one for each writable logical drive).
.2. Renamed.
Straightforward renaming of the file is trivially handled.
.3. Moved.
This part is very far from trivial; it basically involves comparing files in three different scenarios.
3.0.1. Deciding if a deleted/moved-from file is the same as a created/moved-to file.
For determining whether a deleted and a created file are a match, filename is useless (can be changed during a move). You could use a mixture of file size and attributes like time created, or even a hash of the entire file. In my particular solution I only needed to watch the movement of specific files "registered" before load-time, so I was able to give these files a unique fingerprint as metadata that I could then use to compare files (this works fine in real-world scenarios, but is easy to break maliciously in testing, which disappoints me as a perfectionist.)
3.0.1.1. When to read filesize/attributes/take hash?
Before I came up with the static fingerprint idea, I was testing my code with a simple filesize + creation date validation check. I quickly realised though that I had to have a note of the filesize and creation date (or hash or whatever else you want to use) of the deleted file BEFORE it signals as "deleted", because you can't check the size of a file that doesn't exist. If (like me) you know the files you want to watch in advance, then you need to read in those values before you enable the FileSystemWatchers; you also need to listen for "change" events on those files to update the values of filesize and creation date, take a new hash etc. This then begs the question: what do you do if you DON'T know what files you are interested in watching to see if they move? What if you only know you are possibly interested in knowing if they've moved when they "delete"? That, unfortunately, is beyond me (it wasn't something I had to deal with.) Unless you can come up with a solution to this problem, there is zero point in continuing with the FileSystemWatcher approach. Furthermore, I would conjecture (though could very easily be wrong) that there is no high-level solution that will meet your needs. If you do however come up with a solution (please post it below/comment on this post/edit it in here on this post), I have made the rest of this compatible.
3.1. Scenario 1: Direct moving of the file itself.
Upon the "delete" of a specific file being detected, you need to start listening for a "create" of a congruous file. Rather than listening indefinitely for the matching "create" of a file that might just have been deleted (which in reality involves inspecting every file created in the directory), you can use a timer to start and stop a "listening" flag (practical, but from a purist point of view a little arbitrary), deciding that after e.g. 1000ms with no appropriately matching create it's likely there won't be one.
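The delete-then-create matching with a time window might look something like the sketch below. It is an illustration, not the code from the application described here: the class name MoveMatcher is hypothetical, the "fingerprint" is assumed to be whatever string the caller uses to compare files (the metadata fingerprint, or a size-plus-date key), and the clock is passed in explicitly instead of using a timer.

```csharp
using System;
using System.Collections.Generic;

public sealed class MoveMatcher
{
    private readonly TimeSpan window;

    // fingerprint -> (old path, time the delete was seen)
    private readonly Dictionary<string, KeyValuePair<string, DateTime>> pendingDeletes =
        new Dictionary<string, KeyValuePair<string, DateTime>>();

    public MoveMatcher(TimeSpan window)
    {
        this.window = window;
    }

    // Call from the Deleted handler, using a fingerprint recorded BEFORE the delete.
    public void OnDeleted(string fingerprint, string oldPath, DateTime nowUtc)
    {
        pendingDeletes[fingerprint] = new KeyValuePair<string, DateTime>(oldPath, nowUtc);
    }

    // Call from the Created handler. Returns the old path if this create matches
    // a delete seen within the window (a likely move); otherwise returns null.
    public string OnCreated(string fingerprint, DateTime nowUtc)
    {
        KeyValuePair<string, DateTime> pending;
        if (!pendingDeletes.TryGetValue(fingerprint, out pending))
        {
            return null;
        }
        pendingDeletes.Remove(fingerprint);
        return nowUtc - pending.Value <= window ? pending.Key : null;
    }
}
```

With a 1000 ms window, a create arriving 400 ms after the matching delete is reported as a move, while one arriving 5 seconds later is treated as an unrelated new file.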
3.2.0. A common misconception.
A lot of people seem to be under the impression, after glancing at the docs, that moving or renaming a folder triggers a rename for all their subfiles and subfolders rather than a delete and a create. In actual fact what the docs say is:
If you cut and paste a folder with files into a folder being watched, the FileSystemWatcher object reports only the folder as new, but not its contents because they are essentially only renamed.
(i.e. only the top folder throws rename or create/delete and the subfiles/subfolders throw NOTHING). Meaning if you want to know when and where a certain file is moved, you have to listen out for each and every one of its ancestor folders as well.
3.2.1. Scenario 2: Renaming of a containing folder.
In my solution, because I knew all the files I was watching, whenever one of my FileSystemWatchers reported a rename of a folder rather than a file (the portion of the string after the last "/" will contain no ".") I checked each of my watched files to see if their paths were in that directory and, if so, changed the beginning of the filepath to the path of the new directory. Et voilà, I knew where my files had been moved to. If you do not know in advance what files you are looking for, then you will have to recursively search through everything in every folder that throws a "rename".
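The "change the beginning of the filepath" step can be sketched as a small helper. This is an assumed implementation for illustration (the name PathRebase is hypothetical); it compares case-insensitively, as Windows paths usually require, and insists on a separator after the old directory so that C:\ab is not mistaken for a child of C:\a.

```csharp
using System;

public static class PathRebase
{
    // If filePath lies under oldDir, returns the same path under newDir;
    // otherwise returns filePath unchanged.
    public static string Rebase(string filePath, string oldDir, string newDir)
    {
        string prefix = oldDir.TrimEnd('\\', '/');
        bool isUnderOldDir =
            filePath.Length > prefix.Length &&
            filePath.StartsWith(prefix, StringComparison.OrdinalIgnoreCase) &&
            (filePath[prefix.Length] == '\\' || filePath[prefix.Length] == '/');

        if (!isUnderOldDir)
        {
            return filePath;
        }
        // Keep everything after the old directory, including the separator.
        return newDir.TrimEnd('\\', '/') + filePath.Substring(prefix.Length);
    }
}
```

In a Renamed handler this would be called for every watched file with e.OldFullPath and e.FullPath as the old and new directory.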
3.2.2. Scenario 3: Moving of a containing folder.
This one feels like a slap in the face: in order to build your move-detection routine, you have to be able to detect moves. Here folders will throw a "delete" followed by a "create". In my case the solution just recycles the techniques in 3.1 and 3.2.1: when a folder "delete" is detected, I check to see if it contains any of my watched files. If it does, I set a "listen" flag (and a timer to snuff it) and check the subdirectory path of my file in the old folder against every new folder "create" that is detected to see if it points to a file with the desired fingerprint. If it does, I now have the old and new paths of the file and have detected the move. If you don't know what files to watch for, you may have to validate folder moves by comparing size on disk and number of subfiles/subfolders between "deleted" folder and "created" folders to confirm a folder has moved first, then search the folder recursively for the files you're interested in.
3.3. FURTHER COMPLICATION: Cross-drive moving of large files.
This is a problem I fortunately didn't run into (because I was only comparing fingerprint metadata, and didn't need access to files); however moving large files between drives (which transfer in stages, triggering a create event then a series of change events) can cause real headaches.
3.3.1. Headache 1: The "create" fires when the destination file is incomplete.
This means comparing its size to a "deleted" file will produce a false negative. You can't even take a hash of the first part of the file to indicate to your program that this "might" be the deleted file, because the move operation will have the file access permissions locked down. You just have to try and tell if the created file might still be moving and wait for it to finish.
3.3.2. Headache 2: No sure way to "tell" that the created file is still being moved.
Some have suggested checking the file access permissions on the created file, but they might be indistinguishable from those on a file created and still in use by any random application. Others have suggested setting short time-limited listen flags for "changes" on the file, but again this is indistinguishable from a file being modified by an application. In fact if the file happened to be a log file constantly and rapidly being updated by some process, then waiting for "changes" to the file to timeout might never end.
3.3.3. Headache 3: (UNTESTED) possibly these sorts of moves "delete" the file after "creating" the destination file.
It makes sense that this would be the case, though I haven't tested it. [if anyone does know, feel free to edit (or delete) this section appropriately]
3.4. A philosophical quandary: are two identical files the same?
This is a very pedantic and arbitrary thought-experiment, but say you have two drives, each with an identical copy of File.txt. You run a batch file that deletes the copy on the first drive then immediately makes a copy of the file on the second drive into the same folder on the second drive and names it Copy of File.txt. Unless you are using fingerprints, your code will identify a delete and then a create of an identical file and be unable to distinguish what happened from a move (with renaming) of the file from the first drive to the second. The final state of the filesystem is identical in both cases so it shouldn't cause your application to behave unexpectedly, but art thou really content to call that a "move" based purely on isomorphism? (especially when you know the kernel sees it differently)?
Using the high-level, unrestricted API provided by C#: no, you can't. Use FileSystemWatcher. On the same drive, moving a file is not "delete and create"; it's "rename".
If you can and want to go lower-level, you can hook MoveItem and MoveItems of the shell's IFileOperation interface, and MoveFile from Kernel32.dll. It will work with most apps, but it requires elevated security rights for your application, which is mostly unacceptable in a corporate environment.
The task has two flaws that make it hard to implement: (a) a move operation across disks is actually a sequence of read/write operations followed by a deletion rather than a move, and during those read/write operations there can be some transformation of the data in place; and (b) a move can be performed by something other than the shell.
What you can do is employ a filesystem filter driver to intercept file operations right when they take place. Then you need to detect the sequence of read and write operations performed by the same process over your file. I.e. if your code detects that the file is read sequentially (NOTE: some copying tools can read the file in multiple threads in parallel) and similar blocks of data are then written to the other file, AND after reading everything the source file is deleted, AND the complete file contents have been written to the other place, then you can guess that you have come across a file move operation.
Bump & update: This may well be against the rules of StackOverflow, but I would like to point out to the many people landing on this page (and the myriad similar questions on SO) that I have started a feature request on Microsoft UserVoice to add MOVE detection to FileSystemWatcher. The best solution in the long term, rather than trying to work around the problem, might be to petition Microsoft to fix it. If you have come here because you too need a solution to this problem, please consider clicking here and voting for this feature.
I wrote a custom control for output file name selection with the typical: text box for the filename, a "browse" button, and some other functionality specific to my application.
The text box changes color depending on the filename. If the file location cannot be written to, it turns red. If the file already exists, it turns yellow. Otherwise, it remains the system-assigned color.
To see if a file exists, I use IO.File.Exists; simple enough.
I implemented the "if the file can be written to" as a simple try-catch block where a file is actually opened, something written in it, closed, then deleted. If at any point an exception is thrown, I know the user can't use that filename and I turn the text box red.
This is a catch-all; since I'm doing the actual operation I intend to do, it is fool-proof. However, it seems irresponsible to have software creating and deleting files like crazy just to see if it can.
So my question is, how do I replicate this functionality without creating files? I can see I have to:
Check the path for legality (e.g., 'z:' is not a valid filename). This entails parsing the path and making sure all directories exist.
If the location exists, I have to check for write permissions. (Several answered questions exist to this end.)
Is there anything else?
EDIT
Within minutes I see people are already voting up an answer that criticizes that I'm checking at all that the file is accessible before actual writing to it occurs. While I appreciate experts "standing back" from my question to see whether or not there is a completely different way to achieve it, telling me I shouldn't be doing it is not an answer to my question.
So let me elaborate on my application (I am not expecting hundreds of users at the same time).
I use this file chooser control in data acquisition applications. In many situations the test that you are about to run is "expensive" in one way or another. Therefore it is critical to set things up very carefully. Overwriting data can be very expensive (and for the fearful user I have a checkbox that will append the date and time down to the millisecond to the filename).
So the purpose of my indicator colors is not to provide a surefire way for the software to know the file can be written to (that check is still done at the instant it actually has to), it's to serve as an indicator to the user that at least he has set up the file name correctly so if he goes forward he is guaranteed not to overwrite old data and he's almost sure a last-minute IO error (filename typo) won't let the experiment run unrecorded.
I suggest this - don't check anything before user commits the action. With your current approach, even if you verified the file is okay, it may be locked 5 seconds later when the user actually commits to write to a file. Doing preliminary checks may only give user a false impression of estimated success. Especially consider this point on a terminal server with 100+ simultaneous users.
There is nothing wrong with showing a prompt with Retry/Cancel/etc. if no access, and let user decide.
EDIT:
No offense, but there are standards on how such collisions are handled. Windows standard is to show a prompt to the user. Also consider this - if you suddenly have a deny in write access to the folder, which you are not expected to have, you probably need to hire another system/network administrator.
If the operation is costly, make sure this guy is paid well. C'mon, what if your network goes down during writing? Hard drive? Router? There are many reasons why writing to a file can be interrupted, and you should be prepared for that. If you cannot afford it, make sure you have invested in good infrastructure and good people to support it.
Down on earth, you can increase chances of acquiring a successful lock on the file:
Pick a unique file name, using datetime-based hash as a suffix/prefix.
Write to user's home directory, also known as %UserProfile%, it is likely that you will succeed.
I can understand your problem with not wanting to risk losing "expensive" data because the file couldn't be written, and a responsible program will do its best to avoid the situation.
I would do this by caching the results. Before the test is run, write a mock result to a file somewhere in the user data space, then leave the file open and write the real result to the file. After this is done, write it to the user-specified file. Provide a recovery option that will read the cache file and write it out to the user's file.
Your approach could fail because the fact that the file was writable at the start doesn't mean it's still writable. The network could have gone down. Someone could have removed the flash drive. Someone else could be doing a large data transfer through a buggy router. (Real world case: it took me a long time to prove it was a network problem and not my program. They finally accepted it was their fault when I showed that dir :*.* /s on multiple machines at once would almost certainly cause one or more to fail.)
I am implementing an event handler that must open and process the content of a file created by a third-party application over which I have no control. I am warned by a note in "C# 4.0 in a nutshell" (page 495) about the risk of opening a file before it is fully populated, so I am wondering how to manage this occurrence. To keep the load on the event handler to a minimum, I am considering having the handler simply insert the file names in a queue and then having a different thread manage the processing. But how can I make sure that the write is completed and the file is safe to read? The file size could be arbitrary.
Any ideas? Thanks
A reliable way to achieve what you want might be to use FileSystemWatcher + NTFS USN journal.
Maybe more complicated than you expected, but FileSystemWatcher alone won't tell you for sure that the newly created file has been closed
-first, the FileSystemWatcher, to know when a file is created. From there you have the complete file path, and are 1 or 2 pinvokes away from getting the file unique ID (which can help you to track it during its whole lifetime).
-then, read the USN journal, which tracks everything that occurs on your drive. Filter on entries corresponding to your new file's ID, and read the journal until reaching the entry with the 'Close' event.
From there, unless your file is manipulated in special ways (opened and closed multiple times by the application that generates it), you can assume it is safe to read it and do whatever you wanted to do with it.
A really great C# implementation of an USN journal parser is StCroixSkipper's work, available here:
http://mftscanner.codeplex.com/
If you are interested I can give you more help about USN journal, as I use it in my project.
Our workaround is to watch for a specific extension. When a file is uploaded, the extension is ".tmp". When its done uploading, it's renamed to have the proper extension.
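The extension-based workaround could be wired up roughly like this. It is a sketch under assumptions: the uploader is assumed to rename from ".tmp" to the final extension in one step, and the names UploadWatcher, IsFinishedUpload and Watch are hypothetical.

```csharp
using System;
using System.IO;

public static class UploadWatcher
{
    // True when a rename goes from "*.tmp" to a name with the expected final extension.
    public static bool IsFinishedUpload(string oldPath, string newPath, string finalExtension)
    {
        return string.Equals(Path.GetExtension(oldPath), ".tmp", StringComparison.OrdinalIgnoreCase)
            && string.Equals(Path.GetExtension(newPath), finalExtension, StringComparison.OrdinalIgnoreCase);
    }

    // Starts watching a directory and calls onReady with the final path of each finished upload.
    // The caller should keep the returned watcher alive and dispose it when done.
    public static FileSystemWatcher Watch(string directory, string finalExtension, Action<string> onReady)
    {
        var watcher = new FileSystemWatcher(directory);
        watcher.NotifyFilter = NotifyFilters.FileName; // renames are file-name changes
        watcher.Renamed += (sender, e) =>
        {
            if (IsFinishedUpload(e.OldFullPath, e.FullPath, finalExtension))
            {
                onReady(e.FullPath);
            }
        };
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

This only works when you control, or can rely on, the uploading side's naming convention.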
Another alternative is to have the server try to move the file in a try/catch block. If the file isn't done being uploaded, the attempt to move the file will throw an exception, so we wait and try again.
Realistically, you can't know. If the other application's "write" operation is to open the file denying write access to everyone else and then close the file when it's done, then when you get a notification you could simply open the file requesting write access; if that fails, you know the operation isn't complete. But if the "write" operation is to open the file, write, close the file, open the file again, and write again, etc., then you're pretty much out of luck.
The best solution I've seen is to set a timer after the last notification. When the timer elapses, try to open the file for write--if you can, assume the "operation" is done and do what you need to do. If the open fails, assume the operation is still in progress and wait some more.
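A timer that restarts on every notification might be sketched like this. The class name QuietTimer is hypothetical and the quiet interval is whatever the caller chooses. It uses System.Threading.Timer in one-shot mode, so each call to Touch pushes the callback further into the future.

```csharp
using System;
using System.Threading;

public sealed class QuietTimer : IDisposable
{
    private readonly Timer timer;
    private readonly int quietMilliseconds;

    public QuietTimer(int quietMilliseconds, Action onQuiet)
    {
        this.quietMilliseconds = quietMilliseconds;
        // Created disarmed; it only starts counting once Touch is called.
        this.timer = new Timer(_ => onQuiet(), null, Timeout.Infinite, Timeout.Infinite);
    }

    // Call from every FileSystemWatcher notification for the file.
    // Restarts the countdown; onQuiet runs once after the last Touch.
    public void Touch()
    {
        timer.Change(quietMilliseconds, Timeout.Infinite);
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```

The onQuiet callback runs on a thread-pool thread, which is where the attempt to open the file for write would go. If the open fails, calling Touch again schedules another attempt.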
Of course, nothing is foolproof. Despite the above, another operation could start while you're doing what you want with the file and cause interaction problems.
I am working on an app that will keep a running index of work accomplished.
I could write once at the end of a work session, but I don't want to risk losing data if something blows up. Therefore, I rewrite to disk (XML) every time a new entry or a correction is made by the user.
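private void WriteIndexFile()
{
    XmlDocument IndexDoc = new XmlDocument();
    // Build document here
    // Dispose the writer so the file handle is released before the next write.
    using (XmlTextWriter tw = new XmlTextWriter(_filePath, Encoding.UTF8))
    {
        tw.Formatting = Formatting.Indented;
        IndexDoc.Save(tw);
    }
}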
It is possible for the writes to be triggered in rapid succession. If this happens, it tries to open the file for writing before the prior write is complete. (While it would not be normal, I suppose it is possible that the file gets opened for use by another program.)
How can I check if the file can be re-written?
Edit for clarification: This is part of an automated lab data collection system. The users will click a button to capture data (saved in separate files), and identify the sub-task that the data package is for. Typically, it will be 3-10 minutes between clicks.
If they make an error, they need to be able to go back and correct it, so it's not an append-only usage.
Finally, the files will be read by other automated tools and manually by humans. (XML/XSLT)
The size will be limited as each work session (worker shift or less) will have a new index file generated.
Further question: As the overwhelming consensus is to not use XML and write in an append-only mode, how would I solve the requirement of going back and correcting earlier entries?
I am considering having a "dirty" flag, and save a few minutes after the flag is set and upon closing the work session. If multiple edits happen in that time, only one write will occur - no more rapid user - also have a retry/cancel dialog if the save fails. Thoughts?
XML is a poor choice in your case because new content has to be inserted before the closing tag. Use text instead and simply open the file for append and write the new content at the end of the file; see How to: Open and Append to a Log File.
You can also look into a simple logging framework like log4net and use that instead of handling the low-level file stuff yourself.
If all you want is a simple log of all operations, XML may be the wrong choice here as it is difficult to append to an XML document without rewriting the whole file, which will become slower and slower as the file grows.
I'd suggest instead File.AppendText or, even better, keeping the file open for the duration of the application's lifetime and using WriteLine.
(Oh, and as others have pointed out, you need to lock to ensure that only one thread writes to the file at a time. This is still true even with this solution.)
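A minimal sketch of the locking point: every append goes through one object that holds a lock, so two threads never have the file open at the same time. The class name SerializedLog is hypothetical, and File.AppendAllText is used for brevity in place of a long-lived writer.

```csharp
using System;
using System.IO;

public sealed class SerializedLog
{
    private readonly object gate = new object();
    private readonly string path;

    public SerializedLog(string path)
    {
        this.path = path;
    }

    // Appends one line; the lock ensures only one thread writes at a time.
    public void Append(string line)
    {
        lock (gate)
        {
            File.AppendAllText(path, line + Environment.NewLine);
        }
    }
}
```

This only protects against threads inside the same process that share the same SerializedLog instance. Another program opening the file can still make a write fail, so the retry/cancel handling is still needed.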
There are also logging frameworks that already solve this problem, such as log4net. Have you considered using an existing logging framework instead of rolling your own?
I have a logger that uses System.Collections.Queue. Basically it waits until something is queued, then tries to write it. While writing items, which could be slow, more items could be added to the queue.
This will also help in just grouping messages rather than trying to keep up. It is running on a separate thread.
private AutoResetEvent ResetEvent { get; set; }

public void LogMessage(string fullMessage)
{
    this.logQueue.Enqueue(fullMessage);
    // Signal the reset event to wake the processing thread.
    this.ResetEvent.Set();
}

private void ProcessQueueMessages()
{
    while (this.Running)
    {
        // Process all the items currently in the queue.
        while (this.logQueue.Count > 0)
        {
            // This method just logs the top item on the queue.
            this.LogQueueItem();
        }
        // Once the queue is empty, wait for another message to be queued
        // before running again. Waiting on the event avoids polling with
        // System.Threading.Thread.Sleep(1000).
        this.ResetEvent.WaitOne();
    }
}
I handle write failures by not dequeueing an item until it has been written to the file with no errors. Then I just keep attempting until it finally can write. This has saved me because somebody removed permissions from one of our apps while it was running. Permission was given back without shutting down our app, and we didn't lose a single log statement.
Consider using a flat text file. I have a process that I wrote that uses an XML log... it was a poor choice. You can't just write out the state as you run without having to constantly rewrite the file to make sure the tags are correct. If it was flat entries written to a file you could have an automatic timeline that could give you details of what happened without trying to figure out if it was the XML writer/tag set that blew up and you don't have to worry about your logs bloating out as much.
I agree with others suggesting you avoid XML. Also, I would suggest you have one component (a "monitor") that is responsible for all access to the file. That component will have the job of handling multiple simultaneous requests and making the disk writes happen one after another.
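Something along these lines is what I mean by a "monitor". This is only a rough sketch: the class name LogFileMonitor is invented for the example, it only serializes writers inside a single process, and a real one would probably keep the file open instead of reopening it for every line.

```csharp
using System;
using System.IO;

// Hypothetical single point of access to the log file.
public sealed class LogFileMonitor
{
    private readonly object gate = new object();
    private readonly string path;

    public LogFileMonitor(string path)
    {
        this.path = path;
    }

    // Callers on any thread go through here; the lock makes the disk
    // writes happen one after another.
    public void Write(string line)
    {
        lock (this.gate)
        {
            File.AppendAllText(this.path, line + Environment.NewLine);
        }
    }
}
```

Every part of the application would share one LogFileMonitor instance per file rather than opening the file itself.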
When I call FileInfo(path).LastAccessTime or FileInfo(path).LastWriteTime on a file that is in the process of being written it returns the time that the file was created, not the last time it was written to (ie. now).
Is there a way to get this information?
Edit: To all the responses so far. I hadn't tried Refresh() but that does not do it either. I am returned the time that the file was started to be written to. The same goes for the static method, and creating a new instance of FileInfo.
Codymanix might have the answer, but I'm not running Windows Server (using Windows 7), and I don't know where the setting is to test.
Edit 2: Nobody finds it interesting that this function doesn't seem to work?
The FileInfo values are only loaded once and then cached. To get the current value, call Refresh() before getting a property:
f.Refresh();
t = f.LastAccessTime;
Another way to get the current value is by using the static methods on the File class:
t = File.GetLastAccessTime(path);
Starting in Windows Vista, last access time is not updated by default. This is to improve file system performance. You can find details here:
http://blogs.technet.com/b/filecab/archive/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance.aspx
To reenable last access time on the computer, you can run the following command:
fsutil behavior set disablelastaccess 0
As James has pointed out LastAccessTime is not updated.
The LastWriteTime has also undergone a twist since Vista. When the process has the file still open and another process checks the LastWriteTime it will not see the new write time for a long time -- until the process has closed the file.
As a workaround you can open and close the file from your external process. After you have done that you can try to read the LastWriteTime again which is then the up to date value.
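A small sketch of that workaround, under a couple of assumptions: WriteTimeProbe and GetFreshLastWriteTimeUtc are names made up for this example, and the writing process must have opened the file with sharing that allows readers, otherwise the FileStream constructor throws an IOException.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class WriteTimeProbe
{
    public static DateTime GetFreshLastWriteTimeUtc(string path)
    {
        // Open and immediately close a read-only handle. The permissive
        // FileShare flags keep this from getting in the writer's way.
        using (new FileStream(path, FileMode.Open, FileAccess.Read,
                              FileShare.ReadWrite | FileShare.Delete))
        {
        }

        // Read the timestamp through the static, non-cached API.
        return File.GetLastWriteTimeUtc(path);
    }
}
```

Whether the timestamp actually moves forward after the open and close depends on the behaviour described above, so treat this as something to try rather than a guarantee.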
File System Tunneling:
If an application implements something like a rolling logger, which closes the file and then renames it to a different file name, you will also run into issues, because the OS remembers the creation time and file size of the "old" file even though you created a new file. This includes wrong reports of the file size even if you recreated log.txt from scratch and it is still 0 bytes in size. This feature is called file system tunneling, and it is still present on Windows 8.1. For an example of how to work around this issue, check out RollingFlatFileTraceListener from Enterprise Library.
You can see the effects of file system tunneling on your own machine from the cmd shell.
echo test > file1.txt
ren file1.txt file2.txt
Wait one minute
echo test > file1.txt
dir /tc file*.txt
...
05.07.2015 19:26 7 file1.txt
05.07.2015 19:26 7 file2.txt
The file system is a state machine. Keeping states correctly synchronized is hard if you care about performance and correctness.
This strange tunneling behaviour is evidently still relied on by applications which, for example, autosave a file, move it to a save location, and then recreate the file at the same location. For these applications it makes no sense to give the file a new creation date, because it was only copied around. Some installers use similar tricks: they move files temporarily to a different location and write the contents back later to get past a file-exists check in some install hooks.
Have you tried calling Refresh() just before accessing the property (to avoid getting a cached value)? If that doesn't work, have you looked at what Explorer shows at the same time? If Explorer is showing the wrong information, then it's probably something you can't really address - it might be that the information is only updated when the file handle is closed, for example.
There is a setting in windows which is sometimes set especially on server systems so that modified and accessed times for files are not set for better performance.
From MSDN:
When first called, FileSystemInfo calls Refresh and returns the cached information on APIs to get attributes and so on. On subsequent calls, you must call Refresh to get the latest copy of the information.
FileSystemInfo.Refresh()
If your application is the one doing the writing, I think you are going to have to "touch" the file by setting the LastWriteTime property yourself between each buffer of data you write. Some pseudocode:
while(bytesWritten < totalBytes)
{
bytesWritten += br.Write(buffer);
myFileInfo.LastWriteTime = DateTime.Now;
}
I'm not sure how severely this will affect write performance.
Tommy Carlier's answer got me thinking....
A good way to visualise the differences is to run the two snippets below separately (I just used LINQPad) while also running Sysinternals Process Monitor.
while(true)
File.GetLastAccessTime([file path here]);
and
FileInfo bob = new FileInfo(path);
while(true){
string accessed = bob.LastAccessTime.ToString();
}
If you look at Process Monitor while running the first snippet, you will see repeated and constant access attempts to the file by the LINQPad process. The second snippet does an initial access of the file, for which you will see activity in Process Monitor, and then very little afterwards.
However, if you go and modify the file (I just opened the text file I was monitoring using FileInfo, added a character, and saved), you will see a series of access attempts to the file by the LINQPad process in Process Monitor.
This illustrates the non-cached and cached behaviour of the two different approaches respectively.
Will the non-cached approach wear a hole in the hard drive?!
EDIT
I went away feeling all clever over my testing and then used the caching behaviour of FileInfo in my Windows service (basically to sit in a loop and say 'Has-file-changed-has-file-changed...' before doing processing).
While this approach worked on my dev box, it did not work in the production environment, i.e. the process just kept running regardless of whether the file had changed or not. I ended up changing my approach to checking and just used GetLastAccessTime as part of it. I don't know why it would behave differently on the production server, but I am not too concerned at this point.