I often use library functions like File.Exists to check for a file's existence before opening it or doing some other action. While I have had good luck with this in practice over the years, I wonder if it is a poorly thought-out pattern.
Any I/O call, such as a file system read, can fail for multiple reasons. The path string could be wrong, the file might not actually exist, you could lack permissions, or someone else might hold a lock that blocks you. Another process or another user on the network could even move the file in the millisecond between your File.Exists and your Open.
Even if you get a successful result from File.Exists, you still really should enclose your actual open statements in a try block to handle one of the other possible failure modes. If I am thinking about this correctly, File.Exists just lulls you into a false sense of safety if you use it instead of Try (as I am sure that I have on occasion in the past).
All of this makes it sound like I should abandon File.Exists and change whatever existing code I find to use the Try...Catch pattern only. Is this a sensible conclusion? I realize that the framework authors put it there for us to use, but that does not automatically make it a good tool in practice.
I think that the answer completely depends on your specific reasons for using File.Exists.
For example, if you are checking a certain file path for arrival of a file, File.Exists could easily be the appropriate approach because you don't care what the reason for non-existence is.
However, if you are processing a file that an end user has requested (e.g. "please import this Excel file"), you will want to know exactly why the file could not be read. In this specific instance, File.Exists wouldn't be quite the right approach because the file's existence could change between the time you check it and the time you open the file. In this case, we attempt to open the file and obtain a lock on it prior to processing it. The open method will throw the exception appropriate to the specific failure, so you can give the user more accurate information about the problem (e.g. another process has the file locked, the network file is not available, etc.).
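To make the "open it and report the specific failure" approach concrete, here is a minimal sketch. The class name, method name, and messages are hypothetical, and the list of caught exceptions is not exhaustive (an empty or malformed path, for instance, raises ArgumentException or NotSupportedException, which are deliberately left to propagate).

```csharp
using System;
using System.IO;

public static class ImportHelper
{
    // Hypothetical helper: opens the file for reading and, on failure,
    // returns null plus a message describing why the open failed.
    public static FileStream TryOpenForImport(string path, out string error)
    {
        error = null;
        try
        {
            // FileShare.None requests exclusive access, so other handles
            // cannot be opened on the file while we process it.
            return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None);
        }
        catch (FileNotFoundException)
        {
            error = "The file does not exist.";
        }
        catch (DirectoryNotFoundException)
        {
            error = "The folder does not exist or the drive is unavailable.";
        }
        catch (UnauthorizedAccessException)
        {
            error = "You do not have permission to read this file.";
        }
        catch (IOException ex)
        {
            // Sharing violations (file in use elsewhere) and network errors land here.
            error = "The file could not be opened: " + ex.Message;
        }
        return null;
    }
}
```

The more specific exception types are caught before IOException because FileNotFoundException and DirectoryNotFoundException both derive from it. The caller disposes the returned stream when it is done.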
You should absolutely implement an exception handler for any code that could reasonably throw an exception and any I/O operation falls into that category.
That doesn't mean that using File.Exists is wrong though. If there is a reasonable possibility that the file may not exist then prevention is more efficient than cure. If the file absolutely should exist though, it might be more performant overall to suffer the occasional exception rather than take the hit of checking first every time.
I use File.Exists in cases where the file may not exist under normal operating conditions (without something being broken in my system). If I know the file should exist (unless my system is broken), then I don't use File.Exists.
I wouldn't call this a "pattern" though. The best pattern is to consider what you're doing on a case by case basis.
How you handle a missing file is up to you. You can check with File.Exists first, or you can wrap your code in a try/catch block and handle FileNotFoundException, which also tells you whether the file exists. It is purely up to you, but I would prefer to check with File.Exists. It is the same as checking an object for null before accessing it, rather than wrapping the code in a try/catch and discovering in the catch that your object is null. It is always better to handle such validations yourself than to leave them to a C# try/catch block.
I get reports via Crashlytics that some users of my Unity app (roughly 0.5%) get an UnauthorizedAccessException when I call FileInfo.Length.
The interesting part of the stack trace is:
Non-fatal Exception: java.lang.Exception
UnauthorizedAccessException : Access to the path '/storage/emulated/0/Android/data/com.myCompany.myGreatGame/files/assets/myAsset.asset' is denied.
System.IO.__Error.WinIOError (System.IO.__Error)
System.IO.FileInfo.get_Length (System.IO.FileInfo)
The corresponding file (it's a different file for every report) was written (or is currently being written) by the same application (possibly many sessions earlier). The call happens in a background thread, and there might be some writing going on at the same time. But according to the .NET docs this property should be pre-cached (see https://learn.microsoft.com/en-us/dotnet/api/system.io.fileinfo.length?view=netframework-2.0)
The whole code causing it is:
private static long DirSize(DirectoryInfo d)
{
    long size = 0;
    FileInfo[] fileInfos = d.GetFiles();
    foreach (FileInfo fileInfo in fileInfos)
    {
        size += fileInfo.Length;
    }
    ...
Has anyone experienced something similar and knows what might be causing it?
This looks like a very exotic error, and because of that, I have no evidence to back up my suggestions.
Suggestion 1:
The user has installed antivirus software. Such applications sometimes behave like malware, locking files that the host program is not using in order to scan them (especially if they want to prevent malicious behavior). This would explain the rare nature of the error. I would check the file's permissions after the failed call to Length; this might give you (and possibly us) more insight.
Suggestion 2:
In some circumstances you cannot read the length while the application is actively writing to the file. This should never happen, but bugs happen even in an OS. A possible path: some application is writing to the file. The file is modified and its metadata (including Length) is being written; while that happens, you read Length from another thread, and the OS locks the metadata (including Length) against reading while it is being written (probably for security reasons).
Suggestion 3 (and most probable):
Bad SD card/memory/CPU. Some random errors can always happen because you do not control the client's hardware. I would check whether this 0.5% of errors comes from a single user, or seemingly from multiple users whose unique ID resets because of other hardware issues (check other data such as the phone model, as this might also give you clues).
You are most likely trying to access a file you don't have permissions to access. There are certain files that even Administrator cannot access.
You could do a Try/Catch block to handle the exception.
See this question.
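As an illustration of the try/catch suggestion, here is a hedged sketch of a directory-size routine that skips files whose length cannot be read instead of failing the whole calculation. The class name, method name, and the `skipped` counter are assumptions for the example; it only covers the files directly inside the directory, since the rest of the original method is elided.

```csharp
using System;
using System.IO;

public static class DirectorySizer
{
    // Sums the sizes of the files directly inside d. Files whose length
    // cannot be read are counted in 'skipped' rather than aborting the loop.
    public static long DirSizeSkippingInaccessible(DirectoryInfo d, out int skipped)
    {
        long size = 0;
        skipped = 0;
        foreach (FileInfo fileInfo in d.GetFiles())
        {
            try
            {
                size += fileInfo.Length;
            }
            catch (UnauthorizedAccessException)
            {
                skipped++; // the case reported in the question
            }
            catch (IOException)
            {
                skipped++; // includes FileNotFoundException (file deleted meanwhile)
            }
        }
        return size;
    }
}
```

Reporting the skipped count (for example, to the crash reporter as a non-fatal event) keeps the information about how often this happens without interrupting the user.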
If you read Microsoft's documentation carefully, it clearly states that:
1. An I/O error is thrown if the Refresh fails.
2. The FileInfo.Length property is pre-cached only in a very precise list of cases (GetDirectories, GetFiles, GetFileSystemInfos, EnumerateDirectories, EnumerateFiles, EnumerateFileSystemInfos). The cached info should be refreshed by calling the Refresh() method.
Combining #1 and #2, you can easily identify the problem: while you try to get that information, you have the file open with an exclusive lock, which gives you the error in #1. I would suggest approaching this with two different pieces of logic. One is the obvious try/catch block, but because that block (a) costs performance and (b) doesn't solve the logical problem of knowing the file size, you should also cache that data yourself when you acquire the exclusive lock.
Put it in a static table in memory, a simple key/value (file/size), and check against it before calling FileInfo.Length. Basically, when you acquire the lock you add the file/size value to the dictionary, and when you are done you remove it. This way you should no longer get the error, while still being able to compute the directory size.
~Pino
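A minimal sketch of the key/value table described in this answer, assuming the writing code can report how many bytes it has written so far. All names here (`PendingFileSizes`, `Report`, `Done`, `GetSize`) are hypothetical, and a ConcurrentDictionary is used because the question mentions a background thread.

```csharp
using System.Collections.Concurrent;
using System.IO;

public static class PendingFileSizes
{
    // Full path -> size as known by the code that is currently writing the file.
    private static readonly ConcurrentDictionary<string, long> Sizes =
        new ConcurrentDictionary<string, long>();

    // Call while the writer holds the file, updating as bytes are written.
    public static void Report(string fullPath, long bytesWritten)
    {
        Sizes[fullPath] = bytesWritten;
    }

    // Call after the writer has closed the file.
    public static void Done(string fullPath)
    {
        long ignored;
        Sizes.TryRemove(fullPath, out ignored);
    }

    // Prefer the writer's own number; otherwise ask the file system.
    public static long GetSize(FileInfo file)
    {
        long size;
        return Sizes.TryGetValue(file.FullName, out size) ? size : file.Length;
    }
}
```

There is still a small window between removing the entry and the next read of `file.Length`, so it seems prudent to keep a try/catch around the call to `GetSize` as well.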
I wish there was a File.ExistsAsync()
I have:
bool exists = await Task.Run(() => File.Exists(fileName));
Using a thread for this feels like an antipattern.
Is there a cleaner way?
There is no cleaner way than your solution.
The problem of race conditions aside, I believe your solution can be used in some situations.
e.g.
I have static file content in many different folders (in my case cshtml views, script files, and css files for MVC).
These files (which do not change much during application execution) are checked for in every request to the web server. Due to my application architecture, there are a lot more places where files are checked for than in the default MVC application, so much so that File.Exists takes up quite a portion of the time of each request.
So race conditions will generally not happen. The only interesting question for me is performance.
Starting a task with Task.Factory.StartNew() takes 0.002 ms (source: Why so much difference in performance between Thread and Task?)
Calling File.Exists takes "0.006255ms when the file exists and 0.010925ms when the file does not exist." [Richard Harrison]
So by simple math, calling the async File.Exists takes from 0.008 ms up to 0.012 ms.
In the best case the async File.Exists takes 1.2 times as long as File.Exists, and in the worst case 1.3 times as long. (In my case most of the searched paths do not exist, so a File.Exists call is usually close to 0.01 ms.)
So it is not that much overhead, and you can utilize multiple cores, hard disk controllers, etc. more efficiently. With these calculations you can see that by asynchronously checking for the existence of 2 files you will already have a performance increase of 1.6 in the worst case (0.02 / 0.012).
Well, I'm just saying that an async File.Exists is worth it in specific situations.
caveats of my post:
I might not have calculated everything correctly
I rounded a lot
I did not measure performance on a single PC
I took performance figures from other posts
I just added the times of File.Exists and Task.Factory.StartNew() (this may be wrong)
I disregard a lot of side effects of multithreading
Long time since this thread, but I found it today...
ExistsAsync should definitely be a thing. In fact, in UWP, you have to use async methods to find out if a file exists, as it could take longer than 50ms (anything that 'could' take longer than 50ms should be async in UWP terms).
However, this is not UWP. The reason I need it is to check whether a folder exists, which would block the UI if the folder is on a network share, a remote disk, or an idle disk. So I can put up all the messages like "checking..." but the UI wouldn't update without async (or a ViewModel, or timers, etc.).
bool exists = await Task.Run(() => File.Exists(fileName)); works perfectly. In my code, I have both (Exists and ExistsAsync) so that I can call Exists() when running on a non-UI thread and don't have to worry about the overhead.
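For reference, the wrapper discussed in this thread can be written as a small helper. This is only a sketch: `FileEx` and `ExistsAsync` are hypothetical names, not part of the framework, and the call still occupies a thread-pool thread for the duration of the check; it just keeps the calling (UI) thread free.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class FileEx
{
    // Runs the synchronous File.Exists on the thread pool so that a slow
    // network share or idle disk does not block the caller.
    public static Task<bool> ExistsAsync(string path)
    {
        return Task.Run(() => File.Exists(path));
    }
}
```

From a UI event handler this would be used as `bool exists = await FileEx.ExistsAsync(fileName);`, while code already on a background thread can keep calling File.Exists directly.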
There isn't a File.ExistsAsync probably for good reason; because it makes absolutely no sense to have one; File.Exists is not going to take very long; I tested it as 0.006255ms when the file exists and 0.010925ms when the file does not exist.
There are a few times when it is sensible to call File.Exists; however usually I think the correct solution would be to open the file (thus preventing deletion), catching any exceptions - as there is no guarantee that the file will continue to exist after the call to File.Exists.
When you want to create a new file and not overwrite the old one :
File.Open("fn", FileMode.CreateNew)
For most of the use cases I can think of File.Open() (whether for existing or create new) is going to be better because once the call succeeds you will have a handle to the file and be able to do something with it. Even when using the file's existence as a flag I think I'd still open and close it. The only time I've really used File.Exists is when checking to see if a local HTML file is there before calling the browser so I can show a nice error message when it isn't.
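The CreateNew case mentioned above can be sketched as follows. `NewFileWriter` and `TryCreateNew` are hypothetical names. The `when (File.Exists(path))` filter is a heuristic assumption used to tell "already exists" apart from other I/O errors, which are left to propagate.

```csharp
using System.IO;

public static class NewFileWriter
{
    // Returns true if this call created the file and wrote the content,
    // false if a file with that name was already there.
    public static bool TryCreateNew(string path, byte[] content)
    {
        FileStream fs;
        try
        {
            // CreateNew fails with an IOException if the file already exists,
            // so the check and the creation happen in a single step.
            fs = File.Open(path, FileMode.CreateNew, FileAccess.Write);
        }
        catch (IOException) when (File.Exists(path))
        {
            return false;
        }

        using (fs)
        {
            fs.Write(content, 0, content.Length);
        }
        return true;
    }
}
```

Because the existence check is part of the open call itself, there is no gap between "checking" and "creating" in which another process could slip in.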
There is no guarantee that something else won't delete the file after File.Exists; so if you did open it after checking with File.Exists, the open call could still fail.
In my tests, calling File.Exists on a network drive takes longer than opening the file: File.Exists takes 1.5967ms whereas File.OpenRead takes 0.3927ms.
Maybe if you could expand upon why you're doing this we'd be better able to answer; until then I'd say that you shouldn't do this
I have a friend who is in disagreement with me on this, and I'm just looking to get some feedback as to who is right and wrong in this situation.
FileInfo file = ...;
if (file.Exists)
{
//File somehow gets deleted
//Attempt to do stuff with file...
}
The problem my friend points out is that, "so what if the file exists when I check for existence? There is nothing to guard against the chance that right after the check the file gets deleted and attempts to access it result in an exception. So, is it even worth it to check for existence before-hand?"
The only thing I could come up with is that MSDN clearly does a check in their examples, so there must be more to it. MSDN - FileInfo. But, it does have me wondering... is the extra call even worth it?
I would have both if (file.Exists) and a try catch. Relying only on exception handling does not express explicitly what you have in mind. if (file.Exists) is self-explaining.
If someone deletes the file just in that millisecond between checking and working with the file, you can still get an exception. Nevertheless, there are also other conditions, which can lead to an exception: The file is read-only; you do not have the requested security permissions, and more.
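A small sketch of combining the two, assuming an optional file whose absence is a normal case. `OptionalFileReader` and `ReadOrDefault` are hypothetical names, and the fallback behaviour is an assumption for the example.

```csharp
using System.IO;

public static class OptionalFileReader
{
    // Returns the file's text, or 'fallback' if the file is not there.
    public static string ReadOrDefault(string path, string fallback)
    {
        // Expresses the intent: a missing file is an expected situation.
        if (!File.Exists(path))
        {
            return fallback;
        }

        try
        {
            return File.ReadAllText(path);
        }
        catch (FileNotFoundException)
        {
            // Deleted in the gap between the check and the read.
            return fallback;
        }
        catch (DirectoryNotFoundException)
        {
            // The containing folder vanished in the same gap.
            return fallback;
        }
    }
}
```

Other failures, such as missing permissions or a locked file, are deliberately not swallowed here, since they are not the "file is simply absent" case and the caller probably wants to know about them.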
I agree with your friend here for the most part (context depending on whether or not you have withheld pertinent information from your question). This is an example of an exception that can occur outside of your magnificent code. Checking for the existence of the file and performing your operation is a race condition.
The fact is that this exception can occur and there is NOTHING you can do to prevent it. You must catch it. It's completely out of your control. For example, what if the network goes down, lightning strikes your datacenter and it catches on fire, or a squirrel chews thru the cables? While it's not practical to try and figure out every single way in which the code will raise an exception, it is a good practice to do your best in situations where you know it's a good possibility and do your best to handle it.
I would say this depends on the context. If the file was just created and then this process ran, then it doesn't make sense to check if it exists. You can assume that it does because the code is still executing.
However, if this is a file that is continuously deleted and created, then yes, it does make sense to ensure it exists before continuing.
Another factor is who or what is accessing the file. If there are multiple clients accessing the file, then there is a greater chance of the file being modified or removed, and therefore it would make sense to check if the file exists.
We have some C# code that reads data from a text file using a StreamReader. On one computer we can read data from the text file even after it has been deleted or replaced with a different text file - the File.Exists call reports that the file exists even when it doesn't in Windows Explorer. However, on another computer this behaviour doesn't happen. Both computers are running Vista Business and .NET 2.0.50727 SP2.
We have tried restarting the machine without a resolution.
Does anyone have any understanding on how this could be possible and information about possible solutions?
Thanks,
Alan
From MSDN
The Exists method should not be used for path validation, this method merely checks if the file specified in path exists.
Be aware that another process can potentially do something with the file in between the time you call the Exists method and perform another operation on the file, such as Delete. A recommended programming practice is to wrap the Exists method, and the operations you take on the file, in a try...catch block as shown in the example. This helps to narrow the scope for potential conflicts. The Exists method can only help to ensure that the file will be available, it cannot guarantee it.
Could this be a folder virtualization issue?
Is the file being opened for reading before it's being deleted? If it is, it's not unexpected to still be able to read from the opened file even after the filesystem has otherwise let it go.
RE: File.Exists():
File.Exists is inherently prone to race-conditions. It should not be used as the exclusive manner to verify that a file does or doesn't exist before performing some operation. This mistake can frequently result in a security flaw within your software.
Rather, always handle the exceptions that can be thrown from your actual file operations that open, etc, and verify your input once it's open.
Is there a way to bypass or remove the file lock held by another thread without killing the thread?
I am using a third-party library in my app that is performing read-only operations on a file. I need a second thread read the file at the same time to extract out some extra data the third-party library is not exposing. Unfortunately, the third-party library opened the file using a Read/Write lock and hence I am getting the usual "The process cannot access the file ... because it is being used by another process" exception.
I would like to avoid pre-loading the entire file with my thread because the file is large and would cause unwanted delays in the loading of this file and excess memory usage. Copying the file is not practical due to the size of the files. During normal operation, two threads hitting the same file would not cause any significant IO contention/performance problems. I don't need perfect time-synchronization between the two threads, but they need to be reading the same data within a half second of each other.
I cannot change the third-party library.
Are there any work-arounds to this problem?
If you start messing with the underlying file handle you may be able to unlock portions, the trouble is that the thread accessing the file is not designed to handle this kind of tampering and may end up crashing.
My strong recommendation would be to patch the third party library, anything you do can and probably will blow up in real world conditions.
In short, you cannot do anything about the locking of the file by a third-party. You can get away with Richard E's answer above that mentions the utility Unlocker.
Once the third-party opens a file and sets the lock on it, the underlying system will give that third-party a lock to ensure no other process can access it. There are two trains of thought on this.
1. Using DLL injection to patch the code so that it explicitly sets or unsets the lock. This can be dangerous, as you would be messing with another process's stability and could end up crashing the process and causing grief. Think about it: the underlying system keeps track of the files opened by a process. Injecting a DLL at that point and patching the code requires the technical knowledge to determine which process to inject into at run time and to alter the flags upon intercepting the Win32 API call OpenFile(...).
2. Since this was tagged as .NET, why not disassemble the third-party library into .il files, alter the flag for the lock to shared, and rebuild the library by reassembling all the .il files into a DLL? This, of course, would require rooting around in the code to find the class where the file is opened.
Have a look at the podcast here. And have a look here that explains how to do the second option highlighted above, here.
Hope this helps,
Best regards,
Tom.
This doesn't address your situation directly, but a tool like Unlocker achieves what you're trying to do, only via a Windows UI.
Any low-level hack to do this may result in a thread crashing, file corruption, etc.
Hence I thought I'd mention the next best thing, just wait your turn and poll until the file is not locked: https://stackoverflow.com/a/11060322/495455
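A hedged sketch of the "wait your turn and poll" idea. `LockedFileOpener`, `OpenWhenAvailable`, and the retry parameters are hypothetical. Note that this only helps if the other party eventually releases the file or opened it with a compatible share mode; it does not bypass a lock that is held for the whole session.

```csharp
using System;
using System.IO;
using System.Threading;

public static class LockedFileOpener
{
    // Tries to open the file for shared reading, retrying while it is in use.
    // The last failed attempt lets the IOException propagate to the caller.
    public static FileStream OpenWhenAvailable(string path, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                // FileShare.ReadWrite tolerates other readers and writers.
                return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
            }
            catch (FileNotFoundException)
            {
                throw; // waiting will not make a missing file appear
            }
            catch (DirectoryNotFoundException)
            {
                throw;
            }
            catch (IOException) when (attempt < maxAttempts)
            {
                // Most likely a sharing violation: wait and try again.
                Thread.Sleep(delay);
            }
        }
    }
}
```

A call such as `LockedFileOpener.OpenWhenAvailable(path, 10, TimeSpan.FromMilliseconds(50))` would give up after roughly half a second, which matches the tolerance mentioned in the question.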
I don't think this second piece of advice will help, but the closest thing (that I know of) would be DemandReadFileIO:
IntSecurity.DemandReadFileIO(filename);
internal static void DemandReadFileIO(string fileName)
{
string full = fileName;
full = UnsafeGetFullPath(fileName);
new FileIOPermission(FileIOPermissionAccess.Read, full).Demand();
}
I do think that this is a problem that can be solved with C++. It is annoying, but at least it works (as discussed here: win32 C/C++ read data from a "locked" file)
The steps are:
Open the file before the third-party library does, using _fsopen with the _SH_DENYNO flag
Open the file with the third-party library
Read the file within your code
You may be interested in these links as well:
Calling c++ from c# (Possible to call C++ code from C#?)
The inner link from this post with a sample (http://blogs.msdn.com/b/borisj/archive/2006/09/28/769708.aspx)
Have you tried making a dummy copy of the file before your third-party library gets hold of it, then using that copy for your manipulations? Logically this would only be an option if the file we are talking about is fairly small, and it is kind of a cheat :) Good luck.
If the file is locked and isn't being used, then you have a problem with the way your file locking/unlocking mechanism works. You should only lock a file when you are modifying it, and should then immediately unlock it to avoid situations like this.