I am trying to write a high-performance file system searcher that can search unindexed drives (both local and network) very quickly, filtering on extensions and keywords. I am trying to achieve this using C#'s DirectoryInfo.EnumerateDirectories(), DirectoryInfo.EnumerateFiles() and LINQ queries. From my testing, this is (by far) the best-performing code I could find:
FileInfo[] dirFiles = dirInfo.EnumerateDirectories()
    .AsParallel()
    .SelectMany(di => di.EnumerateFiles("*.*", SearchOption.AllDirectories)
        .Where(fi => EndsWithExtension(fi.Extension)))
    .ToArray();
However, an UnauthorizedAccessException is not handled and, when thrown, aborts the entire query.
I've tried the various approaches to this issue outlined on SO, but I found them significantly slower in search performance. The second-best method I got working, for example, is over 20 times slower:
try {
foreach (string fileName in EnumerateFiles(dirInfo, "*.*", SearchOption.AllDirectories)) {
if (ContainsKeyword(fileName)) {
Results.Add(fileName.FullName);
}
}
} catch (Exception e) { continue; }
I would like to skip over a directory when it throws an exception. I've been trying to achieve this with something similar to the following, but I can't get it to work (my knowledge of LINQ and enumerables is too limited...):
FileInfo[] dirFiles = dirInfo.EnumerateDirectories()
.AsParallel()
.SelectMany(di => di.EnumerateFiles("*.*", SearchOption.AllDirectories)
.SkipExceptions()
.Where(fi => EndsWithExtension(fi.Extension)) )
.ToArray();
public static class Extensions {
public static IEnumerable<T> SkipExceptions<T>(this IEnumerable<T> values) {
using (var enumerator = values.GetEnumerator()) {
bool next = true;
while (next) {
try {
if (enumerator.Current != null)
Console.WriteLine(enumerator.Current.ToString());
next = enumerator.MoveNext();
} catch {
continue;
}
if (next) yield return enumerator.Current;
}
}
}
}
Is it possible to handle (UnauthorizedAccess) exceptions while remaining as performant as the "raw" LINQ query?
Thanks in advance for your help!
Answer EDITED:
A workaround is to recurse manually instead of using SearchOption.AllDirectories. This is actually more efficient in your case because you don't need to load every file in the filesystem into an array. Start with the following helper methods:
List<string> GetDirectoriesRecursive (string parent)
{
var directories = new List<string>();
GetDirectoriesRecursive (directories, parent);
return directories;
}
void GetDirectoriesRecursive (List<string> directories, string parent)
{
directories.Add (parent);
foreach (string child in GetAuthorizedDirectories (parent))
GetDirectoriesRecursive (directories, child);
}
string[] GetAuthorizedDirectories (string dir)
{
try { return Directory.GetDirectories (dir); }
catch (UnauthorizedAccessException) { return new string[0]; }
}
string[] GetAuthorizedFiles (string dir)
{
try { return Directory.GetFiles (dir); }
catch (UnauthorizedAccessException) { return new string[0]; }
}
Then, to get the big files:
var bigFiles =
from dir in GetDirectoriesRecursive (@"c:\")
from file in GetAuthorizedFiles (dir)
where new FileInfo (file).Length > 100000000
select file;
Or, to get just their directories:
var foldersWithBigFiles =
from dir in GetDirectoriesRecursive (@"c:\")
where GetAuthorizedFiles (dir).Any (f => new FileInfo (f).Length > 100000000 )
select dir;
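If building the whole directory list up front is a concern, the same idea can probably be written as a lazy iterator. The following is only a sketch: LazyDirectoryWalker and EnumerateAuthorizedDirectories are made-up names, and the extra IOException catch is my assumption about what else might go wrong (for example a path that is too long or a folder that disappears mid-scan).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LazyDirectoryWalker
{
    // Hypothetical helper: yields the root and every reachable descendant
    // directory one at a time, skipping subtrees that cannot be listed.
    public static IEnumerable<string> EnumerateAuthorizedDirectories(string root)
    {
        var pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            yield return current;

            string[] children;
            try
            {
                children = Directory.GetDirectories(current);
            }
            catch (UnauthorizedAccessException) { continue; } // no permission: skip subtree
            catch (IOException) { continue; }                 // assumption: also skip I/O failures

            foreach (string child in children)
                pending.Push(child);
        }
    }
}
```

It should drop into the queries above in place of GetDirectoriesRecursive, with the difference that directories are produced as the query consumes them rather than collected into a List first.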
ANOTHER APPROACH:
string[] directories = Directory.EnumerateDirectories(@"\\testnetwork\abc$", "*.*", SearchOption.AllDirectories).Catch(typeof(UnauthorizedAccessException)).ToArray();
ADDED missing part:
static class ExceptionExtensions
{
    public static IEnumerable<TIn> Catch<TIn>(
        this IEnumerable<TIn> source,
        Type exceptionType)
    {
        using (var e = source.GetEnumerator())
            while (true)
            {
                var ok = false;
                try
                {
                    ok = e.MoveNext();
                }
                catch (Exception ex)
                {
                    // Swallow only the requested exception type; rethrow anything else.
                    if (ex.GetType() != exceptionType)
                        throw;
                    continue;
                }
                if (!ok)
                    yield break;
                yield return e.Current;
            }
    }
}
Related
I am trying to display a list of all files found in the selected directory (and optionally any subdirectories). The problem I am having is that when the GetFiles() method comes across a folder that it cannot access, it throws an exception and the process stops.
How do I ignore this exception (and ignore the protected folder/file) and continue adding accessible files to the list?
try
{
if (cbSubFolders.Checked == false)
{
string[] files = Directory.GetFiles(folderBrowserDialog1.SelectedPath);
foreach (string fileName in files)
ProcessFile(fileName);
}
else
{
string[] files = Directory.GetFiles(folderBrowserDialog1.SelectedPath, "*.*", SearchOption.AllDirectories);
foreach (string fileName in files)
ProcessFile(fileName);
}
lblNumberOfFilesDisplay.Enabled = true;
}
catch (UnauthorizedAccessException) { }
finally {}
You will have to do the recursion manually; don't use AllDirectories - look one folder at a time, then try getting the files from sub-dirs. Untested, but something like below (note uses a delegate rather than building an array):
using System;
using System.IO;
static class Program
{
static void Main()
{
string path = ""; // TODO
ApplyAllFiles(path, ProcessFile);
}
static void ProcessFile(string path) {/* ... */}
static void ApplyAllFiles(string folder, Action<string> fileAction)
{
foreach (string file in Directory.GetFiles(folder))
{
fileAction(file);
}
foreach (string subDir in Directory.GetDirectories(folder))
{
try
{
ApplyAllFiles(subDir, fileAction);
}
catch
{
// swallow, log, whatever
}
}
}
}
Since .NET Standard 2.1 (.NET Core 3+, .NET 5+), you can now just do:
var filePaths = Directory.EnumerateFiles(@"C:\my\files", "*.xml", new EnumerationOptions
{
    IgnoreInaccessible = true,
    RecurseSubdirectories = true
});
According to the MSDN docs about IgnoreInaccessible:
Gets or sets a value that indicates whether to skip files or directories when access is denied (for example, UnauthorizedAccessException or SecurityException). The default is true.
Default value is actually true, but I've kept it here just to show the property.
The same overload is available for DirectoryInfo as well.
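For the original "extension plus keyword" search, this overload might be combined with a LINQ filter roughly as follows. Treat it as a sketch: KeywordSearch and Search are names I made up, and matching the keyword against the file name only (case-insensitively) is an assumption about what "keyword" means here.

```csharp
using System;
using System.IO;
using System.Linq;

static class KeywordSearch
{
    // Hypothetical helper: returns files under root that have the given
    // extension (e.g. ".txt") and whose file name contains the keyword.
    public static string[] Search(string root, string extension, string keyword)
    {
        var options = new EnumerationOptions
        {
            IgnoreInaccessible = true,   // skip folders we are not allowed to read
            RecurseSubdirectories = true
        };

        return Directory.EnumerateFiles(root, "*" + extension, options)
            .Where(path => Path.GetFileName(path)
                .IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToArray();
    }
}
```

Usage would be something like KeywordSearch.Search(@"C:\my\files", ".xml", "invoice"). I have not measured how this compares with the PLINQ query from the question.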
This simple function works well and meets the question's requirements.
private List<string> GetFiles(string path, string pattern)
{
var files = new List<string>();
var directories = new string[] { };
try
{
files.AddRange(Directory.GetFiles(path, pattern, SearchOption.TopDirectoryOnly));
directories = Directory.GetDirectories(path);
}
catch (UnauthorizedAccessException) { }
foreach (var directory in directories)
try
{
files.AddRange(GetFiles(directory, pattern));
}
catch (UnauthorizedAccessException) { }
return files;
}
A simple way to do this is to use a List for the files and a Queue for the directories still to visit.
This conserves memory.
A recursive implementation of the same task could throw an OutOfMemoryException (or overflow the stack) on a very deep directory tree.
The output: the files added to the List are ordered top to bottom through the directory tree (breadth first).
public static List<string> GetAllFilesFromFolder(string root, bool searchSubfolders) {
Queue<string> folders = new Queue<string>();
List<string> files = new List<string>();
folders.Enqueue(root);
while (folders.Count != 0) {
string currentFolder = folders.Dequeue();
try {
string[] filesInCurrent = System.IO.Directory.GetFiles(currentFolder, "*.*", System.IO.SearchOption.TopDirectoryOnly);
files.AddRange(filesInCurrent);
}
catch {
// Do Nothing
}
try {
if (searchSubfolders) {
string[] foldersInCurrent = System.IO.Directory.GetDirectories(currentFolder, "*.*", System.IO.SearchOption.TopDirectoryOnly);
foreach (string _current in foldersInCurrent) {
folders.Enqueue(_current);
}
}
}
catch {
// Do Nothing
}
}
return files;
}
Steps:
Enqueue the root in the queue.
In a loop, dequeue a folder, add the files in that directory to the list, and add its subfolders to the queue.
Repeat until the queue is empty.
See https://stackoverflow.com/a/10728792/89584 for a solution that handles the UnauthorizedAccessException problem.
All the solutions above will miss files and/or directories if any calls to GetFiles() or GetDirectories() are on folders with a mix of permissions.
Here's a full-featured, .NET 2.0-compatible implementation.
You can even alter the yielded List of files to skip over directories in the FileSystemInfo version!
(Beware null values!)
public static IEnumerable<KeyValuePair<string, string[]>> GetFileSystemInfosRecursive(string dir, bool depth_first)
{
foreach (var item in GetFileSystemInfosRecursive(new DirectoryInfo(dir), depth_first))
{
string[] result;
var children = item.Value;
if (children != null)
{
result = new string[children.Count];
for (int i = 0; i < result.Length; i++)
{ result[i] = children[i].Name; }
}
else { result = null; }
string fullname;
try { fullname = item.Key.FullName; }
catch (IOException) { fullname = null; }
catch (UnauthorizedAccessException) { fullname = null; }
yield return new KeyValuePair<string, string[]>(fullname, result);
}
}
public static IEnumerable<KeyValuePair<DirectoryInfo, List<FileSystemInfo>>> GetFileSystemInfosRecursive(DirectoryInfo dir, bool depth_first)
{
var stack = depth_first ? new Stack<DirectoryInfo>() : null;
var queue = depth_first ? null : new Queue<DirectoryInfo>();
if (depth_first) { stack.Push(dir); }
else { queue.Enqueue(dir); }
for (var list = new List<FileSystemInfo>(); (depth_first ? stack.Count : queue.Count) > 0; list.Clear())
{
dir = depth_first ? stack.Pop() : queue.Dequeue();
FileSystemInfo[] children;
try { children = dir.GetFileSystemInfos(); }
catch (UnauthorizedAccessException) { children = null; }
catch (IOException) { children = null; }
if (children != null) { list.AddRange(children); }
yield return new KeyValuePair<DirectoryInfo, List<FileSystemInfo>>(dir, children != null ? list : null);
if (depth_first) { list.Reverse(); }
foreach (var child in list)
{
var asdir = child as DirectoryInfo;
if (asdir != null)
{
if (depth_first) { stack.Push(asdir); }
else { queue.Enqueue(asdir); }
}
}
}
}
This should answer the question. I've ignored the issue of going through subdirectories; I'm assuming you have that figured out.
Of course, you don't need a separate method for this, but you might find it a useful place to also verify that the path is valid and to deal with the other exceptions you could encounter when calling GetFiles().
Hope this helps.
private string[] GetFiles(string path)
{
string[] files = null;
try
{
files = Directory.GetFiles(path);
}
catch (UnauthorizedAccessException)
{
// might be nice to log this, or something ...
}
return files;
}
private void Processor(string path, bool recursive)
{
// leaving the recursive directory navigation out.
string[] files = this.GetFiles(path);
if (null != files)
{
foreach (string file in files)
{
this.Process(file);
}
}
else
{
// again, might want to do something when you can't access the path?
}
}
I prefer using C# framework functions, but the function I need will only be included in .NET 5.0, so I have to write it myself.
// search file in every subdirectory ignoring access errors
static List<string> list_files(string path)
{
List<string> files = new List<string>();
// add the files in the current directory
try
{
string[] entries = Directory.GetFiles(path);
foreach (string entry in entries)
files.Add(entry); // Directory.GetFiles already returns paths that include the directory
}
catch
{
// an exception in directory.getfiles is not recoverable: the directory is not accessible
}
// follow the subdirectories
try
{
string[] entries = Directory.GetDirectories(path);
foreach (string entry in entries)
{
string current_path = entry; // Directory.GetDirectories already returns paths that include the parent
List<string> files_in_subdir = list_files(current_path);
foreach (string current_file in files_in_subdir)
files.Add(current_file);
}
}
catch
{
// an exception in directory.getdirectories is not recoverable: the directory is not accessible
}
return files;
}
I am creating a file/folder indexing Windows Forms application in VS2010 (an assignment for college). For testing purposes, the file/folder indexing class is in a console application.
I use the following to walk the folders; it runs fine and writes all the folder names on the specified drive to a file. I threw this together mainly from the MSDN resources (using the recursive method) and modified it, since the original didn't include getting folder names.
I want to exclude certain folders, and decided that a lambda expression and a List of folder names would be fastest. I could just use a loop that goes through an array with an if comparison, but to my mind that would be slower (not that I understand enough about the intricate workings of C#). I've had a brief look at lambda expressions to see whether I can fix it myself.
Here is my code working without any folder exclusion
class Program
{
static System.Collections.Specialized.StringCollection log = new System.Collections.Specialized.StringCollection();
private static List<string> _excludedDirectories = new List<string>() { "Windows", "AppData", "$WINDOWS.~BT", "MSOCache", "ProgramData", "Config.Msi", "$Recycle.Bin", "Recovery", "System Volume Information", "Documents and Settings", "Perflogs" };
//method to check
static bool isExcluded(List<string> exludedDirList, string target)
{
return exludedDirList.Any(d => new DirectoryInfo(target).Name.Equals(d));
}
static void Main()
{
string[] drives = {"C:\\"};
foreach (string dr in drives)
{
DriveInfo di = new System.IO.DriveInfo(dr);
// Here we skip the drive if it is not ready to be read.
if (di.IsReady)
{
DirectoryInfo rootDir = di.RootDirectory;
WalkDirectoryTree(rootDir);
}
else
{
Console.WriteLine("The drive {0} could not be read", di.Name);
continue;
}
}
// Write out all the files that could not be processed.
Console.WriteLine("Files with restricted access:");
foreach (string s in log)
{
Console.WriteLine(s);
}
// Keep the console window open in debug mode.
Console.WriteLine("Press any key");
Console.ReadKey();
}
static void WalkDirectoryTree(System.IO.DirectoryInfo root)
{
FileInfo[] files = null;
DirectoryInfo[] subDirs = null;
StreamWriter filex = new System.IO.StreamWriter("test.txt", true);
if (filex != null)
{
filex.Close();
}
// Process all the folders directly under the root
try
{
subDirs = root.GetDirectories();
}// This is thrown if even one of the folders requires permissions greater than the application provides.
catch (UnauthorizedAccessException e)
{
log.Add(e.Message);
}
catch (System.IO.DirectoryNotFoundException e)
{
Console.WriteLine(e.Message);
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
// Process all the files directly under the root
try
{
files = root.GetFiles("*.*");
}// This is thrown if even one of the files requires permissions greater than the application provides.
catch (UnauthorizedAccessException e)
{
log.Add(e.Message);
}
catch (System.IO.DirectoryNotFoundException e)
{
Console.WriteLine(e.Message);
}
catch(Exception e)
{
Console.WriteLine(e.Message);
}
if (files != null)
{
filex = new StreamWriter("test.txt", true);
foreach (FileInfo fi in files)
{
// In this example, we only access the existing FileInfo object. If we
// want to open, delete or modify the file, then
// a try-catch block is required here to handle the case
// where the file has been deleted since the call to TraverseTree().
Console.WriteLine(fi.FullName);
filex.WriteLine(fi.FullName);
}
filex.Close();
}
if (subDirs != null)
{
//var filteredDirs = Directory.GetDirectories(root.Name).Where(d => !isExcluded(_excludedDirectories, d));
foreach (DirectoryInfo subds in subDirs)
{
filex = new StreamWriter("test.txt", true);
Console.WriteLine(subds.FullName);
filex.WriteLine(subds.FullName);
filex.Close();
foreach (DirectoryInfo dirInfo in subDirs)
{
// Recursive call for each subdirectory.
WalkDirectoryTree(dirInfo);
}
}
filex.Close();// Because at end filestream needs to close
}
}
}
So I tried incorporating the .Where(d => !isExcluded(_excludedDirectories, d)) into my loop:
if (subDirs != null)
{
//var filteredDirs = Directory.GetDirectories(root.Name).Where(d => !isExcluded(_excludedDirectories, d));
foreach (DirectoryInfo subds in subDirs.Where(d => !isExcluded(_excludedDirectories, d)))
{
filex = new StreamWriter("test.txt", true);
Console.WriteLine(subds.FullName);
filex.WriteLine(subds.FullName);
filex.Close();
foreach (DirectoryInfo dirInfo in subDirs)
{
// Recursive call for each subdirectory.
WalkDirectoryTree(dirInfo);
}
}
filex.Close();// Because at end filestream needs to close
}
PROBLEM: I get an error just after the exclamation mark saying "The best overloaded method match has some invalid args...". What should I do or change? Should I take the simpler route and use a loop with an if statement in the loop that writes the folder names? I understand how to do that as well. Remember that I'm doing it (or trying to) this way because I thought it would be more optimised/faster. If it doesn't make much of a difference, let me know and I will use the way I know.
My guess is that I'm doing a bad thing by putting a .Where there in the foreach, and I realise why it is or might be.
I've also tried:
if (subDirs != null)
{
//var filteredDirs = Directory.GetDirectories(root.Name).Where(d => !isExcluded(_excludedDirectories, d));
foreach (DirectoryInfo subds in subDirs)
{
if ((d => !isExcluded(_excludedDirectories, d)))
{
filex = new StreamWriter("test.txt", true);
Console.WriteLine(subds.FullName);
filex.WriteLine(subds.FullName);
filex.Close();
foreach (DirectoryInfo dirInfo in subDirs)
{
// Recursive call for each subdirectory.
WalkDirectoryTree(dirInfo);
}
}
}
filex.Close();// Because at end filestream needs to close
}
but I get an error saying it cannot convert the lambda expression to type bool because it is not a delegate type.
Let me know if you want to see the other code and I will add it; it just seems a bit much.
d is not a string here, it's a DirectoryInfo. Change your isExcluded method signature to deal with the type of d properly.
Your signature is:
static bool isExcluded(List<string> exludedDirList, string target)
It should be:
static bool isExcluded(List<string> exludedDirList, DirectoryInfo target)
And your method will end up being:
//method to check
static bool isExcluded(List<string> exludedDirList, DirectoryInfo target)
{
return exludedDirList.Any(d => target.Name.Equals(d));
}
The problem is here:
foreach (DirectoryInfo subds in subDirs.Where(d => !isExcluded(_excludedDirectories, d)))
subDirs is an array of DirectoryInfo, but your isExcluded takes a string as its second argument.
You want:
foreach (DirectoryInfo subds in subDirs.Where(d => !isExcluded(_excludedDirectories, d.Name)))
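On the "which is faster" part of the question: for a handful of names the difference is unlikely to matter, but if the exclusion list grows, a HashSet lookup avoids scanning the list for every folder. A minimal sketch, where DirectoryFilter and IsExcluded are illustrative names and the case-insensitive comparison is my assumption (Windows folder names are not case-sensitive):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DirectoryFilter
{
    // A few of the names from the question; extend as needed.
    private static readonly HashSet<string> ExcludedNames =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "Windows", "AppData", "$Recycle.Bin", "System Volume Information"
        };

    // Returns true when the directory's own name (not its full path) is excluded.
    public static bool IsExcluded(DirectoryInfo target)
    {
        return ExcludedNames.Contains(target.Name);
    }
}
```

The loop then becomes foreach (DirectoryInfo subds in subDirs.Where(d => !DirectoryFilter.IsExcluded(d))).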
I am writing a program to back up particular files with particular extensions, so I start with a List or array of the extensions I want to back up:
List<string> extensions = new List<string>();
extensions.Add("*.pdf");
extensions.Add("*.txt");
extensions.Add("*.inf");
extensions.Add("*.doc");
extensions.Add("*.cpp");
extensions.Add("*.cs");
extensions.Add("*.vb");
OK, but how can I build the search so that it finds the files with those extensions in folders?
The search itself is simple:
public void DirSearch(string sDir)
{
try
{
foreach (string d in Directory.GetDirectories(sDir))
{
foreach (string f in Directory.GetFiles(d, "*.pdf"))
{
Console.WriteLine(f);
}
DirSearch(d);
}
}
catch (System.Exception excpt)
{
Console.WriteLine(excpt.Message);
}
}
OK, but how can I make this search for all the extensions in the list (as fast as possible), and keep the program from entering the Windows folder if I set sDir = "C:\"?
To simply extend what you have you're going to need another loop, one that iterates the extensions and calls GetFiles based on the current, though "to make a most rapid system" is highly ambitious at this level. Anyway,...
foreach (string d in Directory.GetDirectories(sDir)) {
foreach (string e in extensions) {
foreach (string f in Directory.GetFiles(d, e)) {
}
}
}
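One way to cover both parts of the question (every pattern in the list, and staying out of the Windows folder) might look like the sketch below. BackupSearch, FindFiles and the skippedNames parameter are made-up names, and silently skipping folders that throw is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class BackupSearch
{
    // Hypothetical helper: collects files matching any pattern (e.g. "*.pdf")
    // under dir, without descending into folders whose name is in skippedNames.
    public static List<string> FindFiles(string dir, IList<string> patterns, ISet<string> skippedNames)
    {
        var results = new List<string>();
        Collect(dir, patterns, skippedNames, results);
        return results;
    }

    private static void Collect(string dir, IList<string> patterns, ISet<string> skippedNames, List<string> results)
    {
        try
        {
            foreach (string pattern in patterns)
                results.AddRange(Directory.GetFiles(dir, pattern));

            foreach (string sub in Directory.GetDirectories(dir))
            {
                // Path.GetFileName gives the last path segment, i.e. the folder's name.
                if (skippedNames.Contains(Path.GetFileName(sub)))
                    continue;
                Collect(sub, patterns, skippedNames, results);
            }
        }
        catch (UnauthorizedAccessException) { } // assumption: skip folders we cannot read
        catch (IOException) { }                 // assumption: skip folders that fail to list
    }
}
```

Because each call handles its own exceptions, one protected folder only loses that folder rather than the whole search. Passing new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Windows" } as skippedNames keeps the search out of C:\Windows.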
Use the GetFiles overload that takes SearchOption.AllDirectories so you don't need to recurse yourself, and use LINQ's SelectMany:
var result = extensions.SelectMany(e =>
Directory.GetFiles(sDir, e, SearchOption.AllDirectories));
Update: To ignore protected folders, you can use try/catch to skip the exception:
private string[] GetFiles(string directory, string pattern)
{
try
{
return Directory.GetFiles(directory, pattern,
SearchOption.AllDirectories);
}
catch (Exception)
{
return new string[0];
}
}
So:
var result = extensions.SelectMany(e => GetFiles(sDir, e));
You don't need an extra loop or recursion; just use the Directory.GetFiles overload that returns all file names recursively:
public void DirSearch(string sDir)
{
try
{
foreach (string item in extensions)
{
string[] files = Directory.GetFiles(sDir, item, SearchOption.AllDirectories);
foreach (var file in files)
{
Console.WriteLine(file);
}
}
}
catch (System.Exception excpt)
{
Console.WriteLine(excpt.Message);
}
}
Directory.GetFiles accepts only one search pattern per call; a semicolon-separated string such as
*.vb;*.txt
is treated as a single pattern rather than as a list. So iterate over the list and call GetFiles once per pattern:
foreach (string pattern in extensions)
{
    foreach (string f in Directory.GetFiles(d, pattern))
    {
        ...
    }
}
It may be slightly more performant to call GetFiles(string Directory) and get the results back into a single list, and then parse that. The following snippet should do what you need...
var extensions = new List<string> { ".pdf", ".txt", ".inf", ".doc", ".cpp", ".cs", ".vb" };
var files = Directory.GetFiles(topLevelFolder, "*.*", SearchOption.AllDirectories);
var matching = new List<string>();
foreach (var ext in extensions)
{
matching.AddRange(files.Where(f => f.EndsWith(ext)));
}
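The loop above walks the whole file list once per extension and compares case-sensitively. A single pass with a set lookup on Path.GetExtension is one possible alternative. This is a sketch only; PathFilter and FilterByExtension are names I made up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class PathFilter
{
    // Hypothetical helper: keeps the paths whose extension (including the
    // leading dot, e.g. ".pdf") is in the given set, ignoring case.
    public static List<string> FilterByExtension(IEnumerable<string> paths, IEnumerable<string> extensions)
    {
        var wanted = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);
        return paths.Where(p => wanted.Contains(Path.GetExtension(p))).ToList();
    }
}
```

With the variables from the snippet above, var matching = PathFilter.FilterByExtension(files, extensions); would replace the foreach.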
I have recently had a need to Enumerate an entire file system looking for specific types of files for auditing purposes. This has caused me to run into several exceptions due to having limited permissions on the file system to be scanned. Among them, the most prevalent have been UnauthorizedAccessException and much to my chagrin, PathTooLongException.
These would not normally be an issue except that they invalidate the IEnumerable, preventing me from being able to complete the scan.
In order to solve this problem, I have created a replacement File System Enumerator. Although it may not be perfect, it performs fairly quickly and traps the two exceptions that I have run into. It will find any directories or files that match the search pattern passed to it.
// This code is public domain
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using log4net;
public class FileSystemEnumerable : IEnumerable<FileSystemInfo>
{
private ILog _logger = LogManager.GetLogger(typeof(FileSystemEnumerable));
private readonly DirectoryInfo _root;
private readonly IList<string> _patterns;
private readonly SearchOption _option;
public FileSystemEnumerable(DirectoryInfo root, string pattern, SearchOption option)
{
_root = root;
_patterns = new List<string> { pattern };
_option = option;
}
public FileSystemEnumerable(DirectoryInfo root, IList<string> patterns, SearchOption option)
{
_root = root;
_patterns = patterns;
_option = option;
}
public IEnumerator<FileSystemInfo> GetEnumerator()
{
if (_root == null || !_root.Exists) yield break;
IEnumerable<FileSystemInfo> matches = new List<FileSystemInfo>();
try
{
_logger.DebugFormat("Attempting to enumerate '{0}'", _root.FullName);
foreach (var pattern in _patterns)
{
_logger.DebugFormat("Using pattern '{0}'", pattern);
matches = matches.Concat(_root.EnumerateDirectories(pattern, SearchOption.TopDirectoryOnly))
.Concat(_root.EnumerateFiles(pattern, SearchOption.TopDirectoryOnly));
}
}
catch (UnauthorizedAccessException)
{
_logger.WarnFormat("Unable to access '{0}'. Skipping...", _root.FullName);
yield break;
}
catch (PathTooLongException ptle)
{
_logger.Warn(string.Format(@"Could not process path '{0}\{1}'.", _root.Parent.FullName, _root.Name), ptle);
yield break;
} catch (System.IO.IOException e)
{
// "The symbolic link cannot be followed because its type is disabled."
// "The specified network name is no longer available."
_logger.Warn(string.Format(@"Could not process path (check SymlinkEvaluation rules)'{0}\{1}'.", _root.Parent.FullName, _root.Name), e);
yield break;
}
_logger.DebugFormat("Returning all objects that match the pattern(s) '{0}'", string.Join(",", _patterns));
foreach (var file in matches)
{
yield return file;
}
if (_option == SearchOption.AllDirectories)
{
_logger.DebugFormat("Enumerating all child directories.");
foreach (var dir in _root.EnumerateDirectories("*", SearchOption.TopDirectoryOnly))
{
_logger.DebugFormat("Enumerating '{0}'", dir.FullName);
var fileSystemInfos = new FileSystemEnumerable(dir, _patterns, _option);
foreach (var match in fileSystemInfos)
{
yield return match;
}
}
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
The usage is fairly simple.
//This code is public domain
var root = new DirectoryInfo(@"c:\wherever");
var searchPattern = @"*.txt";
var searchOption = SearchOption.AllDirectories;
var enumerable = new FileSystemEnumerable(root, searchPattern, searchOption);
People are free to use it if they find it useful.
Here's another way, manage your own enumeration iteration:
IEnumerator<string> errFiles=Directory.EnumerateFiles(baseDir, "_error.txt", SearchOption.AllDirectories).GetEnumerator();
while (true)
{
try
{
if (!errFiles.MoveNext())
break;
string errFile = errFiles.Current;
// processing
} catch (Exception e)
{
log.Warn("Ignoring error finding in: " + baseDir, e);
}
}