Currently I am using the following code to search for files in a folder:
public string[] getFiles(string SourceFolder, string Filter,System.IO.SearchOption searchOption)
{
// ArrayList will hold all file names
ArrayList alFiles = new ArrayList();
// Create an array of filter string
string[] MultipleFilters = Filter.Split('|');
// for each filter find matching file names
foreach (string FileFilter in MultipleFilters)
{
// add found file names to array list
alFiles.AddRange(Directory.GetFiles(SourceFolder, FileFilter, searchOption));
}
// returns string array of relevant file names
return (string[])alFiles.ToArray(typeof(string));
}
The problem is that when I pass a drive such as D:\\ as the path to search, I either get an exception in GetFiles() or nothing is found!
I also get exceptions when I try to access some hidden or system-protected folders.
How can I properly search for files in a drive or folder recursively?
One more thing: I have learned that a filter extension such as "abc" may also return files with extensions such as "abcd" or "abcde".
If this is true, how can I work around it?
Thank you.
I am looking for a library that allows me to take something similar to this:
./*
./*/*.*
./**
./**/*.*
./*.cs
./folder1/*.png
Then I would need to pass it to a method which scans the filesystem and returns these paths:
C:\folder1, C:\folder2, C:\folder3
C:\folder1\file.cs,C:\folder1\test.dll,[...], C:\folder2\image.png, [...]
C:\folder1, C:\folder1\results, C:\folder2, [...]
C:\file1.cs, C:\file2.cs, C:\file3.cs
C:\folder1\image1.png, C:\folder1\image2.png, [...]
I am aware that Directory.GetFiles() and Directory.GetDirectories() exist and that they accept a filter, but I need one method that does this recursively and flexibly and returns a set of absolute paths.
One option is to use a library such as the Glob library, which, unlike the .NET file globbing, also supports wildcards in folder names.
var folder = @"Your path goes here";
string[] folderPatterns = { "**/bin", "**/obj" };
string[] filePatterns = { "**/*cs"};
var results = new List<string>();
foreach (var pattern in folderPatterns)
{
results.AddRange(Glob.Directories(folder, pattern).ToArray());
}
foreach (var pattern in filePatterns)
{
results.AddRange(Glob.Files(folder, pattern).ToArray());
}
I'm working on a program that is supposed to scan a specific directory looking for any directories within it that have specific names, and if it finds them, tell the user.
Currently, the way I am loading the names it's searching for is like this:
static string datapath = Path.Combine(Directory.GetCurrentDirectory(), @"database.txt");
static string[] database = File.ReadAllLines(datapath);
I am using this as an array of names to look for when looking through a specific directory. I am doing so with a foreach loop.
System.IO.DirectoryInfo di = new DirectoryInfo(@"C:\ExampleDirectory");
foreach (DirectoryInfo dir in di.GetDirectories())
{
}
Is there a way to see if any of the names in the file "database.txt" match any names of directories found within "C:\ExampleDirectory"?
The only way I can think of doing this is:
System.IO.DirectoryInfo di = new DirectoryInfo(versionspath);
foreach (DirectoryInfo dir in di.GetDirectories())
{
if(dir.Name == //Something...) {
Console.WriteLine("Match found!");
break;}
}
But this obviously won't work, and I cannot think of any other way to do this. Any help would be greatly appreciated!
Based on your other questions on Stack Overflow, I presume this is homework or you are a passionate hobby programmer, am I right? So I'll try to explain the principle here, continuing your almost complete solution.
You will need a nested loop here, a loop in a loop. In the outer loop you iterate through the directories. You already got this one. For each directory you need to loop through the names in database to see if any item in it matches the name of the directory:
System.IO.DirectoryInfo di = new DirectoryInfo(versionspath);
foreach (DirectoryInfo dir in di.GetDirectories())
{
foreach (string name in database)
{
if (dir.Name == name)
{
Console.WriteLine("Match found!");
break;
}
}
}
Depending on your goal, you might want to exit at the first matching directory. The sample code above doesn't: the single break; statement only exits the inner loop, not the outer one, so it continues with the next directory. Try to figure out yourself how to stop at the first match (by exiting the outer loop).
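If you get stuck, here is one possible approach, a minimal sketch rather than the only solution: move both loops into a helper method and return as soon as a match is found. The names FirstMatchFinder and FindFirstMatch are made up for this illustration, and the method takes plain directory names instead of DirectoryInfo objects so that it stays self-contained.

```csharp
using System.Collections.Generic;

public static class FirstMatchFinder
{
    // Returns the first directory name that also appears in the database,
    // or null when there is no match at all.
    public static string FindFirstMatch(IEnumerable<string> directoryNames,
                                        IEnumerable<string> database)
    {
        foreach (string dirName in directoryNames)
        {
            foreach (string name in database)
            {
                if (dirName == name)
                {
                    // return leaves both loops at once, unlike break
                    return dirName;
                }
            }
        }
        return null;
    }
}
```

With the code from the question, the first argument could be built from di.GetDirectories() by taking each dir.Name.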
As usual, LINQ is the way to go. Whenever you have to find matches or non-matches between two lists whose elements have different types, you can use .Join() or .GroupJoin().
.Join() comes into play if you need to find a 1:1 relationship, and .GroupJoin() for any kind of 1:n relationship (1:0, 1:many or also 1:1).
So, if you need the directories that match your list, this sounds like a job for the .Join() operator:
public static void Main(string[] args)
{
// Wherever this normally comes from.
string[] database = new[] { "fOo", "bAr" };
string startDirectory = @"D:\baseFolder";
// A method that returns an IEnumerable<string>
// Using maybe a recursive approach to get all directories and/or files
var candidates = LoadCandidates(startDirectory);
var matches = database.Join(
candidates,
// Simply pick the database entry as is.
dbEntry => dbEntry,
// Only take the last portion of the given path.
fullPath => Path.GetFileName(fullPath),
// Return only the full path from the given matching pair.
(dbEntry, fullPath) => fullPath,
// Ignore case on comparison.
StringComparer.OrdinalIgnoreCase);
foreach (var match in matches)
{
// Shows "D:\baseFolder\foo"
Console.WriteLine(match);
}
Console.ReadKey();
}
private static IEnumerable<string> LoadCandidates(string baseFolder)
{
return new[] { @"D:\baseFolder\foo", @"D:\basefolder\baz" };
//return Directory.EnumerateDirectories(baseFolder, "*", SearchOption.AllDirectories);
}
You can use LINQ to do this
var allDirectoryNames = di.GetDirectories().Select(d => d.Name);
var matches = allDirectoryNames.Intersect(database);
if (matches.Any())
Console.WriteLine("Matches found!");
In the first line we get all the directory names, then we use the Intersect() method to see which ones are present in both allDirectoryNames and database. Note that Intersect() uses the default equality comparer, so this comparison is case-sensitive.
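If the names in database.txt may differ in case from the directory names, Intersect() also has an overload that accepts a comparer. The following is a small sketch of that variant; the class and method names (DirectoryNameMatcher, FindMatches) are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DirectoryNameMatcher
{
    // Returns the directory names that also occur in the database,
    // comparing the names without regard to case.
    public static List<string> FindMatches(IEnumerable<string> directoryNames,
                                           IEnumerable<string> database)
    {
        return directoryNames
            .Intersect(database, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

The returned strings are taken from the first sequence, so they keep the casing of the directory names on disk.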
I am having a little trouble getting files into a string[]. Everything seems to be OK until I have both a .docx and a .doc file, or both an .xlsx and an .xls file, in the directory that I am searching. Can someone advise me on how to achieve this?
Please see my code that I have so far below:
string Filter = ".DOC|.DOCX|.XLS|.XLSX|.PDF|.TXT|.TIF|.TIFF";
public string[] getFiles(string SourceFolder, string Filter)
{
// ArrayList will hold all file names
System.Collections.ArrayList alFiles = new System.Collections.ArrayList();
// Create an array of filter string
string[] MultipleFilters = Filter.Split('|');
// for each filter find matching file names
foreach (string FileFilter in MultipleFilters)
{
// add found file names to array list
alFiles.AddRange(Directory.GetFiles(SourceFolder, FileFilter));
}
// returns string array of relevant file names
return (string[])alFiles.ToArray(typeof(string));
}
Thanks,
George
You can take advantage of LINQ's Distinct() (System.Linq).
Returns distinct elements from a sequence by using the default equality comparer to compare values.
string Filter = ".DOC|.DOCX|.XLS|.XLSX|.PDF|.TXT|.TIF|.TIFF";
public string[] GetFiles(string SourceFolder, string Filter)
{
List<string> alFiles = new List<string>();
string[] MultipleFilters = Filter.Split('|');
foreach (string FileFilter in MultipleFilters)
{
alFiles.AddRange(Directory.GetFiles(SourceFolder, FileFilter));
}
return alFiles.Distinct().ToArray();
}
Notice that I am now creating a new List<string> instance (System.Collections.Generic), instead of your ArrayList
First off, the code as originally posted doesn't return any files, because none of the calls to Directory.GetFiles() include a wildcard in the filter.
Second, assuming that the original filter did include wildcards, there's a nasty little surprise in the MSDN Directory.GetFiles(string, string) documentation:
When you use the asterisk wildcard character in a searchPattern such
as "*.txt", the number of characters in the specified extension
affects the search as follows:
• If the specified extension is exactly three characters long, the
method returns files with extensions that begin with the specified
extension. For example, "*.xls" returns both "book.xls" and
"book.xlsx".
• In all other cases, the method returns files that exactly match the
specified extension. For example, "*.ai" returns "file.ai" but not
"file.aif".
(emphasis added)
Rather than trying to work around the "helpful" behavior of the Directory.GetFiles(string, string) overload, I'd use the Directory.GetFiles(string) overload to get all the files and then filter the results using LINQ:
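public string[] getFiles(string SourceFolder, string Filter)
{
    // Filter entries must be upper-case extensions including the dot, e.g. ".DOC|.DOCX"
    string[] MultipleFilters = Filter.Split('|');
    // Get every file, then keep only those whose extension exactly matches a filter entry
    // (requires using System.IO; and using System.Linq;)
    return Directory.GetFiles(SourceFolder)
        .Where(f => MultipleFilters.Contains(Path.GetExtension(f).ToUpper()))
        .ToArray();
}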
If there is a huge number of files in the folder, getting all of them could cause memory problems.
In the code below I search for files with a wildcard filter first and then filter the results using LINQ :)
string Filter = ".DOC|.DOCX|.XLS|.XLSX|.PDF|.TXT|.TIF|.TIFF"; // without "*"
public string[] getFiles(string SourceFolder, string Filter)
{
    var filters = Filter.ToUpper().Split('|');
    return filters
        .SelectMany(filter => System.IO.Directory.GetFiles(SourceFolder, "*" + filter))
        .Where(file => filters.Contains(Path.GetExtension(file).ToUpper()))
        // "*.DOC" can also return .docx files, which "*.DOCX" then returns again
        .Distinct()
        .ToArray();
}
I am working in C#. I have a segment of code that gets the full paths of all files of a specific type and places them inside a select list:
private void Form1_Load(object sender, EventArgs e)
{
// Only get .sde files
string[] dirs = System.IO.Directory.GetFiles(@"c:\Users\JohnDoe\Desktop\my_files", "*.sde");
this.GetSdePath.Items.AddRange(dirs);
}
When I run my program, the select list contains all the .sde files. They are displayed like this:
c:\Users\JohnDoe\Desktop\my_files\NewCreated.sde
c:\Users\JohnDoe\Desktop\my_files\Inventory.sde
c:\Users\JohnDoe\Desktop\my_files\Surplus.sde
c:\Users\JohnDoe\Desktop\my_files\Logistics.sde
I am wondering whether it is possible to hide the path in my select list and display just the name of each .sde file, so that the list would look like this:
NewCreated.sde
Inventory.sde
Surplus.sde
Logistics.sde
BUT, each value in the list would return the full path and name.
Any help on this topic would be greatly appreciated. Thanks in advance.
Use Path.GetFileName(string path)
private void Form1_Load(object sender, EventArgs e)
{
// Only get .sde files
string[] dirs = System.IO.Directory.GetFiles(@"c:\Users\JohnDoe\Desktop\my_files", "*.sde");
this.GetSdePath.Items.AddRange(dirs.Select(path => Path.GetFileName(path)).ToArray());
}
Use Select on the returned sequence to apply the Path.GetFileName method, which extracts just the file name from the full path:
var dirs = System.IO.Directory.GetFiles(@"c:\Users\JohnDoe\Desktop\my_files", "*.sde")
.Select (d => Path.GetFileName(d));
this.GetSdePath.Items.AddRange(dirs.ToArray());
I don't know how many files are present in your folder, but it is probably better to use EnumerateFiles instead of GetFiles:
var dirs = System.IO.Directory.EnumerateFiles(@"c:\Users\JohnDoe\Desktop\my_files", "*.sde")
.Select (d => Path.GetFileName(d));
MSDN says
The EnumerateFiles and GetFiles methods differ as follows: When you
use EnumerateFiles, you can start enumerating the collection of names
before the whole collection is returned; when you use GetFiles, you
must wait for the whole array of names to be returned before you can
access the array. Therefore, when you are working with many files and
directories, EnumerateFiles can be more efficient.
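To illustrate the difference, here is a small sketch (the names SdeFileLister and FirstFileNames are hypothetical, and it needs .NET 4.0 or later). Because EnumerateFiles yields paths lazily, Take() can stop the enumeration after a few results instead of waiting for the complete listing.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SdeFileLister
{
    // Returns at most maxCount file names (without their path) that match
    // the search pattern. The enumeration stops once enough were found.
    public static List<string> FirstFileNames(string folder, string pattern, int maxCount)
    {
        return Directory.EnumerateFiles(folder, pattern)
            .Select(fullPath => Path.GetFileName(fullPath))
            .Take(maxCount)
            .ToList();
    }
}
```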
EDIT
Following your comments below, EnumerateFiles is not an option for you (it is only available from .NET 4.0). If you want to keep the full path available for other tasks but show just the file name in the list box, then you need to keep the paths in some kind of collection (an array or, better, a list):
using System.IO;
using System.Linq;
...
string sourcePath = @"c:\Users\JohnDoe\Desktop\my_files";
// Keep the full paths; as long as the list box is not sorted, entry i corresponds to dirs[i]
List<string> dirs = Directory.GetFiles(sourcePath, "*.sde").ToList();
this.GetSdePath.Items.AddRange(dirs.Select(d => Path.GetFileName(d)).ToArray());
Make List<string> dirs a form-level variable if you need its content outside the Form_Load event.
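Another option, sketched here under the assumption that GetSdePath is a standard ListBox or ComboBox, is to add small wrapper objects instead of strings. Such a control shows each item's ToString() result when no DisplayMember is set, so the wrapper can display the file name while still carrying the full path. The FileEntry class is made up for this example, and it avoids newer C# syntax so that it also compiles for older framework versions.

```csharp
using System.IO;

public sealed class FileEntry
{
    private readonly string fullPath;

    public FileEntry(string fullPath)
    {
        this.fullPath = fullPath;
    }

    // The complete path, kept for later use.
    public string FullPath
    {
        get { return fullPath; }
    }

    // Used by the list control as the display text of the item.
    public override string ToString()
    {
        return Path.GetFileName(fullPath);
    }
}
```

Items would then be added with this.GetSdePath.Items.Add(new FileEntry(path)), and the full path of the selection could be read back with ((FileEntry)this.GetSdePath.SelectedItem).FullPath.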
I have a snippet of code that will traverse a directory location and create a data model from it. For example, if I have a directory structure:
c:\TestDir1
c:\TestDir1\Sub1\
c:\TestDir1\Sub1\File1.txt
c:\TestDir1\Sub1\File2.txt
c:\TestDir1\Sub1\SubSub1
c:\TestDir1\Sub1\SubSub1\File3.xlsx
c:\TestDir1\Sub1\SubSub1\SubDirX
c:\TestDir1\Sub1\SubSub1\SubDirX\File4.txt
c:\TestDir1\Sub1\SubSub1\SubDirX\File5.txt
c:\TestDir1\Sub1\SubSub1\SubDirX\File6.txt
It will create the appropriate data model via the following code:
static void BeginIt()
{
DirectoryInfo diTop = new DirectoryInfo(@"c:\Misc\3) Projects\002) Document Manager\DirectoryReading\TestDir1");
string path = diTop.FullName;
MySubDir mySubDir = new MySubDir(path);
}
public class MySubDir
{
public ArrayList _dirs;
public ArrayList _files;
public MySubDir(string dirPath)
{
_dirs = new ArrayList();
_files = new ArrayList();
this.ProcessDirectory(dirPath);
}
private void ProcessDirectory(string dirPath)
{
// Process the list of files found in the directory.
string[] fileEntries = Directory.GetFiles(dirPath);
foreach (string fileName in fileEntries)
{
_files.Add(fileName);
}
// Recurse into subdirectories of this directory.
string[] subdirectoryEntries = Directory.GetDirectories(dirPath);
foreach (string subdirectory in subdirectoryEntries)
{
_dirs.Add(new MySubDir(subdirectory));
}
}
}
Here's my question. When I step through the code line by line, it builds up the data model correctly. When I add a watch, I can see the object, and the directory structure is built up properly.
When I try to access the contents via the Immediate Window, I get errors. For example, if I type the following into the Immediate Window:
? mySubDir._dirs[0]._dirs[0]
I get an error.
How do I get at the values of these subdirectories? I would like to be able to access the directory names and filenames of the elements in this model now that it is created.
Thanks
That doesn't look like it would work, since the expression mySubDir.whatever depends on mySubDir being in scope and having a valid value. In order for that to happen, the constructor has to return first -- but the object is being populated during the execution of the constructor. So there's really no point during the lifetime of this program that such an expression would yield a meaningful result.
If you break into the debugger inside the ProcessDirectory method, you can use this._dirs to have a look into the data structure.
Apart from that, ArrayList is not the best choice for a collection whose element type you know beforehand, like the ones you have here. It would be more appropriate to declare _files as System.Collections.Generic.List<string> and _dirs as System.Collections.Generic.List<MySubDir>.
Well, _dirs and _files are ArrayLists, so you will need to traverse those lists to get all the values. A for loop, an enumerator, LINQ, or whatever method you like will do the trick.
Update:
After reading your post some more, I think there is a more basic problem. Adding just names to the class will not give you a file's position or its folder. You will have to look for a better way to model it (maybe a folder class that can hold both files and other folders?).
_dirs is an ArrayList, which stores plain objects, so you need to cast the element returned by the first _dirs[0] to MySubDir,
e.g.
((MySubDir)mySubDir._dirs[0])._dirs[0]
Either that or change the collection type from ArrayList to
List<MySubDir>
this will give you strongly typed list items when accessed with the indexer.
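As a rough sketch of that second option, the class from the question could look like this with generic lists. The _path field is an addition for illustration only (the original class does not store its own directory); everything else follows the structure of the original code.

```csharp
using System.Collections.Generic;
using System.IO;

public class MySubDir
{
    public string _path;            // hypothetical addition: the directory this node represents
    public List<MySubDir> _dirs;    // strongly typed, so no cast is needed when indexing
    public List<string> _files;

    public MySubDir(string dirPath)
    {
        _path = dirPath;
        _dirs = new List<MySubDir>();
        _files = new List<string>();
        ProcessDirectory(dirPath);
    }

    private void ProcessDirectory(string dirPath)
    {
        // Collect the full paths of the files directly inside this directory.
        _files.AddRange(Directory.GetFiles(dirPath));

        // Recurse into the subdirectories of this directory.
        foreach (string subdirectory in Directory.GetDirectories(dirPath))
        {
            _dirs.Add(new MySubDir(subdirectory));
        }
    }
}
```

With this version, an expression such as mySubDir._dirs[0]._dirs[0]._files compiles without a cast, because the indexer returns MySubDir rather than object.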