Iterate through files on a certain level - c#

I'm trying to iterate through all the files on a certain level in the folder hierarchy, more specifically, in all the sub-sub folders. Before I do actual operations on the files, I also want to count all the files to be able to show a progress bar. This means the iterating method must be called twice. This is the relevant code I'm using now:
void Iterate(bool count)
{
foreach (string dir in Directory.GetDirectories(root))
foreach (string subdir in Directory.GetDirectories(dir))
foreach (string file in Directory.GetFiles(subdir))
{
if (count) progressBar.Maximum++;
else
{
//do operations
}
}
}
I'm wondering if there's a better way of doing this. Surely there must be a better way than adding a foreach for every folder level..?

To me, it would be easier to use LINQ here.
var files =
(from dir in Directory.GetDirectories(root)
from subdir in Directory.GetDirectories(dir)
from f in Directory.GetFiles(subdir)
select f).ToList();
var fileCount = files.Count;
foreach (var f in files) {
...
}

Before your GetFiles foreach, try this:
string[] fileEntries = Directory.GetFiles(subdir);
int intFileCount = fileEntries.Length;
Or it can replace it, if the loop only serves to count the files.

The documentation of Directory.GetFiles shows how to recursively iterate a directory tree

You could download the fluent thing I wrote for System.IO (see here: http://blog.staticvoid.co.nz/2011/11/staticvoid-io-extentions-nuget.html) and then use this LINQ statement:
var files = from d in di.Directories()
from dir in d.Directories()
from f in dir.Files()
select f;

Write this instead of your code:
string[] files = Directory.GetFiles("yourDirectory", "*.*", SearchOption.AllDirectories);
This will return all files in all sub-directories without you having to write the recursion yourself.
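Note that SearchOption.AllDirectories returns files at every depth, not only at the sub-sub-folder level the question asks about. Below is a minimal sketch of a depth-limited alternative that avoids one foreach per level. The helper name FilesAtDepth and the temporary test tree are made up for illustration, and Directory.EnumerateDirectories / EnumerateFiles assume .NET 4 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DepthSearch
{
    // Hypothetical helper: returns the files that sit exactly `depth`
    // directory levels below `root` (depth 0 = files directly in root).
    public static IEnumerable<string> FilesAtDepth(string root, int depth)
    {
        IEnumerable<string> dirs = new[] { root };
        for (int i = 0; i < depth; i++)
        {
            // Descend one level: replace each directory with its children.
            dirs = dirs.SelectMany(d => Directory.EnumerateDirectories(d));
        }
        return dirs.SelectMany(d => Directory.EnumerateFiles(d));
    }

    public static void Main()
    {
        // Build a small throwaway tree so the sketch can run anywhere.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string deep = Path.Combine(root, "a", "b");
        Directory.CreateDirectory(deep);
        File.WriteAllText(Path.Combine(root, "top.txt"), "");
        File.WriteAllText(Path.Combine(root, "a", "mid.txt"), "");
        File.WriteAllText(Path.Combine(deep, "leaf.txt"), "");

        // Materialize once: the count feeds the progress bar,
        // the same list feeds the actual work.
        List<string> files = FilesAtDepth(root, 2).ToList();
        Console.WriteLine(files.Count);
        foreach (string f in files)
            Console.WriteLine(Path.GetFileName(f));

        Directory.Delete(root, true);
    }
}
```

Calling ToList() once means the tree is walked a single time, so the iterating method no longer has to run twice just to set progressBar.Maximum.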

Related

C#: Renaming files on the result of `Directory.EnumerateFiles` causes infinite loop

I have a folder that contains a lot of files (>100000),
and I want to write some code to perform a batch renaming.
I use Directory.EnumerateFiles rather than Directory.GetFiles since the former does not allocate space for all the file names up front, so I don't need to worry about memory.
Here is the code:
var files = Directory.EnumerateFiles(@"C:\many_files");
foreach (var file in files)
{
// Use File.Move to do the renaming
}
However I encountered an infinite loop.
That's because EnumerateFiles is still executing while the renaming happens. So after a file is renamed, it might be reordered to the end of the "enumeration list", and EnumerateFiles will return it again.
So is there a solution? Or do I have to use GetFiles, which may cause an OOM?
The easiest way to do this is to load all the file paths into memory and rename them.
Since you're worried about memory, you can do this instead, provided nothing is added to the directory during the process:
static void Rename(string path, int batchSize = 5)
{
var queryable = Directory.EnumerateFiles(path).AsQueryable(); //Queryable For Getting Files
var totalFiles = queryable.Count(); //Get Total File Count in Path
int count = 0;//counter
while (count <= totalFiles)
{
var filesToProcess = queryable
.Where(f => !Path.GetFileName(f).StartsWith("reviewed_")) //Get Files Without this prefix
.Skip(count)
.Take(batchSize)
.ToList();//Get Batch To Process
foreach (var file in filesToProcess)
{
//Do Renaming Stuff
Console.WriteLine(file);
}
count += batchSize;
}
}
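For comparison, the "easiest way" mentioned at the top of this answer could be sketched as follows. This is only an illustration under assumptions: the helper name AddPrefix is hypothetical, the "reviewed_" prefix is borrowed from the batch code above, and the renaming rule itself is a placeholder. A list of 100,000 paths is likely to cost on the order of tens of megabytes, which may well be acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SnapshotRename
{
    // Hypothetical helper: prefixes every file directly inside `path`
    // with `prefix` and returns how many files were renamed.
    public static int AddPrefix(string path, string prefix)
    {
        // ToList() takes a snapshot of the paths before anything is renamed,
        // so a renamed file cannot show up again in the sequence we iterate.
        List<string> snapshot = Directory.EnumerateFiles(path).ToList();
        int renamed = 0;
        foreach (string file in snapshot)
        {
            string name = Path.GetFileName(file);
            if (name.StartsWith(prefix, StringComparison.Ordinal))
                continue; // already processed, e.g. by an earlier run
            File.Move(file, Path.Combine(path, prefix + name));
            renamed++;
        }
        return renamed;
    }

    public static void Main()
    {
        // Throwaway directory with three files so the sketch can run anywhere.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        foreach (string n in new[] { "a.txt", "b.txt", "c.txt" })
            File.WriteAllText(Path.Combine(dir, n), "");

        Console.WriteLine(AddPrefix(dir, "reviewed_")); // first run renames everything
        Console.WriteLine(AddPrefix(dir, "reviewed_")); // second run finds nothing left to do

        Directory.Delete(dir, true);
    }
}
```

The prefix check also makes the operation safe to re-run if it is interrupted halfway.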

Check for name of a directory from an array

I'm working on a program that is supposed to scan a specific directory looking for any directories within it that have specific names, and if it finds them, tell the user.
Currently, the way I am loading the names it's searching for is like this:
static string path = Path.Combine(Directory.GetCurrentDirectory(), @"database.txt");
static string[] database = File.ReadAllLines(path);
I am using this as an array of names to look for when looking through a specific directory. I am doing so with a foreach loop.
System.IO.DirectoryInfo di = new DirectoryInfo(@"C:\ExampleDirectory");
foreach (DirectoryInfo dir in di.GetDirectories())
{
}
Is there a way to see if any of the names in the file "database.txt" match any names of directories found within "C:\ExampleDirectory"?
The only way I can think of doing this is:
System.IO.DirectoryInfo di = new DirectoryInfo(versionspath);
foreach (DirectoryInfo dir in di.GetDirectories())
{
if(dir.Name == //Something...) {
Console.WriteLine("Match found!");
break;}
}
But this obviously won't work, and I cannot think of any other way to do this. Any help would be greatly appreciated!
Based on your other questions on Stack Overflow, I presume this is homework or you are a passionate hobby programmer, am I right? So I'll try to explain the principle here, continuing your almost complete solution.
You will need a nested loop here, a loop in a loop. In the outer loop you iterate through the directories. You already got this one. For each directory you need to loop through the names in database to see if any item in it matches the name of the directory:
System.IO.DirectoryInfo di = new DirectoryInfo(versionspath);
foreach (DirectoryInfo dir in di.GetDirectories())
{
foreach (string name in database)
{
if (dir.Name == name)
{
Console.WriteLine("Match found!");
break;
}
}
}
Depending on your goal, you might want to exit at the first matching directory. The sample code above doesn't. The single break; statement only exits the inner loop, not the outer one, so it continues to check the next directory. Try to figure out for yourself how to stop at the first match (by exiting the outer loop).
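If you get stuck on that exercise, one possible approach is to move the two loops into their own method and use return, which leaves both loops at once. This is only a sketch: the method name HasMatch and the temporary directories are invented for the example.

```csharp
using System;
using System.IO;

public static class FirstMatch
{
    // Hypothetical helper: true as soon as any directory directly under
    // `root` has a name that appears in `names`.
    public static bool HasMatch(string root, string[] names)
    {
        foreach (DirectoryInfo dir in new DirectoryInfo(root).GetDirectories())
        {
            foreach (string name in names)
            {
                if (dir.Name == name)
                    return true; // return exits the inner AND the outer loop
            }
        }
        return false; // every directory was checked, nothing matched
    }

    public static void Main()
    {
        // Throwaway tree so the sketch can run anywhere.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(Path.Combine(root, "alpha"));
        Directory.CreateDirectory(Path.Combine(root, "beta"));

        if (HasMatch(root, new[] { "beta", "gamma" }))
            Console.WriteLine("Match found!");

        Directory.Delete(root, true);
    }
}
```

A boolean flag checked in the outer loop would work as well; extracting a method just keeps the control flow easier to read.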
As usual, LINQ is the way to go. Whenever you have to find matches or non-matches between two lists that contain different types, you can use .Join() or .GroupJoin().
.Join() comes into play if you need to find a 1:1 relationship, and .GroupJoin() for any kind of 1:n relationship (1:0, 1:many or also 1:1).
So, if you need the directories that match your list, this sounds like a job for the .Join() operator:
public static void Main(string[] args)
{
// Wherever this normally comes from.
string[] database = new[] { "fOo", "bAr" };
string startDirectory = @"D:\baseFolder";
// A method that returns an IEnumerable<string>
// Using maybe a recursive approach to get all directories and/or files
var candidates = LoadCandidates(startDirectory);
var matches = database.Join(
candidates,
// Simply pick the database entry as is.
dbEntry => dbEntry,
// Only take the last portion of the given path.
fullPath => Path.GetFileName(fullPath),
// Return only the full path from the given matching pair.
(dbEntry, fullPath) => fullPath,
// Ignore case on comparison.
StringComparer.OrdinalIgnoreCase);
foreach (var match in matches)
{
// Shows "D:\baseFolder\foo"
Console.WriteLine(match);
}
Console.ReadKey();
}
private static IEnumerable<string> LoadCandidates(string baseFolder)
{
return new[] { @"D:\baseFolder\foo", @"D:\basefolder\baz" };
//return Directory.EnumerateDirectories(baseFolder, "*", SearchOption.AllDirectories);
}
You can use LINQ to do this
var allDirectoryNames = di.GetDirectories().Select(d => d.Name);
var matches = allDirectoryNames.Intersect(database);
if (matches.Any())
Console.WriteLine("Matches found!");
In the first line we get all the directory names, then we use the Intersect() method to see which ones are present in both allDirectoryNames and database

Finding all files in a folder using enumeration

I'm trying to list all files under a given directory, taking subdirectories into account as well. I'm using yield so that I can combine this with Take at the call site (note that I'm using .NET 3.5).
Below is my code:
IEnumerable<string> Search(string sDir)
{
foreach (var file in Directory.GetFiles(sDir))
{
yield return file;
}
foreach (var directory in Directory.GetDirectories(sDir))
{
Search(directory);
}
}
I don't know what is going wrong here, but it only returns one file (which is the one under the root directory, and there is only one there as well). Can you please help?
You need to yield the results of the recursive search, otherwise you're just throwing its results away:
IEnumerable<string> Search(string sDir)
{
foreach (var file in Directory.GetFiles(sDir))
{
yield return file;
}
foreach (var directory in Directory.GetDirectories(sDir))
{
foreach(var file in Search(directory))
yield return file;
}
}
Note that if your intent is to simply get a flat list of every file, consider using Directory.GetFiles instead with the option to search all subdirectories. If your intent is to leverage LINQ (or other methods) to apply searching criteria or a limit to the total number of files retrieved, then this is a decent way to go as you'll read directories one at a time and stop once you've fulfilled your criterion.
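To illustrate the "stop once you've fulfilled your criterion" point, here is a small sketch that combines the corrected Search method with Take. The temporary directory tree and the class name LazySearch are assumptions made so that the example can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazySearch
{
    // Same shape as the corrected method above: yields files of the
    // current directory first, then recurses into each subdirectory.
    public static IEnumerable<string> Search(string sDir)
    {
        foreach (var file in Directory.GetFiles(sDir))
        {
            yield return file;
        }
        foreach (var directory in Directory.GetDirectories(sDir))
        {
            foreach (var file in Search(directory))
                yield return file;
        }
    }

    public static void Main()
    {
        // Throwaway tree: one file in the root, two in a subfolder.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string sub = Path.Combine(root, "sub");
        Directory.CreateDirectory(sub);
        File.WriteAllText(Path.Combine(root, "one.txt"), "");
        File.WriteAllText(Path.Combine(sub, "two.txt"), "");
        File.WriteAllText(Path.Combine(sub, "three.txt"), "");

        // Take(2) stops pulling from the iterator after two results,
        // so directories that come later are never opened.
        List<string> firstTwo = Search(root).Take(2).ToList();
        Console.WriteLine(firstTwo.Count);

        Directory.Delete(root, true);
    }
}
```

Each Directory.GetFiles call still reads one whole directory eagerly; the laziness is at the directory level, not per file.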

Counting Files Extensions C#

Hi, I'm a C# beginner and I'd like to write a simple program that goes through a folder and counts how many files are .mp3 files and how many are .flac.
Like I said, the program is very basic. It will ask for the music folder path and will then go through it. I know there will be a lot of subfolders in that main music folder, so it will have to open them one at a time and go through them too.
E.g.
C:/Music/
will be the given directory.
But it doesn't contain any music in itself.
To get to the music files the program would have to open subfolders like
C:/Music/Electronic/deadmau5/RandomAlbumTitle/
Only then can it count the .mp3 files and .flac files and store them in two separate counters.
The program will have to do that for at least 2000 folders.
Do you know a good way or method to go through files and return its name (and extension)?
You can use System.IO.DirectoryInfo. DirectoryInfo provides a GetFiles method, which also has a recursive option, so if you're not worried about speed, you can do this:
DirectoryInfo di = new DirectoryInfo(@"C:\Music");
int numMP3 = di.GetFiles("*.mp3", SearchOption.AllDirectories).Length;
int numFLAC = di.GetFiles("*.flac", SearchOption.AllDirectories).Length;
Use DirectoryInfo and a grouping by the file extension:
var di = new DirectoryInfo(@"C:/Music/");
var extensionCounts = di.EnumerateFiles("*.*", SearchOption.AllDirectories)
.GroupBy(x => x.Extension)
.Select(g => new { Extension = g.Key, Count = g.Count() })
.ToList();
foreach (var group in extensionCounts)
{
Console.WriteLine("There are {0} files with extension {1}", group.Count,
group.Extension);
}
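One caveat with grouping on Extension: the property keeps the casing of the file name, so ".MP3" and ".mp3" would end up in separate groups. A possible variation, sketched below, passes StringComparer.OrdinalIgnoreCase to GroupBy. The helper name CountByExtension and the temporary files are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExtensionCounter
{
    // Hypothetical helper: counts files per extension under `root`
    // (all subfolders included), treating ".MP3" and ".mp3" as the same key.
    public static Dictionary<string, int> CountByExtension(string root)
    {
        return new DirectoryInfo(root)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .GroupBy(f => f.Extension, StringComparer.OrdinalIgnoreCase)
            .ToDictionary(g => g.Key, g => g.Count(), StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // Throwaway tree with mixed-case extensions.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string album = Path.Combine(root, "Electronic", "Album");
        Directory.CreateDirectory(album);
        File.WriteAllText(Path.Combine(album, "track1.mp3"), "");
        File.WriteAllText(Path.Combine(album, "track2.MP3"), "");
        File.WriteAllText(Path.Combine(root, "track3.flac"), "");

        Dictionary<string, int> counts = CountByExtension(root);
        int mp3, flac;
        counts.TryGetValue(".mp3", out mp3);   // leaves 0 if the key is absent
        counts.TryGetValue(".flac", out flac);
        Console.WriteLine("mp3: {0}, flac: {1}", mp3, flac);

        Directory.Delete(root, true);
    }
}
```

The dictionary is built with the same comparer, so lookups such as ".MP3" and ".mp3" hit the same entry.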
.NET has a built-in way of searching for files in all sub-directories. Make sure you add a using statement for System.IO.
var path = "C:/Music/";
var files = Directory.GetFiles(path, "*.mp3", SearchOption.AllDirectories);
var count = files.Length;
Since you're a beginner you should hold off on the more flexible LINQ method until later.
int fileCount = Directory.GetFiles(_Path, "*.*", SearchOption.TopDirectoryOnly).Length;
Duplicate question: How to read File names recursively from subfolder using LINQ
Jon Skeet answered there with
You don't need to use LINQ to do this - it's built into the framework:
string[] files = Directory.GetFiles(directory, "*.dll",
SearchOption.AllDirectories);
or if you're using .NET 4:
IEnumerable<string> files = Directory.EnumerateFiles(directory, "*.dll",
SearchOption.AllDirectories);
To be honest, LINQ isn't great in terms of recursion. You'd probably want to write your own general-purpose recursive extension method. Given how often this sort of question is asked, I should really do that myself some time...
Here is the MSDN support page, How to recursively search directories by Visual C#
Taken directly from that page:
void DirSearch(string sDir)
{
try
{
foreach (string d in Directory.GetDirectories(sDir))
{
foreach (string f in Directory.GetFiles(d, txtFile.Text))
{
lstFilesFound.Items.Add(f);
}
DirSearch(d);
}
}
catch (System.Exception excpt)
{
Console.WriteLine(excpt.Message);
}
}
You can use this code in addition to creating FileInfo objects. Once you have the file info objects you can check the Extension property to see if it matches the ones you care about.
MSDN has lots of information and examples, for example how you can iterate through a directory: http://msdn.microsoft.com/en-us/library/bb513869.aspx

Using Directory.GetFiles with regex-like filter

I have a folder with two files:
Awesome.File.20091031_123002.txt
Awesome.File.Summary.20091031_123152.txt
Additionally, a third-party app handles the files as follows:
Reads a folderPath and a searchPattern out of a database
Executes Directory.GetFiles(folderPath, searchPattern), processing whatever files match the filter in bulk, then moving the files to an archive folder.
It turns out that I have to move my two files into different archive folders, so I need to handle them separately by providing different searchPatterns to select them individually. Please note that I can't modify the third-party app, but I can modify the searchPattern and file destinations in my database.
What searchPattern will allow me to select Awesome.File.20091031_123002.txt without including Awesome.File.Summary.20091031_123152.txt?
If you were going to use LINQ then...
var regexTest = new Func<string, bool>(i => Regex.IsMatch(i, @"Awesome.File.(Summary)?.[\d]+_[\d]+.txt", RegexOptions.Compiled | RegexOptions.IgnoreCase));
var files = Directory.GetFiles(@"c:\path\to\folder").Where(regexTest);
Awesome.File.????????_??????.txt
The question mark (?) acts as a single character place holder.
I wanted to try my meager linq skills here... I'm sure there is a more elegant solution, but here's mine:
string pattern = ".SUMMARY.";
string[] awesomeFiles = System.IO.Directory.GetFiles("path\\to\\awesomefiles");
IEnumerable<string> sum_files = from file in awesomeFiles
where file.ToUpper().Contains(pattern)
select file;
IEnumerable<string> other_files = from file in awesomeFiles
where !file.ToUpper().Contains(pattern)
select file;
This assumes there aren't any other files in the directory other than the two, but you can adjust the pattern here to suit your needs (i.e. add "Awesome.File" to the pattern start.)
When you iterate the collection of each, you should get what you need.
According to the documentation, searchPattern only supports the * and ? wildcards. You would need to write your own regex filter that takes the results of Directory.GetFiles and applies further filtering logic.
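A rough sketch of such a filter is shown below. It only helps if you control the calling code, which the question says is not the case for the third-party app. The helper name GetFilesMatching and the regular expression are assumptions written for the two example file names in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexFileFilter
{
    // Hypothetical helper: lists the files in `folder` whose file name
    // (not the full path) matches `nameRegex`.
    public static IEnumerable<string> GetFilesMatching(string folder, Regex nameRegex)
    {
        return Directory.GetFiles(folder)
            .Where(p => nameRegex.IsMatch(Path.GetFileName(p)));
    }

    public static void Main()
    {
        // Throwaway folder holding the two files from the question.
        string folder = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(folder);
        File.WriteAllText(Path.Combine(folder, "Awesome.File.20091031_123002.txt"), "");
        File.WriteAllText(Path.Combine(folder, "Awesome.File.Summary.20091031_123152.txt"), "");

        // Anchored pattern: literal dots are escaped, and the timestamp must
        // follow "Awesome.File." directly, which excludes the Summary file.
        var plainOnly = new Regex(@"^Awesome\.File\.\d{8}_\d{6}\.txt$", RegexOptions.IgnoreCase);
        foreach (string path in GetFilesMatching(folder, plainOnly))
            Console.WriteLine(Path.GetFileName(path));

        Directory.Delete(folder, true);
    }
}
```

Matching against Path.GetFileName rather than the full path keeps folder names from accidentally satisfying the pattern.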
If you don't want to use Linq, here's one way.
public void FileChecker(string filePath)
{
DirectoryInfo di = new DirectoryInfo(filePath);
int _MatchCounter = 0; // must be initialized before += is used below
string RegexPattern = "^[a-zA-Z_a-zA-Z_a-zA-Z_0-9_0-9_0-9.csv]*$";
Regex RegexPatternMatch = new Regex(RegexPattern, RegexOptions.IgnoreCase);
foreach (FileInfo matchingFile in di.GetFiles())
{
Match m = RegexPatternMatch.Match(matchingFile.Name);
if ((m.Success))
{
MessageBox.Show(matchingFile.Name);
_MatchCounter += 1;
}
}
}
