Parallel folder size scanning. WaitAll() hangs - c#

I think I understand that this hangs because I am not awaiting the async call from Main, causing a deadlock, but I can't make Main async, so how do I fix this? I will first show the program I am trying to parallelize, then my attempt at parallelization. As is probably obvious, I am trying to get the fastest possible program to check the size of a list of folders (from multiple shares). If it is possible to parallelize at a higher level and write the outputs to CSV out of order, that is fine, but I would be satisfied with processing one share at a time. I have tried several variations of the parallel code; this is just my latest, so it is possible it is more wrong than my earlier attempts. I am currently knee deep in researching parallelization in C# and will probably figure this out at some point, but any insights would be greatly appreciated.
using System;
using System.IO;
namespace ShareSize
{
class Program
{
static long Size { get; set; }
static void Main(string[] args)
{
using (StreamReader sr = new StreamReader(args[0]))
{
while (!sr.EndOfStream)
{
string share = sr.ReadLine().Trim(',');
Console.WriteLine(share);
string[] root = Directory.GetDirectories(share);
MeasureFolders(root);
MeasureFiles(Directory.GetFiles(share));
Console.WriteLine("SIZE = " + Size);
using (StreamWriter sw = new StreamWriter(args[1], true))
{
sw.WriteLine(share + "," + Size / 1073741824);
}
Size = 0;
}
}
Console.ReadLine();
}
private static void MeasureFolders(string[] root)
{
MeasureFolder(root);
}
private static void MeasureFolder(string[] directories)
{
foreach (string d in directories)
{
try
{
Console.WriteLine($"Measure Folder {d}");
string[] files = Directory.GetFiles(d);
string[] subDirectories = Directory.GetDirectories(d);
if (files.Length != 0)
MeasureFiles(files);
if (subDirectories.Length != 0)
MeasureFolder(subDirectories);
}
catch
{
// ignore directories we cannot access
}
}
}
private static void MeasureFiles(string[] files)
{
foreach (var f in files)
{
Size += new FileInfo(f).Length;
}
}
}
}
And here is my attempt at parallelization.
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
namespace ShareSize
{
class Program
{
static long Size { get; set; }
static List<Task> Tasks = new List<Task>();
private static Object Lock = new Object();
static void Main(string[] args)
{
string share = "";
using (StreamReader sr = new StreamReader(args[0]))
{
while (!sr.EndOfStream)
{
share = sr.ReadLine().Trim(',');
string[] root = Directory.GetDirectories(share);
MeasureFolders(root).ConfigureAwait(false);
MeasureFiles(Directory.GetFiles(share));
using (StreamWriter sw = new StreamWriter(args[1], true))
{
sw.WriteLine(share + "," + Size / 1073741824);
}
Size = 0;
}
}
Console.ReadLine();
}
private static async Task MeasureFolders(string[] root)
{
await MeasureFolder(root).ConfigureAwait(false);
await Task.WhenAll(Tasks.ToArray());
}
private static async Task MeasureFolder(string[] directories)
{
foreach (string d in directories)
{
try
{
string[] files = Directory.GetFiles(d);
string[] subDirectories = Directory.GetDirectories(d);
if (files.Length != 0)
{
Task newTask = new Task(delegate { MeasureFiles(files); });
newTask.Start();
Tasks.Add(newTask);
}
if (subDirectories.Length != 0)
await MeasureFolder(subDirectories);
}
catch
{
// ignore directories we cannot access
}
}
}
private static void MeasureFiles(string[] files)
{
foreach (var f in files)
{
lock (Lock)
{
try
{
Size += new FileInfo(f).Length;
}
catch
{
// ignore files we cannot stat
}
}
}
}
}
}
Thank you very much.
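For reference, one common way to drive async work from a synchronous Main in a console app is to block on the returned Task; since a console app has no SynchronizationContext, blocking with GetAwaiter().GetResult() does not deadlock the way it would in UI code. A minimal sketch (not the original code; it reuses the Tasks, Size, MeasureFolders, and MeasureFiles members from the class above, and clears the task list per share):

```csharp
static void Main(string[] args)
{
    using (var sr = new StreamReader(args[0]))
    {
        while (!sr.EndOfStream)
        {
            string share = sr.ReadLine().Trim(',');
            string[] root = Directory.GetDirectories(share);
            Tasks.Clear();                                 // reset per share
            MeasureFolders(root).GetAwaiter().GetResult(); // block until WhenAll completes
            MeasureFiles(Directory.GetFiles(share));
            using (var sw = new StreamWriter(args[1], true))
                sw.WriteLine(share + "," + Size / 1073741824);
            Size = 0;
        }
    }
}
```

On C# 7.1 or later, `static async Task Main(string[] args)` is also available, which lets you `await MeasureFolders(root)` directly.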

Maybe I am missing the point, but the above seems a bit over-complicated. The below code snippets get the directory size of any given path. I wrote the code in such a way that comparing the serialized code to the parallelized code is easier. But one of the biggest things to consider: If you are going to collect data in parallel, you will likely need to allocate memory ahead of time (array), or lock the object to ensure no concurrent access (lock() { }). Both are demonstrated below.
Notes:
This code can be optimized further
Getting the size on disk would require a few changes
This uses built-in C# Parallel.For, Parallel.Foreach, and lock() {} syntax
SerialFunctions
public long GetDirectorySizesBytes(string root) {
long dirsize = 0;
string[] directories = Directory.GetDirectories(root);
string[] files = Directory.GetFiles(root);
if (files != null) {
dirsize += GetFileSizesBytes(files);
}
foreach(var dir in directories) {
var size = GetDirectorySizesBytes(dir);
dirsize += size;
}
return dirsize;
}
public long GetFileSizesBytes(string[] files) {
long[] fileSizes = new long[files.Length];
for(int i = 0; i < files.Length; i++) {
fileSizes[i] = new FileInfo(files[i]).Length;
}
return fileSizes.Sum();
}
Parallelized Functions
public long ParallelGetDirectorySizesBytes(string root) {
long dirsize = 0;
string[] directories = Directory.GetDirectories(root);
string[] files = Directory.GetFiles(root);
if (files != null) {
dirsize += ParallelGetFileSizesBytes(files);
}
Parallel.ForEach(directories, dir => {
var size = ParallelGetDirectorySizesBytes(dir);
lock (lockObject) { //static lockObject defined at top of class
dirsize += size;
}
});
return dirsize;
}
public long ParallelGetFileSizesBytes(string[] files) {
long[] fileSizes = new long[files.Length];
Parallel.For(0, files.Length, i => {
fileSizes[i] = new FileInfo(files[i]).Length;
});
return fileSizes.Sum();
}
Test function
[TestMethod]
public void GetDirectoriesSizesTest() {
var actual = GetDirectorySizesBytes(@"C:\Exchanges");
var parallelActual = ParallelGetDirectorySizesBytes(@"C:\Exchanges");
long expected = 25769767281;
Assert.AreEqual(expected, actual);
Assert.AreEqual(expected, parallelActual);
}
Complete Class
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
namespace StackOverflowProjects.Tests {
[TestClass]
public class DirectorySizeTests {
public static object lockObject = new object();
[TestMethod]
public void GetDirectoriesSizesTest() {
var actual = GetDirectorySizesBytes(@"C:\Exchanges");
var parallelActual = ParallelGetDirectorySizesBytes(@"C:\Exchanges");
long expected = 25769767281;
Assert.AreEqual(expected, actual);
Assert.AreEqual(expected, parallelActual);
}
public long GetDirectorySizesBytes(string root) {
long dirsize = 0;
string[] directories = Directory.GetDirectories(root);
string[] files = Directory.GetFiles(root);
if (files != null) {
dirsize += GetFileSizesBytes(files);
}
foreach(var dir in directories) {
var size = GetDirectorySizesBytes(dir);
dirsize += size;
}
return dirsize;
}
public long GetFileSizesBytes(string[] files) {
long[] fileSizes = new long[files.Length];
for(int i = 0; i < files.Length; i++) {
fileSizes[i] = new FileInfo(files[i]).Length;
}
return fileSizes.Sum();
}
public long ParallelGetDirectorySizesBytes(string root) {
long dirsize = 0;
string[] directories = Directory.GetDirectories(root);
string[] files = Directory.GetFiles(root);
if (files != null) {
dirsize += ParallelGetFileSizesBytes(files);
}
Parallel.ForEach(directories, dir => {
var size = ParallelGetDirectorySizesBytes(dir);
lock (lockObject) {
dirsize += size;
}
});
return dirsize;
}
public long ParallelGetFileSizesBytes(string[] files) {
long[] fileSizes = new long[files.Length];
Parallel.For(0, files.Length, i => {
fileSizes[i] = new FileInfo(files[i]).Length;
});
return fileSizes.Sum();
}
}
}
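To tie this back to the original share-list program, here is a hedged sketch of a console driver for the parallel method above (the Main shape comes from the question; the class and file formats are assumptions carried over from the two snippets):

```csharp
// Hypothetical driver: reads share paths (one per line, trailing commas
// tolerated) from args[0] and appends "share,sizeInGiB" lines to args[1].
static void Main(string[] args)
{
    var sizer = new DirectorySizeTests(); // class from the answer above
    foreach (var rawLine in File.ReadAllLines(args[0]))
    {
        string share = rawLine.Trim(',');
        long bytes = sizer.ParallelGetDirectorySizesBytes(share);
        File.AppendAllText(args[1],
            share + "," + bytes / 1073741824 + Environment.NewLine);
    }
}
```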

Related

Prevent ' Process is terminated due to StackOverflowException' in C#

I have a program which builds a very large tree from input data and traverses it, both by recursion. I have tested the program on smaller inputs (and thus smaller trees) and it functions as intended. However, when the input data is much larger I run into 'Process is terminated due to StackOverflowException'. I assume this is due to the stack running out of space. Is there any way to prevent this, or do I have to switch to building the tree via iteration instead? Or perhaps I am missing a case of infinite recursion somewhere?
Here is the code:
class Program
{
static int[] tileColors;
static Color[] colors;
static int totalTiles;
static void Main(string[] args)
{
Stopwatch s = new Stopwatch();
s.Start();
string[] data = File.ReadAllLines("colors.txt");
totalTiles = int.Parse(data[0].Split(' ')[0]);
int totalColors = int.Parse(data[0].Split(' ')[1]);
string[] colorsRaw = data[1].Split(' ');
tileColors = new int[totalTiles];
for (int i = 0; i < totalTiles; i++)
{
tileColors[i] = int.Parse(colorsRaw[i]) - 1;
}
colors = new Color[totalColors];
for (int i = 3; i < data.Length; i++)
{
string[] raw = data[i].Split(' ');
int[] pair = new int[] { int.Parse(raw[0]) - 1, int.Parse(raw[1]) - 1 };
if (colors[pair[0]] == null)
colors[pair[0]] = new Color(pair[1]);
else
colors[pair[0]].pairs.Add(pair[1]);
if (colors[pair[1]] == null)
colors[pair[1]] = new Color(pair[0]);
else
colors[pair[1]].pairs.Add(pair[0]);
}
Tree t = new Tree();
t.root = new Node(0);
PopulateTree(t.root);
long ans = t.CountMatchingLeaves(t.root, totalTiles - 1) % 1000000007;
Console.WriteLine(ans);
s.Stop();
Console.WriteLine(s.ElapsedMilliseconds);
}
static void PopulateTree(Node root)
{
for (int i = root.tile + 1; i < totalTiles; i++)
{
if (colors[tileColors[i]] == null) continue;
if (colors[tileColors[i]].Compatible(tileColors[root.tile]))
{
var node = new Node(i);
root.children.Add(node);
PopulateTree(node);
}
}
}
}
class Color
{
public List<int> pairs = new List<int>();
public Color(int pair)
{
pairs.Add(pair);
}
public bool Compatible(int c)
{
return pairs.Contains(c);
}
}
class Node
{
public List<Node> children = new List<Node>();
public int tile;
public Node(int tile)
{
this.tile = tile;
}
}
class Tree
{
public Node root;
public List<Node> GetMatchingLeaves(Node root, int match)
{
if (root.children.Count == 0)
{
if (root.tile == match)
{
return new List<Node>() { root };
}
return new List<Node>();
}
List<Node> list = new List<Node>();
foreach(var c in root.children)
{
list.AddRange(GetMatchingLeaves(c, match));
}
return list;
}
public long CountMatchingLeaves(Node root, int match)
{
if (root.children.Count == 0)
{
if (root.tile == match)
{
return 1;
}
return 0;
}
long count = 0;
foreach (var c in root.children)
{
count += CountMatchingLeaves(c, match);
}
return count;
}
}
You can always rewrite recursion as iteration, usually by using a stack class rather than relying on your thread's stack. For your code it would look like this:
static void PopulateTree(Node start)
{
var nodes = new Stack<Node>();
nodes.Push(start);
while(nodes.Count != 0)
{
var root = nodes.Pop();
for (int i = root.tile + 1; i < totalTiles; i++)
{
if (colors[tileColors[i]] == null) continue;
if (colors[tileColors[i]].Compatible(tileColors[root.tile]))
{
var node = new Node(i);
root.children.Add(node);
nodes.Push(node);
}
}
}
}
The while loop checking for more items is the equivalent of your terminating condition in recursion.
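The counting method recurses too, so on a deep enough tree it can overflow on its own; the same explicit-stack trick applies. A sketch, using the same Node type as the question:

```csharp
public long CountMatchingLeaves(Node start, int match)
{
    long count = 0;
    var nodes = new Stack<Node>();
    nodes.Push(start);
    while (nodes.Count != 0)
    {
        var node = nodes.Pop();
        if (node.children.Count == 0)
        {
            if (node.tile == match) count++; // matching leaf found
            continue;
        }
        foreach (var c in node.children)
            nodes.Push(c); // visit children iteratively instead of recursing
    }
    return count;
}
```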

Connect WriteTo Stream and ReadFrom Stream methods

I have two methods from two different third party libraries:
Task WriteToAsync(Stream stream);
Task LoadAsync(Stream stream);
I need to pipe data from source WriteTo method to Load method.
Currently next solution is used:
using (var stream = new MemoryStream()) {
await source.WriteToAsync(stream);
stream.Position = 0;
await destination.LoadAsync(stream);
}
Is there any better way?
As the code below demonstrates, you can use pipe streams to stream data from one to the other; note that you should not await the writer until after you have started the reader.
class Program
{
static void Main(string[] args)
{
ReaderDemo rd = new ReaderDemo();
GenPrimes(rd).ContinueWith((t) => {
if (t.IsFaulted)
Console.WriteLine(t.Exception.ToString());
else
Console.WriteLine(rd.value);
}).Wait();
}
static async Task GenPrimes(ReaderDemo rd)
{
using (var pout = new System.IO.Pipes.AnonymousPipeServerStream(System.IO.Pipes.PipeDirection.Out))
using (var pin = new System.IO.Pipes.AnonymousPipeClientStream(System.IO.Pipes.PipeDirection.In, pout.ClientSafePipeHandle))
{
var writeTask = WriterDemo.WriteTo(pout);
await rd.LoadFrom(pin);
await writeTask;
}
}
}
class ReaderDemo
{
public string value;
public Task LoadFrom(System.IO.Stream input)
{
return Task.Run(() =>
{
using (var r = new System.IO.StreamReader(input))
{
value = r.ReadToEnd();
}
});
}
}
class WriterDemo
{
public static Task WriteTo(System.IO.Stream output)
{
return Task.Run(() => {
using (var writer = new System.IO.StreamWriter(output))
{
writer.WriteLine("2");
for (int i = 3; i < 10000; i+=2)
{
int sqrt = ((int)Math.Sqrt(i)) + 1;
int factor;
for (factor = 3; factor <= sqrt; factor++)
{
if (i % factor == 0)
break;
}
if (factor > sqrt)
{
writer.WriteLine("{0}", i);
}
}
}
});
}
}

How can I return from a recursive method and then continue the recursion from the last returned point?

I have this two methods
private List<DirectoryInfo> GetDirectories(string basePath)
{
IEnumerable<string> str = MyGetDirectories(basePath);
List<DirectoryInfo> l = new List<DirectoryInfo>();
l.Add(new DirectoryInfo(basePath));
IEnumerable<DirectoryInfo> dirs = str.Select(a => new DirectoryInfo(a));
l.AddRange(dirs);
return l;
}
And
static int countDirectories = 0;
private IEnumerable<string> MyGetDirectories(string basePath)
{
try
{
string[] dirs = Directory.GetDirectories(basePath);
if (dirs.Length > 0)
return dirs.Union(dirs);
countDirectories = countDirectories + dirs.Length;
_FileInformationWorker.ReportProgress(countDirectories,dirs);
return dirs.Union(dirs.SelectMany(dir => MyGetDirectories(dir)));
}
catch (UnauthorizedAccessException)
{
return Enumerable.Empty<string>();
}
}
And in a backgroundworker dowork this
private void _FileInformationWorker_DoWork(object sender, DoWorkEventArgs e)
{
MySubDirectories = GetDirectories(BasePath).ToArray();
}
Instead of waiting for the method MyGetDirectories to finish, I want to update the variable MySubDirectories in the DoWork event each time the variable dirs changes. In this case, the first time dirs.Length is 36, so I return and see that MySubDirectories contains 36 items. The problem now is that the recursion in MyGetDirectories won't continue. I want it to continue, so that the next time dirs.Length is above 0 (for example 3), MySubDirectories is updated with the 3 new items in dirs.
I don't want to stop the recursion; I just want to keep updating MySubDirectories in real time.
What I have tried so far: in the method MyGetDirectories I am also reporting the dirs variable:
_FileInformationWorker.ReportProgress(countDirectories, dirs);
Then in the ProgressChanged event I did:
List<string[]> testing = new List<string[]>();
private void _FileInformationWorker_ProgressChanged(object sender, ProgressChangedEventArgs e)
{
label2.Text = e.ProgressPercentage.ToString();
string[] test = (string[])e.UserState;
if (test.Length > 0)
testing.Add(test);
}
But now I'm ending up with a List of string[] arrays, and test is a string[].
Is there any way to cast/convert the string[] to DirectoryInfo[] ?
From https://www.microsoft.com/en-us/download/details.aspx?id=14782
// Hand-off through a BufferBlock<T>
private static BufferBlock<int> m_buffer = new BufferBlock<int>();
// Producer
private static void Producer()
{
while(true)
{
int item = Produce();
m_buffer.Post(item);
}
}
// Consumer
private static async Task Consumer()
{
while(true)
{
int item = await m_buffer.ReceiveAsync();
Process(item);
}
}
// Main
public static void Main()
{
var p = Task.Factory.StartNew(Producer);
var c = Consumer();
Task.WaitAll(p,c);
}
This should work:
test.Select(p => new DirectoryInfo(p)).ToArray();
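And since the ProgressChanged handler accumulates a List<string[]>, the whole collection can be flattened into one array in a single pass with SelectMany; a sketch using the testing list from the question (requires System.Linq):

```csharp
// Flatten List<string[]> into a single DirectoryInfo[]
DirectoryInfo[] all = testing
    .SelectMany(paths => paths)          // string[] -> flat sequence of paths
    .Select(p => new DirectoryInfo(p))   // wrap each path
    .ToArray();
```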

In C# Lambda expression, how to determine a matched value?

In the following function (which works perfectly), I've been presented now with the challenge of having it not only return where matches were found, but what the match was... the code:
txtFilePattern is a pipe separated list of file extensions.
txtKeywords is a multiline textbox for keywords I'm looking for
txtPatterns is same as txtKeywords, but for regex patterns.
This is my own little experiment into C# Grep.
private List<Tuple<String, Int32, String>> ScanDocuments2()
{
Regex searchPattern = new Regex(@"$(?<=\.(" + txtFilePattern.Text + "))", RegexOptions.IgnoreCase);
string[] keywordtext = txtKeywords.Lines;
List<string> keywords = new List<string>();
List<Regex> patterns = new List<Regex>();
for (int i = 0; i < keywordtext.Length; i++)
{
if (keywordtext[i].Length > 0)
{
keywords.Add(keywordtext[i]);
}
}
string[] patterntext = txtPatterns.Lines;
for (int j = 0; j < patterntext.Length; j++)
{
if (patterntext[j].Length > 0)
{
patterns.Add(new Regex(patterntext[j]));
}
}
try
{
var files = Directory.EnumerateFiles(txtSelectedDirectory.Text, "*.*", SearchOption.AllDirectories).Where(f => searchPattern.IsMatch(f));
//fileCount = files.Count();
var lines = files.Aggregate(
new List<Tuple<String, Int32, String>>(),
(accumulator, file) =>
{
fileCount++;
using (var reader = new StreamReader(file))
{
var counter = 0;
String line;
while ((line = reader.ReadLine()) != null)
{
if (keywords.Any(keyword => line.ToLower().Contains(keyword.ToLower())) || patterns.Any(pattern => pattern.IsMatch(line)))
{
//cleans up the file path for grid
string tmpfile = file.Replace(txtSelectedDirectory.Text, "..");
accumulator.Add(Tuple.Create(tmpfile, counter, line));
}
counter++;
}
}
return accumulator;
},
accumulator => accumulator
);
return lines;
}
catch (UnauthorizedAccessException UAEx)
{
Console.WriteLine(UAEx.Message);
throw UAEx;
}
catch (PathTooLongException PathEx)
{
Console.WriteLine(PathEx.Message);
throw PathEx;
}
}
The question is - how can I determine pass which keyword or pattern matched to the Tuple I'm returning?
Here's some refactored code. Kenneth had the right idea.
private IEnumerable<LineMatch> ScanDocuments2()
{
string[] keywordtext = txtKeywords.Lines;
string[] patterntext = txtPatterns.Lines;
Regex searchPattern = GetSearchPattern();
var keywords = GetKeywords(keywordtext).ToList();
var patterns = GetPatterns(patterntext).ToList();
try
{
var files = GetFiles(searchPattern);
var lines = files.Aggregate(
new List<LineMatch>(),
(accumulator, file) =>
{
foreach(var item in EnumerateFile(file, keywords, patterns))
{
accumulator.Add(item);
}
return accumulator;
},
accumulator => accumulator
);
return lines;
}
catch (UnauthorizedAccessException UAEx)
{
Console.WriteLine(UAEx.Message);
throw;
}
catch (PathTooLongException PathEx)
{
Console.WriteLine(PathEx.Message);
throw;
}
}
private IEnumerable<LineMatch> EnumerateFile(string file, IEnumerable<string> keywords, IEnumerable<Regex> patterns)
{
var counter = 0;
foreach(var line in File.ReadLines(file))
{
var matchingRegex = patterns.FirstOrDefault(p => p.IsMatch(line));
var keyword = keywords.FirstOrDefault(k => line.ToLower().Contains(k.ToLower()));
if(keyword == null && matchingRegex == null) continue;
string tmpfile = file.Replace(txtSelectedDirectory.Text, "..");
yield return new LineMatch
{
Counter = counter,
File = tmpfile,
Line = line,
Pattern = matchingRegex == null ? null : matchingRegex.ToString(),
Keyword = keyword
};
counter++;
}
}
private IEnumerable<string> GetFiles(Regex searchPattern)
{
return Directory.EnumerateFiles(txtSelectedDirectory.Text, "*.*", SearchOption.AllDirectories).Where(f => searchPattern.IsMatch(f));
}
private IEnumerable<string> GetKeywords(IEnumerable<string> keywordtext)
{
foreach(var keyword in keywordtext)
{
if(keyword.Length <= 0) continue;
yield return keyword;
}
}
private IEnumerable<Regex> GetPatterns(IEnumerable<string> patterntext)
{
foreach(var pattern in patterntext)
{
if(pattern.Length <= 0) continue;
yield return new Regex(pattern);
}
}
private Regex GetSearchPattern()
{
return new Regex(string.Format(@"$(?<=\.({0}))", txtFilePattern.Text), RegexOptions.IgnoreCase);
}
public class LineMatch
{
public int Counter { get; set; }
public string File { get; set; }
public string Line { get; set; }
public string Pattern { get; set; }
public string Keyword { get; set; }
}
How about you introduce a new variable to hold the matching pattern, and you use FirstOrDefault instead of Any. Then, so long as the new variable is not null you have the pattern that matched, and you can return it within your Tuple.
e.g.
...
new List<Tuple<String, Int32, String, Regex>>()
...
while ((line = reader.ReadLine()) != null)
{
Regex matchingReg = patterns.FirstOrDefault(pattern => pattern.IsMatch(line));
if (keywords.Any(keyword => line.ToLower().Contains(keyword.ToLower())) || matchingReg != null)
{
//cleans up the file path for grid
string tmpfile = file.Replace(txtSelectedDirectory.Text, "..");
accumulator.Add(Tuple.Create(tmpfile, counter, line, matchingReg));
}
counter++;
}
...
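The same FirstOrDefault trick works on the keyword side too, so both the matching keyword and the matching pattern can travel in the result; a sketch extending the snippet above (the tuple widens to five elements, and whichever side did not match is null):

```csharp
while ((line = reader.ReadLine()) != null)
{
    Regex matchingReg = patterns.FirstOrDefault(pattern => pattern.IsMatch(line));
    string matchingKeyword = keywords.FirstOrDefault(
        keyword => line.ToLower().Contains(keyword.ToLower()));
    if (matchingKeyword != null || matchingReg != null)
    {
        // cleans up the file path for grid
        string tmpfile = file.Replace(txtSelectedDirectory.Text, "..");
        // accumulator is now a List<Tuple<string, int, string, string, Regex>>
        accumulator.Add(Tuple.Create(tmpfile, counter, line, matchingKeyword, matchingReg));
    }
    counter++;
}
```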

C# File and Directory iteration, possible to do both at once?

This might be a confusing question, but I have written below a directory crawler that starts at a root directory, finds all unique directories, and then finds all files, counts them, and adds up their file sizes. However, the way I have written it requires visiting each directory twice: once to find the directories and a second time to count the files. Is it possible to get all the information in one pass?
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
HashSet<string> DirectoryHolding = new HashSet<string>();
DirectoryHolding.Add(rootDirectory);
#region All Directory Region
int DirectoryCount = 0;
int DirectoryHop = 0;
bool FindAllDirectoriesbool = true;
while (FindAllDirectoriesbool == true)
{
string[] DirectoryHolder = Directory.GetDirectories(rootDirectory);
if (DirectoryHolder.Length == 0)
{
if (DirectoryHop >= DirectoryHolding.Count())
{
FindAllDirectoriesbool = false;
}
else
{
rootDirectory = DirectoryHolding.ElementAt(DirectoryHop);
}
DirectoryHop++;
}
else
{
foreach (string DH in DirectoryHolder)
{
DirectoryHolding.Add(DH);
}
if (DirectoryHop > DirectoryHolding.Count())
{
FindAllDirectoriesbool = false;
}
rootDirectory = DirectoryHolding.ElementAt(DirectoryHop);
DirectoryHop++;
}
}
DirectoryCount = DirectoryHop - 2;
#endregion
#region File Count and Size Region
int FileCount = 0;
long FileSize = 0;
for (int i = 0; i < DirectoryHolding.Count ; i++)
{
string[] DirectoryInfo = Directory.GetFiles(DirectoryHolding.ElementAt(i));
for (int fi = 0; fi < DirectoryInfo.Length; fi++)
{
try
{
FileInfo fInfo = new FileInfo(DirectoryInfo[fi]);
FileCount++;
FileSize = FileSize + fInfo.Length;
}
catch (Exception ex)
{
Console.WriteLine(ex.Message.ToString());
}
}
}
The stopwatch result for this is 1.38
int FileCount = 0;
long FileSize = 0;
for (int i = 0; i < DirectoryHolding.Count; i++)
{
var entries = new DirectoryInfo(DirectoryHolding.ElementAt(i)).EnumerateFileSystemInfos();
foreach (var entry in entries)
{
if ((entry.Attributes & FileAttributes.Directory) == FileAttributes.Directory)
{
DirectoryHolding.Add(entry.FullName);
}
else
{
FileCount++;
FileSize = FileSize + new FileInfo(entry.FullName).Length;
}
}
}
The stopwatch result for this method is 2.01, which makes no sense to me.
DirectoryInfo Dinfo = new DirectoryInfo(rootDirectory);
DirectoryInfo[] directories = Dinfo.GetDirectories("*.*", SearchOption.AllDirectories);
FileInfo[] finfo = Dinfo.GetFiles("*.*", SearchOption.AllDirectories);
foreach (FileInfo f in finfo)
{
FileSize = FileSize + f.Length;
}
FileCount = finfo.Length;
DirectoryCount = directories.Length;
0.26 seconds. I think this is the winner.
You can use Directory.EnumerateFileSystemEntries():
var entries = Directory.EnumerateFileSystemEntries(rootDirectory);
foreach (var entry in entries)
{
if(File.Exists(entry))
{
//file
}
else
{
//directory
}
}
Or alternatively DirectoryInfo.EnumerateFileSystemInfos() (this might be more performant since FileSystemInfo already has most of the info you need and you can skip the File.Exists check):
var entries = new DirectoryInfo(rootDirectory).EnumerateFileSystemInfos();
foreach (var entry in entries)
{
if ((entry.Attributes & FileAttributes.Directory) == FileAttributes.Directory)
{
//directory
}
else
{
//file
}
}
The usual approach is to write a recursive method. Here it is in pseudocode:
void ProcessDirectory(Dir directory)
{
foreach (var file in directory.Files)
ProcessFile(file);
foreach (var child in directory.Subdirectories)
ProcessDirectory(child);
}
You can also reverse the order of the foreach loops. For example, to calculate the total size of all files with a recursive method, you could do this:
ulong GetTotalFileSize(Dir directory)
{
ulong result = 0UL;
foreach (var child in directory.Subdirectories)
result += GetTotalFileSize(child);
foreach (var file in directory.Files)
result += file.Length;
return result;
}
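Filled in with real BCL types, the pseudocode might look like this (a sketch; long is used because FileInfo.Length is a long, and the catch skips directories the process cannot read):

```csharp
static long GetTotalFileSize(DirectoryInfo directory)
{
    long result = 0L;
    try
    {
        foreach (var child in directory.GetDirectories())
            result += GetTotalFileSize(child);   // recurse into each subdirectory
        foreach (var file in directory.GetFiles())
            result += file.Length;               // add each file's size in bytes
    }
    catch (UnauthorizedAccessException)
    {
        // skip directories we are not allowed to enumerate
    }
    return result;
}
```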
