Find First File in a Branching Directory - c#

I'm trying to find the first .dcm file in each directory of a directory tree and get its full path (for example a/a/a/123.dcm), ignoring directories that contain no .dcm files.
example:
a/a/a/123.dcm
a/a/a/1234.dcm
a/a/a/12345.dcm
a/a/b/23.dcm
a/a/b/234.dcm
a/a/b/2345.dcm
a/a/c/23.dcm
a/a/c/234.dcm
a/a/c/2345.dcm
Answer should be: a/a/a/123.dcm, a/a/b/23.dcm and a/a/c/23.dcm
I tried:
var files = Directory.GetFiles(inputDir, "*.*", SearchOption.AllDirectories)
.Where(s => s.EndsWith(".dcm")).ToArray();
var dir = Directory.GetDirectories(inputDir, "*.*", SearchOption.AllDirectories).ToArray();
var biggest = files.First();
foreach (var item in dir)
{
DirectoryInfo di = new DirectoryInfo(item);
var q = from i in di.GetFiles("*.dcm", SearchOption.AllDirectories)
select i.Name;
var qq = q.First();
foreach (var items in qq)
{
Console.WriteLine(items);
}
}
However what I get is the answer for five directories. Answer:
a/a/a/123.dcm
a/a/a/123.dcm
a/a/a/123.dcm
a/a/b/23.dcm
a/a/c/23.dcm
I’m just wondering if there’s a simpler way to do this using LINQ or something else? Thank you so much for your help. Cheers.

Here's a LINQ version:
var inputDir = @"c:\temp";
var files = Directory
    .EnumerateFiles(inputDir, "*.dcm", SearchOption.AllDirectories)
    .Select(f => new FileInfo(f))
    .GroupBy(f => f.Directory.FullName, d => d, (d, f) => new { Directory = d, FirstFile = f.First() })
    .ToList();
files.ForEach(f => Console.WriteLine("{0} {1}", f.Directory, f.FirstFile));
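Note that the order in which files are enumerated is not guaranteed, so "first" in the grouping above means whichever file the file system happens to return first. A minimal sketch, assuming "first" should mean first by ordinal file name; the class DcmFinder and the helper FirstFilePerDirectory are hypothetical names, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class DcmFinder
{
    // Hypothetical helper: maps each directory that contains at least one
    // match to the first matching file in it, ordered by file name.
    public static Dictionary<string, string> FirstFilePerDirectory(string root, string pattern)
    {
        return Directory
            .EnumerateFiles(root, pattern, SearchOption.AllDirectories)
            .GroupBy(path => Path.GetDirectoryName(path))
            .ToDictionary(
                g => g.Key,
                g => g.OrderBy(p => Path.GetFileName(p), StringComparer.Ordinal).First());
    }
}
```

Directories without a match never produce a group, so they are skipped automatically. With the file names from the question, ordinal ordering picks 123.dcm and 23.dcm in their respective folders.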

Looking for folders in multiple extension and multiple string format

Hi, I am trying to get all the files that match a set of extensions and a set of name formats:
string extensions = ".exe,.txt,.xls";
string fileFormat = "fileA, fileB, fileC";
Let's say I have the following files in the folder:
fileA20200805.txt
fileBxxxx.exe
FileCCCCCCC.txt
FileD123.xls
The result should contain only the first three files:
fileA20200805.txt
fileBxxxx.exe
FileCCCCCCC.txt
because FileD123.xls is not in the fileFormat.
I have tried the following code:
DirectoryInfo dInfo = new DirectoryInfo(path);
FileInfo[] files = dInfo.GetFiles()
    .Where(f => extensions.Contains(f.Extension.ToLower()) && fileFormat.Any(f.Name.Contains))
    .ToArray();
However, I am still getting all 4 files; FileD123.xls is still returned.
Maybe
var extensions = new [] {".exe",".txt",".xls"};
var fileFormat = new [] {"fileA", "fileB", "fileC"};
...
.Where(f =>
extensions.Contains(f.Extension.ToLower()) &&
fileFormat.Any(x => f.Name.StartsWith(x, StringComparison.OrdinalIgnoreCase)))
You could also use a regex, I guess:
var pattern = "^(" + string.Join("|", fileFormat) + ")[^.]*(" + string.Join("|", extensions.Select(Regex.Escape)) + ")$";
var regex = new Regex(pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
...
.Where(f => regex.IsMatch(f.Name))
I think this should work.
string[] extensions = new string[] { ".exe",".txt",".xls" };
string[] fileFormat = new string[] { "fileA", "fileB", "fileC" };
DirectoryInfo dInfo = new DirectoryInfo(path);
FileInfo[] files = dInfo.GetFiles();
var output = files.Where(f => extensions.Contains(f.Extension.ToLower()) &&
fileFormat.Any(f.Name.Contains)).ToArray();
It returns only 2 files, because the comparison is case-sensitive and FileCCCCCCC.txt does not contain fileC.
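If the name fragments may appear anywhere in the file name and case should not matter, a case-insensitive containment check also matches FileCCCCCCC.txt. A sketch under those assumptions; the class NameFilter and the method IsWanted are hypothetical names. IndexOf with a StringComparison argument is used here because it is available on .NET Framework as well as newer runtimes:

```csharp
using System;
using System.IO;
using System.Linq;

static class NameFilter
{
    static readonly string[] Extensions = { ".exe", ".txt", ".xls" };
    static readonly string[] Fragments = { "fileA", "fileB", "fileC" };

    // Hypothetical helper: true when the file name has an allowed extension
    // and contains one of the fragments, ignoring case in both checks.
    public static bool IsWanted(string fileName)
    {
        string ext = Path.GetExtension(fileName);
        return Extensions.Contains(ext, StringComparer.OrdinalIgnoreCase)
            && Fragments.Any(p => fileName.IndexOf(p, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

It could then be plugged into the query as .Where(f => NameFilter.IsWanted(f.Name)).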

Find multiple files in the same directory

I'm trying to find, given a path, a list of files that have the same filename but different extensions (.bak and .dwg) in the same directory.
I have this code:
String[] FileNames = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories).Where(s => s.EndsWith(".bak") || s.EndsWith(".dwg")).ToArray();
var queryDupNames = from f in FileNames
group f by Path.GetFileNameWithoutExtension(f) into g
where g.Count() > 1
select new { Name = g.Key, FileNames = g };
This works well for locating files with the same filename anywhere in the tree, but I only need those that are in the same directory.
For example:
- Dir1\filename1.bak
- Dir1\filename1.dwg
- Dir1\filename2.bak
- Dir1\filename2.dwg
- Dir1\filename3.dwg
- DiferentDir\filename1.bak
- DiferentDir\filename1.dwg
- DiferentDir\filename3.dwg
The result should be:
- Dir1\filename1.bak
- Dir1\filename1.dwg
- Dir1\filename2.bak
- Dir1\filename2.dwg
- DiferentDir\filename1.bak
- DiferentDir\filename1.dwg
But with my code, filename3 is also included, because
g.Count() > 1
is true for it: the grouping is by filename only. I tried to fix it with this code, but I got 0 results:
String[] FileNames = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories).Where(s => s.EndsWith(".bak") || s.EndsWith(".dwg")).ToArray();
var queryDupNames = from f in FileNames
group f by new { path = Path.GetLongPath(f), filen = Path.GetFileNameWithoutExtension(f) } into g
where g.Count() > 1
select new { Name = g.Key, FileNames = g };
Any help or clue?
Thanks
System.IO.Path doesn't have a GetLongPath method. I suspect you are using an external library like AlphaFS. In any case, GetLongPath returns the full file path, not the path of the file's folder.
The file's folder path is returned by GetDirectoryName, both in System.IO and in other libraries like AlphaFS. The following snippet returns only Dir1\filename1, Dir1\filename2 and DiferentDir\filename1:
var files = new[]
{
    @"c:\Dir1\filename1.bak",
    @"c:\Dir1\filename1.dwg",
    @"c:\Dir1\filename2.bak",
    @"c:\Dir1\filename2.dwg",
    @"c:\Dir1\filename3.dwg",
    @"c:\DiferentDir\filename1.bak",
    @"c:\DiferentDir\filename1.dwg",
    @"c:\DiferentDir\filename3.dwg",
};
var duplicates = from file in files
group file by new
{
Folder = Path.GetDirectoryName(file),
Name = Path.GetFileNameWithoutExtension(file)
} into g
where g.Count()>1
select new
{
Name = g.Key,
Files = g.ToArray()
};
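The same grouping can be run against a real directory tree instead of a hard-coded array. A sketch, assuming only .bak and .dwg files are of interest and that names should be compared case-insensitively (as on a typical Windows file system); the class DuplicateFinder and the helper FindPairs are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class DuplicateFinder
{
    // Hypothetical helper: returns groups of .bak/.dwg files that share
    // both a folder and a base name.
    public static List<string[]> FindPairs(string root)
    {
        var wanted = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".bak", ".dwg" };
        return Directory
            .EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .Where(f => wanted.Contains(Path.GetExtension(f)))
            // Folder plus base name forms the grouping key.
            .GroupBy(f => Path.Combine(Path.GetDirectoryName(f), Path.GetFileNameWithoutExtension(f)),
                     StringComparer.OrdinalIgnoreCase)
            .Where(g => g.Count() > 1)
            .Select(g => g.ToArray())
            .ToList();
    }
}
```

Unlike the EnditsWith-style string checks in the question, Path.GetExtension with a case-insensitive set also accepts extensions such as .BAK.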
First find all folders, then for each folder find all the files with the same name but different extensions. Something like this:
var list = new List<string>();
foreach (var subDirectory in Directory.EnumerateDirectories(@"C:\Temp"))
{
    var files = Directory.EnumerateFiles(subDirectory);
    var repeated = files.Select(Path.GetFileNameWithoutExtension)
        .GroupBy(x => x)
        .Where(g => g.Count() > 1)
        .Select(y => y.Key);
    list.AddRange(repeated);
}
Tested on .NET 4.6.

filter files by creation date using c#

With the following code, I can merge the text files that exist in a directory:
var allLines = Directory.GetFiles(directory, "*.txt")
    .SelectMany(f => File.ReadLines(f));
File.WriteAllLines(outputFileName, allLines);
How can I modify it to merge only the files that were created today?
DirectoryInfo info = new DirectoryInfo("");
FileInfo[] files = info.GetFiles().OrderBy(p => p.CreationTime).ToArray();
How can I combine the two pieces of code?
DirectoryInfo info = new DirectoryInfo(directory);
var allLines = info.GetFiles("*.txt")
.Where(p => p.CreationTime.Date == DateTime.Today)
.OrderBy(p => p.CreationTime)
.SelectMany(p => File.ReadAllLines(p.FullName));
File.WriteAllLines(outputFileName, allLines);
Use File.GetCreationTime method:
var allLines = Directory.EnumerateFiles(directory, "*.txt", SearchOption.TopDirectoryOnly)
.Where(path => File.GetCreationTime(path).Date == DateTime.Today)
.SelectMany(f => File.ReadLines(f));
Try this:
var allLines = Directory.GetFiles(directory, "*.txt")
.Where(x => new FileInfo(x).CreationTime.Date == DateTime.Today.Date)
.SelectMany(f => File.ReadLines(f));

recursively scan all the directories under the root directory and find only the newest file from each folder

This is what I tried, but it returns only the newest file from each of the top-level directories under the root:
if(Directory.Exists("YourPath"))
foreach (string _tempFiles in Directory.GetDirectories("YourPath")
.Select(directory => Directory.GetFiles(directory, "*.*", SearchOption.AllDirectories)
.OrderByDescending(File.GetLastWriteTime)
.FirstOrDefault()))
This returns the newest file of each directory (including the root):
var rootDirFile = Directory
    .EnumerateFiles(yourPath, "*.*", SearchOption.TopDirectoryOnly)
    .OrderByDescending(f => File.GetLastWriteTime(f))
    .Take(1);
var allNewestFilesOfEachFolder = Directory
    .EnumerateDirectories(yourPath, "*.*", SearchOption.AllDirectories)
    .Select(d => Directory.EnumerateFiles(d, "*.*")
        .OrderByDescending(f => File.GetLastWriteTime(f))
        .FirstOrDefault());
// put both together, the root-file first
allNewestFilesOfEachFolder = rootDirFile.Concat(allNewestFilesOfEachFolder);
If there's no file in a directory the file is null, so the number of files is equal to the number of folders.
Note that Linq is not the right tool for System.IO since error-handling is difficult.
I wrote a basic recursive function to handle this:
// Dictionary:
// Key = The directory name.
// Value = The most recently modified file for that directory.
public static Dictionary<string, string> GetNewestFiles(string directory)
{
return GetNewestFiles(directory, null);
}
static Dictionary<string, string> GetNewestFiles(string directory,
Dictionary<string, string> dictionary)
{
if(dictionary == null)
dictionary = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
try
{
var files = from file in Directory.GetFiles(directory)
select new FileInfo(file);
var latestFile = files.OrderByDescending(file => { return file.LastWriteTimeUtc; }).FirstOrDefault();
if (latestFile != null)
dictionary[latestFile.DirectoryName] = latestFile.FullName;
}
catch { }
foreach (var subDirectory in Directory.GetDirectories(directory))
{
try
{
GetNewestFiles(subDirectory, dictionary);
}
catch { }
}
return dictionary;
}
So then you can just call it like so:
var fileDictionary = GetNewestFiles(@"C:\MyFolder");

Directory.GetFiles get today's files only

There is a nice function in .NET, Directory.GetFiles, which is simple to use when I need to get all files from a directory:
Directory.GetFiles("c:\\Files")
But how (with what pattern) can I get only the files that were created today, if there are a lot of files with different creation times?
Thanks!
For performance, especially if the directory search is likely to be large, the use of Directory.EnumerateFiles(), which lazily enumerates over the search path, is preferable to Directory.GetFiles(), which eagerly enumerates over the search path, collecting all matches before filtering any:
DateTime today = DateTime.Now.Date;
FileInfo[] todaysFiles = new DirectoryInfo(@"c:\foo\bar")
    .EnumerateFiles()
    .Select(x => {
        x.Refresh();
        return x;
    })
    // Compare dates only; LastWriteTime includes a time of day.
    .Where(x => x.CreationTime.Date == today || x.LastWriteTime.Date == today)
    .ToArray();
Note that the properties of FileSystemInfo and its subtypes can be (and are) cached, so they do not necessarily reflect the current state of the file system. Hence the call to Refresh() to ensure the data is correct.
Try this:
var todayFiles = Directory.GetFiles("path_to_directory")
.Where(x => new FileInfo(x).CreationTime.Date == DateTime.Today.Date);
You need to get the FileInfo for each file:
public List<String> getTodaysFiles(String folderPath)
{
    List<String> todaysFiles = new List<String>();
    foreach (String file in Directory.GetFiles(folderPath))
    {
        // FileInfo (not DirectoryInfo) is the type that describes a file.
        FileInfo fi = new FileInfo(file);
        if (fi.CreationTime.Date == DateTime.Today)
            todaysFiles.Add(file);
    }
    return todaysFiles;
}
You could use this code:
var directory = new DirectoryInfo("C:\\MyDirectory");
var myFile = (from f in directory.GetFiles()
orderby f.LastWriteTime descending
select f).First();
// or...
var myFile = directory.GetFiles()
.OrderByDescending(f => f.LastWriteTime)
.First();
see here: How to find the most recent file in a directory using .NET, and without looping?
using System.Linq;
DirectoryInfo info = new DirectoryInfo("");
FileInfo[] files = info.GetFiles().OrderBy(p => p.CreationTime).ToArray();
foreach (FileInfo file in files)
{
// DO Something...
}
If you want to narrow it down to files created after a specific date, you could filter like this:
var files = from c in directoryInfo.GetFiles()
            where c.CreationTime > dateFilter
            select c;
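The comparison above keeps every file created after dateFilter. To keep only files created on one particular calendar day, a half-open range from midnight to the next midnight can be used. A sketch, assuming local time is what matters; the class DateFilter and the helper CreatedOn are hypothetical names:

```csharp
using System;
using System.IO;
using System.Linq;

static class DateFilter
{
    // Hypothetical helper: files directly inside 'folder' whose creation
    // time falls on the calendar day of 'day' (local time).
    public static FileInfo[] CreatedOn(string folder, DateTime day)
    {
        DateTime start = day.Date;        // midnight at the start of the day
        DateTime end = start.AddDays(1);  // midnight at the start of the next day
        return new DirectoryInfo(folder)
            .EnumerateFiles()
            .Where(f => f.CreationTime >= start && f.CreationTime < end)
            .ToArray();
    }
}
```

Calling DateFilter.CreatedOn(path, DateTime.Today) would then return today's files only.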
Something like this should work:
var loc = new DirectoryInfo("C:\\");
var currentDate = DateTime.Today.ToString("dd/MM/yyyy");
var fileList = loc.GetFiles().Where(x => x.CreationTime.ToString("dd/MM/yyyy") == currentDate);
foreach (FileInfo fileItem in fileList)
{
//Process the file
}
var directory = new DirectoryInfo(Path.GetDirectoryName(@"--DIR Path--"));
DateTime from_date = DateTime.Now.AddDays(-5);
DateTime to_date = DateTime.Now.AddDays(5);
//For Today
var filesLst = directory.GetFiles().AsEnumerable()
    .Where(file => file.CreationTime.Date == DateTime.Now.Date).ToArray();
//For date range + specific file extension
var filesLst = directory.GetFiles().AsEnumerable()
.Where(file => file.CreationTime.Date >= from_date.Date && file.CreationTime.Date <= to_date.Date && file.Extension == ".txt").ToArray();
//To get ReadOnly files from directory
var filesLst = directory.GetFiles().AsEnumerable()
.Where(file => file.IsReadOnly == true).ToArray();
//To get files based on it's size
int fileSizeInKB = 100;
var filesLst = directory.GetFiles().AsEnumerable()
.Where(file => (file.Length)/1024 > fileSizeInKB).ToArray();
