DirectoryInfo.GetFiles with multiple filters - c#

I am trying to get a list of FileInfo objects that satisfy multiple filters.
Every suggestion I have seen uses an array of file names/paths instead of FileInfo:
var files = Directory.GetFiles(sLogPath, "*.*", SearchOption.TopDirectoryOnly)
.Where(s => s.StartsWith("abc", StringComparison.CurrentCultureIgnoreCase) || s.StartsWith("def", StringComparison.CurrentCultureIgnoreCase));
What I am trying to get is:
DirectoryInfo di = new DirectoryInfo(sLogPath);
var files = di.GetFiles(<same filter as above>);
But it looks like I can only do something like:
var files = di.GetFiles("*_" + dateStr + ".log");

Based on your comment to me on your question, it looks like you want to filter on file names, but get the FileInfos that correspond to these names.
You can do something like this
var di = new DirectoryInfo(sLogPath);
var files = di
.GetFiles("*.*", SearchOption.TopDirectoryOnly)
.Where(x => x.Name.StartsWith("abc", StringComparison.CurrentCultureIgnoreCase)
|| x.Name.StartsWith("def", StringComparison.CurrentCultureIgnoreCase))
.ToList();
We're using the Name property in the filter and working with the FileInfo[] array returned by DirectoryInfo.GetFiles().
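If the filters can be written as wildcards, another option is to run one search per pattern and merge the results. The following is only a sketch under that assumption and is not part of the answer above: GetFilesMatchingAny is a hypothetical helper name, and the temp directory and file names exist purely for the demo. It is written as a top-level program, which needs C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: one wildcard search per pattern, results merged.
static List<FileInfo> GetFilesMatchingAny(DirectoryInfo dir, params string[] patterns)
{
    return patterns
        .SelectMany(p => dir.EnumerateFiles(p, SearchOption.TopDirectoryOnly))
        .GroupBy(f => f.FullName, StringComparer.Ordinal) // a file may match several patterns
        .Select(g => g.First())
        .ToList();
}

// Demo data in a throwaway temp directory (names are made up).
var root = Directory.CreateDirectory(
    Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()));
foreach (var name in new[] { "abc_1.log", "def_2.log", "xyz_3.log" })
    File.WriteAllText(Path.Combine(root.FullName, name), "");

var matches = GetFilesMatchingAny(root, "abc*", "def*");
foreach (var f in matches.OrderBy(f => f.Name))
    Console.WriteLine(f.Name);

root.Delete(true); // clean up the demo directory
```

Wildcard matching follows the platform's case rules, so the StartsWith filter with an explicit StringComparison remains the better fit when case handling matters.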

Related

How to filter files with date from a folder in C#

I have c# code to get the pdf files from a folder.
string[] pdfFiles = Directory.GetFiles(EnvSettingsTools.FilePath.ToString(), "*.pdf")
.Select(path => Path.GetFileNameWithoutExtension(path))
.ToArray();
How can I use this code to filter the files by date? The file name format will be
LondoPage 20160301.pdf
I need to filter the files by the date at the end of the file name, i.e. if I pass the date '20160301', the file above should be selected.
Try
var date = "20160301";
string[] pdfFiles = Directory.GetFiles(EnvSettingsTools.FilePath.ToString(), "*.pdf")
.Select(path => Path.GetFileNameWithoutExtension(path))
.Where(f => f.EndsWith(date))
.ToArray();
This will get the files whose names end with the date you pass as the filter:
var date = "20160301";
string[] pdfFiles = Directory.GetFiles(EnvSettingsTools.FilePath.ToString(), "*.pdf")
.Select(Path.GetFileNameWithoutExtension)
.Where(f => f.EndsWith(date))
.ToArray();
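If the date arrives as a DateTime rather than a string, it can be formatted to the same yyyyMMdd shape before filtering. This is a small sketch under that assumption: the file names below are made-up stand-ins for the Directory.GetFiles result, and the variable names are illustrative. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical file names standing in for Directory.GetFiles(folder, "*.pdf")
var paths = new[] { "LondoPage 20160301.pdf", "LondoPage 20160302.pdf", "Notes.pdf" };

// Format the date the same way it appears in the file names.
var wanted = new DateTime(2016, 3, 1);
string suffix = wanted.ToString("yyyyMMdd", CultureInfo.InvariantCulture);

var pdfFiles = paths
    .Select(p => Path.GetFileNameWithoutExtension(p))
    .Where(name => name.EndsWith(suffix, StringComparison.Ordinal))
    .ToArray();

Console.WriteLine(string.Join(", ", pdfFiles));
```

Using the invariant culture keeps the formatted suffix independent of the machine's regional settings.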
I did not quite understand your desired outcome, but I will make an educated guess that you need to order or filter the folder contents. That can be achieved in more than one way; here is one solution:
if (!Directory.Exists(pathToFiles)) return;
DirectoryInfo di = new DirectoryInfo(pathToFiles);
FileSystemInfo[] files = di.GetFileSystemInfos();
var orderedFiles = files.OrderBy(f => f.CreationTime);
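// Compare against the file name only; IndexOf with OrdinalIgnoreCase gives a case-insensitive "contains"
var filteredFileByDate = orderedFiles.Where(f => f.Name.IndexOf("filterText", StringComparison.OrdinalIgnoreCase) >= 0);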
With the ordered files you will receive all files ordered by a particular property; in this case I have chosen the creation date.
With the filtered files you will receive a collection of one or more files that match your criteria. The more precise your criteria, the fewer results you get.
Hope that it helps ;)

GetDirectories - Find the directory that does not match the pattern

I have 140 directories that I'm trying to process. According to my tests there are 139 directories that match my file pattern (*abc.txt).
I'm trying to find the 1 directory to verify that in fact it does not have a *abc.txt in it.
How can I do this?
The following code gives me the 140 directories number:
var directoryCount = from subdirectory in Directory.GetDirectories(paramStartFilePath, "*", SearchOption.AllDirectories)
where Directory.GetDirectories(subdirectory).Length == 0
select subdirectory;
I'm gathering the files based off the pattern like this:
dirInfoFiles = new DirectoryInfo(startFilePath);
IEnumerable<FileInfo> listFiles = dirInfoFiles.EnumerateFiles("*abc.txt", System.IO.SearchOption.AllDirectories);
How can I find the one directory that doesn't contain my .txt file?
There is always the running the tank through the village approach: just enumerate *.* and then exclude the patterns that don't match.
If you want all directories that do not contain at least one .txt file whose name ends with "abc":
IEnumerable<DirectoryInfo> matchingDirs = dirInfoFiles.EnumerateDirectories("*.*", System.IO.SearchOption.AllDirectories)
.Where(d => !d.EnumerateFiles().Any(f => f.Extension.Equals(".txt", StringComparison.OrdinalIgnoreCase)
// f.Name includes the extension, so strip it before checking the "abc" suffix
&& Path.GetFileNameWithoutExtension(f.Name).EndsWith("abc", StringComparison.OrdinalIgnoreCase)));
or the same in other words, possibly more readable:
IEnumerable<DirectoryInfo> matchingDirs = dirInfoFiles
.EnumerateDirectories("*.*", System.IO.SearchOption.AllDirectories)
.Where(d => !d.EnumerateFiles("*abc.txt").Any());
Here is my take. It returns the first directory (or null) that contains no file ending with the text you are looking for, and the comparison is case insensitive. You could extract the lambdas into named methods to make it more readable.
var directory = Directory.GetDirectories(paramStartFilePath, "*", SearchOption.AllDirectories)
.FirstOrDefault(x => !new DirectoryInfo(x).EnumerateFiles().Any(f => f.Name.EndsWith("abc.txt", true, CultureInfo.CurrentCulture)));

How to exclude folders when using Directory.GetDirectories

I want to return a list of all the subdirectories in the 'SomeFolder' directory excluding the 'Admin' and 'Templates' directories.
I have the following folder structure (simplified):
C:\inetpub\wwwroot\MyWebsite\SomeFolder\RandomString
C:\inetpub\wwwroot\MyWebsite\SomeFolder\RandomString
C:\inetpub\wwwroot\MyWebsite\SomeFolder\RandomString
C:\inetpub\wwwroot\MyWebsite\SomeFolder\Admin
C:\inetpub\wwwroot\MyWebsite\SomeFolder\Templates
'SomeFolder' can contain a varying number of 'RandomString' folders (anywhere from ~10 to ~100).
Here is what I have tried:
var dirs = Directory.GetDirectories(Server.MapPath(".."))
.Where(s => !s.EndsWith("Admin") || !s.EndsWith("Templates"));
foreach (string dir in dirs)
{
lit.Text += Environment.NewLine + dir;
}
This returns the full list of folders (shown above) without 'Admin' and 'Templates' filtered out.
Interestingly, if I change the LINQ .Where clause to include, instead of exclude, 'Admin' and 'Templates' it works, meaning it returns just the paths for 'Admin' and 'Templates'.
.Where(s => s.EndsWith("Admin") || s.EndsWith("Templates"));
If LINQ is not the solution, is there any way to use the GetDirectories SearchPattern to filter out directories?
You can do something like:
//list your excluded dirs
private List<string> _excludedDirectories= new List<string>() { "Admin", "Templates" };
//method to check
static bool isExcluded(List<string> excludedDirList, string target)
{
return excludedDirList.Any(d => new DirectoryInfo(target).Name.Equals(d));
}
//then use this
var filteredDirs = Directory.GetDirectories(path).Where(d => !isExcluded(_excludedDirectories, d));
The opposite of (A || B) is (!A && !B), so your code should use && rather than ||.
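To see the difference, the three filters can be compared on a small in-memory list. This is only an illustrative sketch: the paths are hypothetical stand-ins for what Directory.GetDirectories would return, and the variable names are made up. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical paths standing in for Directory.GetDirectories(...) output
var dirs = new List<string>
{
    @"C:\SomeFolder\RandomA",
    @"C:\SomeFolder\Admin",
    @"C:\SomeFolder\Templates",
    @"C:\SomeFolder\RandomB",
};

// Original filter: every path fails at least one of the two EndsWith tests,
// so the || condition is true for all of them and nothing is removed.
var broken = dirs
    .Where(s => !s.EndsWith("Admin") || !s.EndsWith("Templates"))
    .ToList();

// Corrected filter: keep a path only if it ends with neither name.
var kept = dirs
    .Where(s => !s.EndsWith("Admin") && !s.EndsWith("Templates"))
    .ToList();

// Equivalent form: negate the whole "is excluded" condition.
var keptAlt = dirs
    .Where(s => !(s.EndsWith("Admin") || s.EndsWith("Templates")))
    .ToList();

Console.WriteLine(broken.Count);
Console.WriteLine(string.Join(", ", kept));
Console.WriteLine(kept.SequenceEqual(keptAlt));
```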

Creating a list of docs that contains same name

I'm creating a tool that is supposed to concatenate docs that contain the same name.
example: C_BA_20000_1.pdf and C_BA_20000_2.pdf
These files should be grouped in one list.
The tool runs on a directory, let's say:
//directory of pdf files
DirectoryInfo dirInfo = new DirectoryInfo(@"C:\Users\derp\Desktop");
FileInfo[] fileInfos = dirInfo.GetFiles("*.pdf");
foreach (FileInfo info in fileInfos)
I want to create an ArrayList that contains filenames of the same name
ArrayList list = new ArrayList();
list.Add(info.FullName);
and then have a list that contains all the ArrayLists of similar docs.
List<ArrayList> bigList = new List<ArrayList>();
So my question is: how can I group files that share the same name and put them in the same list?
EDIT:
Files have the same pattern in their names AB_CDEFG_i
where i is a number and can be from 1-n. Files with the same name should have only different number at the end.
AB_CDEFG_1
AB_CDEFG_2
HI_JKLM_1
Output should be:
List 1: AB_CDEFG_1 and AB_CDEFG_2
List 2: HI_JKLM_1
Create a method that extracts the 'same' part of the file name, e.g.:
public string GetRawName(string fileName)
{
int index = fileName.LastIndexOf("_");
return fileName.Substring(0, index);
}
And use this method for grouping:
var bigList = Directory.EnumerateFiles(@"C:\Users\derp\Desktop", "*.pdf")
.GroupBy(file => GetRawName(file))
.Select(g => g.ToList())
.ToList();
This will return List<List<string>> (without ArrayList).
UPDATE: Here is a regular expression that works with all kinds of file names, whether or not they have a number at the end:
public string GetRawName(string file)
{
string name = Path.GetFileNameWithoutExtension(file);
return Regex.Replace(name, @"(_\d+)?$", "");
}
Grouping:
var bigList = Directory.EnumerateFiles(@"C:\Users\derp\Desktop", "*.pdf")
.GroupBy(GetRawName)
.Select(g => g.ToList())
.ToList();
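The regular-expression key can be tried without touching the disk. The sketch below uses made-up file names based on the examples in the question, plus one name without a numeric suffix; GetGroupKey is an illustrative name for the same idea as GetRawName. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Strips the extension and an optional trailing "_<digits>" to get the shared part.
static string GetGroupKey(string file)
{
    string name = Path.GetFileNameWithoutExtension(file);
    return Regex.Replace(name, @"(_\d+)?$", "");
}

// Hypothetical file names standing in for Directory.EnumerateFiles(...)
var files = new[] { "AB_CDEFG_1.pdf", "AB_CDEFG_2.pdf", "HI_JKLM_1.pdf", "README.pdf" };

var bigList = files
    .GroupBy(GetGroupKey)
    .Select(g => g.ToList())
    .ToList();

foreach (var group in bigList)
    Console.WriteLine(string.Join(" and ", group));
```

GroupBy yields groups in the order their keys first appear, so the lists come out in the same order as the input files.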
It sounds like the difficulty is in deciding which files are the same.
static string KeyFromFileName(string file)
{
// Convert from "C_BA_20000_2" to "C_BA_20000"
return file.Substring(0, file.LastIndexOf("_"));
// Note: This assumes there is an _ in the filename.
}
Then you can use this LINQ to build a list of fileSets.
using System.Linq; // Near top of file
// Directory.GetFiles already returns full path strings, so no projection is needed
var files = Directory.GetFiles(@"C:\Users\derp\Desktop", "*.pdf");
var fileSets = files
.GroupBy(KeyFromFileName)
.Select(g => new { g.Key, Files = g.ToList() })
.ToList();
Your question doesn't define what "same name" means, but this is a typical solution:
fileInfos.GroupBy ( f => f.FullName )
.Select( grp => grp.ToList() ).ToList();
This will get you a list of lists... also won't throw an exception if a file doesn't contain the underscore, etc.
private string GetKey(FileInfo fi)
{
var index = fi.Name.LastIndexOf('_');
return index == -1 ? Path.GetFileNameWithoutExtension(fi.Name)
: fi.Name.Substring(0, index);
}
var bigList = fileInfos.GroupBy(GetKey)
.Select(x => x.ToList())
.ToList();

Using the linq search option in enumerate files

Quick one here. I am trying to EnumerateFiles in a C# application and I want to find all the files in a directory that do not match a given pattern. So I would have something like this:
var files = Directory.EnumerateFiles("MY_DIR_PATH", "NOT_MY_FILE_NAME");
Can someone help me out with the not part?
I don't think you can use that overload of EnumerateFiles for this, but you can use linq:
// EnumerateFiles returns full paths, so compare only the file name part
Directory.EnumerateFiles("MY_DIR_PATH").Where(s => Path.GetFileName(s) != "NOT_MY_FILE_NAME");
or in query syntax:
var files = from f in Directory.EnumerateFiles("MY_DIR_PATH")
where Path.GetFileName(f) != "NOT_MY_FILE_NAME"
select f;
You can do something like this:
var files = Directory.EnumerateFiles("MY_DIR_PATH")
.Where(fileName => Path.GetFileName(fileName) != "MY_FILE_NAME");
How about
var files = Directory.GetFiles("MY_DIR_PATH")
.Where(f => !f.Contains("NOT_MY_FILE_NAME"));
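If the thing to exclude is a wildcard pattern rather than a single name, one possible approach is to enumerate the directory twice and subtract the matches. This is a sketch under that assumption, not something from the answers above: the temp directory, the file names, and the "*.tmp" pattern are all made up for the demo. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.IO;
using System.Linq;

// Demo data in a throwaway temp directory (names are hypothetical).
var dir = Directory.CreateDirectory(
    Path.Combine(Path.GetTempPath(), Path.GetRandomFileName())).FullName;
foreach (var name in new[] { "report.txt", "notes.txt", "scratch.tmp" })
    File.WriteAllText(Path.Combine(dir, name), "");

// Everything in the directory, minus whatever matches the unwanted pattern.
// Both calls return paths built from the same directory string, so Except can
// compare them as plain strings.
var kept = Directory.EnumerateFiles(dir)
    .Except(Directory.EnumerateFiles(dir, "*.tmp"))
    .Select(p => Path.GetFileName(p))
    .OrderBy(n => n, StringComparer.Ordinal)
    .ToList();

Console.WriteLine(string.Join(", ", kept));

Directory.Delete(dir, true); // clean up the demo directory
```

This walks the directory twice, which is fine for small folders; for large ones a single pass with a Where filter on the file name is likely cheaper.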
