Find multiple files in the same directory - c#

Given a path, I'm trying to find the files that have the same filename but different extensions (.bak and .dwg) in the same directory.
I have this code:
String[] FileNames = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories).Where(s => s.EndsWith(".bak") || s.EndsWith(".dwg")).ToArray();
var queryDupNames = from f in FileNames
group f by Path.GetFileNameWithoutExtension(f) into g
where g.Count() > 1
select new { Name = g.Key, FileNames = g };
This works great for locating files with the same filename, but it matches them across the whole directory tree under the path. I only need those that are in the same directory.
For example:
- Dir1\filename1.bak
- Dir1\filename1.dwg
- Dir1\filename2.bak
- Dir1\filename2.dwg
- Dir1\filename3.dwg
- DiferentDir\filename1.bak
- DiferentDir\filename1.dwg
- DiferentDir\filename3.dwg
The result should be:
- Dir1\filename1.bak
- Dir1\filename1.dwg
- Dir1\filename2.bak
- Dir1\filename2.dwg
- DiferentDir\filename1.bak
- DiferentDir\filename1.dwg
But with my code, filename3 is also included because
g.Count() > 1
is true for it: the grouping is by filename only. I tried to fix it with this code, but I got 0 results:
String[] FileNames = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories).Where(s => s.EndsWith(".bak") || s.EndsWith(".dwg")).ToArray();
var queryDupNames = from f in FileNames
group f by new { path = Path.GetLongPath(f), filen = Path.GetFileNameWithoutExtension(f) } into g
where g.Count() > 1
select new { Name = g.Key, FileNames = g };
Any help or clue?
Thanks

System.IO.Path doesn't have a GetLongPath method. I suspect you are using an external library like AlphaFS. In any case, GetLongPath returns the full file path, not the path of the file's folder.
The file's folder path is returned by GetDirectoryName both in System.IO and other libraries like AlphaFS. The following snippet will return only Dir1\filename1, Dir1\filename2 and DiferentDir\filename1:
var files = new[]
{
@"c:\Dir1\filename1.bak",
@"c:\Dir1\filename1.dwg",
@"c:\Dir1\filename2.bak",
@"c:\Dir1\filename2.dwg",
@"c:\Dir1\filename3.dwg",
@"c:\DiferentDir\filename1.bak",
@"c:\DiferentDir\filename1.dwg",
@"c:\DiferentDir\filename3.dwg",
};
var duplicates = from file in files
group file by new
{
Folder = Path.GetDirectoryName(file),
Name = Path.GetFileNameWithoutExtension(file)
} into g
where g.Count()>1
select new
{
Name = g.Key,
Files = g.ToArray()
};
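As a rough, self-contained variant of the same idea, the sketch below groups by the full path with its extension stripped, which amounts to grouping by folder plus base name. The class and method names (SameNameFinder, FindPairs) are made up for illustration, and the case-insensitive comparison is an assumption that suits typical Windows file systems.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class SameNameFinder
{
    // Returns one array per (folder, base name) pair that occurs
    // with more than one of the wanted extensions.
    public static List<string[]> FindPairs(IEnumerable<string> paths, params string[] extensions)
    {
        // Assumption: extensions are compared case-insensitively.
        var wanted = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);

        return paths
            .Where(p => wanted.Contains(Path.GetExtension(p)))
            // ChangeExtension(p, null) removes the extension but keeps the folder,
            // so the key is "folder + file name without extension".
            .GroupBy(p => Path.ChangeExtension(p, null), StringComparer.OrdinalIgnoreCase)
            .Where(g => g.Count() > 1)
            .Select(g => g.ToArray())
            .ToList();
    }
}
```

It could be fed with Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories) and called as SameNameFinder.FindPairs(files, ".bak", ".dwg").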

First, find all folders.
Then, for each folder, find all the files with the same name but different extensions.
Something like this:
var list = new List<string>();
foreach (var subDirectory in Directory.EnumerateDirectories(@"C:\Temp"))
{
var files = Directory.EnumerateFiles(subDirectory);
var repeated = files.Select(Path.GetFileNameWithoutExtension)
.GroupBy(x => x)
.Where(g => g.Count() > 1)
.Select(y => y.Key);
list.AddRange(repeated);
}
Tested on .NET 4.6.

Related

Looking for folders in multiple extension and multiple string format

Hi, I am trying to get all the files that match a set of extensions and a set of name formats:
string extensions=".exe,.txt,.xls";
string fileFormat = "fileA, fileB, fileC";
Let's say I have the following files in the folder:
fileA20200805.txt
fileBxxxx.exe
FileCCCCCCC.txt
FileD123.xls
The result should return only the first 3 files, which are:
fileA20200805.txt
fileBxxxx.exe
FileCCCCCCC.txt
because FileD123.xls is not in the fileFormat.
I have tried the following code:
DirectoryInfo dInfo = new DirectoryInfo(path);
FileInfo[] files = dInfo.GetFiles()
.Where(f => extensions.Contains(f.Extension.ToLower()) && fileFormat.Any(f.Name.Contains))
.ToArray();
However, I am still getting all 4 files; FileD123.xls is still returned.
Maybe:
var extensions = new [] {".exe",".txt",".xls"};
var fileFormat = new [] {"fileA", "fileB", "fileC"};
...
.Where(f =>
extensions.Contains(f.Extension.ToLower()) &&
fileFormat.Any(x => f.Name.StartsWith(x, StringComparison.OrdinalIgnoreCase)))
You could also use a regex, I guess:
var regex = new Regex($@"^({string.Join("|", fileFormat)})[^.]*({string.Join("|", extensions.Select(Regex.Escape))})$", RegexOptions.Compiled | RegexOptions.IgnoreCase);
...
.Where(f => regex.IsMatch(f.Name))
I think this should work.
string[] extensions = new string[] { ".exe",".txt",".xls" };
string[] fileFormat = new string[] { "fileA", "fileB", "fileC" };
DirectoryInfo dInfo = new DirectoryInfo(path);
FileInfo[] files = dInfo.GetFiles();
var output = files.Where(f => extensions.Contains(f.Extension.ToLower()) &&
fileFormat.Any(f.Name.Contains)).ToArray();
It returns 2 files, because Contains is case-sensitive and FileCCCCCCC does not contain fileC.
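One likely reason the original attempt returned all four files is that fileFormat was a single string, so Any enumerated its characters rather than the three prefixes. The sketch below keeps both sets as collections and matches the prefix case-insensitively. The names FileNameFilter and Filter are hypothetical, and it assumes "format" means "the file name starts with one of the prefixes".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class FileNameFilter
{
    // Keeps the names whose extension is allowed and whose file name
    // starts with one of the given prefixes (both checks ignore case).
    public static List<string> Filter(
        IEnumerable<string> fileNames,
        IEnumerable<string> prefixes,
        IEnumerable<string> extensions)
    {
        var allowedExtensions = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);
        var prefixList = prefixes.ToList();

        return fileNames
            .Where(name => allowedExtensions.Contains(Path.GetExtension(name)))
            .Where(name => prefixList.Any(prefix =>
                Path.GetFileName(name).StartsWith(prefix, StringComparison.OrdinalIgnoreCase)))
            .ToList();
    }
}
```

With the four example files, the prefixes fileA, fileB, fileC and the extensions .exe, .txt, .xls, this should keep the first three names and drop FileD123.xls.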

get files in order of directory path

I am trying to pick files from a directory. Below is the format of the file names:
14094901-1_SCAN_f568aecd-5f5a-424d-bb54-b2a7ee60ca9e
14094901-2_SCAN_90b3ddf3-17f9-417d-b64d-61a175a779a3
But once the file count reaches 10, it picks file 1 first and then jumps to file 10. I am using the code below and don't know why it does that:
string path1 = @"C:\Users\test\AppData\Local\Temp\XXXXX";
var paths = Directory.GetFiles(path1)
.OrderBy(path =>
Convert.ToInt32(
String.Concat(
path.Split('-', '.')
.Skip(3)
.Take(1)
//.Select(num => num.PadLeft(2, '0'))
.ToArray())
)
);
Please let me know how I can get the files in the proper order 1,2,3,4,5,6,7,8,9,10.
I am getting 1,10,2,3,4,5,6,7,8,9 instead.
This might help
string path1 = @"C:\Users\test\AppData\Local\Temp\XXXXX";
var files = Directory.GetFiles(path1);
var fileIndex = files.Select(a => new {Name = a, Index = Convert.ToInt32(a.Split(new[] {'-', '_'})[1])});
var orderdFileNames = fileIndex.OrderBy(a => a.Index).Select(a => a.Name);
Convert the second split value to int before calling .ToArray().
Please try this
string path1 = @"C:\Users\test\AppData\Local\Temp\XXXXX";
var files = Directory.GetFiles(path1);
var orderedFiles = files.OrderBy(file => Convert.ToInt32(file.Split(new []{'-', '_'})[1]));
Please try this
var orderedFiles = Directory.GetFiles(path1).OrderBy(path =>
Convert.ToInt32(
String.Concat(
path.Split('_','-')
.Skip(1).Take(1)
.ToArray())
)
);
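The answers above split the full path, so a '-' or '_' anywhere in a folder name would shift the indexes. A hedged sketch that parses only the file name is shown below. ScanFileOrder, SequenceNumber and Sort are made-up names, and the assumed name format is "prefix-number_rest", as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class ScanFileOrder
{
    // Extracts the number between the first '-' and the next '_' of the file name.
    // Names that do not follow the assumed format get int.MaxValue and sort last.
    public static int SequenceNumber(string path)
    {
        string name = Path.GetFileName(path);
        int dash = name.IndexOf('-');
        if (dash < 0) return int.MaxValue;

        int underscore = name.IndexOf('_', dash + 1);
        if (underscore < 0) return int.MaxValue;

        string digits = name.Substring(dash + 1, underscore - dash - 1);
        return int.TryParse(digits, out int number) ? number : int.MaxValue;
    }

    // Orders paths numerically (1, 2, ..., 10) instead of alphabetically (1, 10, 2, ...).
    public static List<string> Sort(IEnumerable<string> paths)
    {
        return paths.OrderBy(p => SequenceNumber(p)).ToList();
    }
}
```

Usage would be something like ScanFileOrder.Sort(Directory.GetFiles(path1)).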

Find First File in a Branching Directory

I'm trying to find the first .dcm file in each directory of a directory tree and get its full path (e.g. a/a/a/123.dcm), ignoring directories where no .dcm file is found.
example:
a/a/a/123.dcm
a/a/a/1234.dcm
a/a/a/12345.dcm
a/a/b/23.dcm
a/a/b/234.dcm
a/a/b/2345.dcm
a/a/c/23.dcm
a/a/c/234.dcm
a/a/c/2345.dcm
Answer should be: a/a/a/123.dcm, a/a/b/23.dcm and a/a/c/23.dcm
I tried:
var files = Directory.GetFiles(inputDir, "*.*", SearchOption.AllDirectories)
.Where(s => s.EndsWith(".dcm")).ToArray();
var dir = Directory.GetDirectories(inputDir, "*.*", SearchOption.AllDirectories).ToArray();
var biggest = files.First();
foreach (var item in dir)
{
DirectoryInfo di = new DirectoryInfo(item);
var q = from i in di.GetFiles("*.dcm", SearchOption.AllDirectories)
select i.Name;
var qq = q.First();
foreach (var items in qq)
{
Console.WriteLine(items);
}
}
However, what I get is an answer for five directories:
a/a/a/123.dcm
a/a/a/123.dcm
a/a/a/123.dcm
a/a/b/23.dcm
a/a/c/23.dcm
I’m just wondering if there’s a simpler way to do this using LINQ or something else? Thank you so much for your help. Cheers.
Here's a LINQ version:
var inputDir = @"c:\temp";
var files = Directory
.EnumerateFiles(inputDir, "*.dcm", SearchOption.AllDirectories)
.Select(f => new FileInfo(f))
.GroupBy(f => f.Directory.FullName, d => d, (d, f) => new { Directory = d, FirstFile = f.ToList().First() })
.ToList();
files.ForEach(f => Console.WriteLine("{0} {1}", f.Directory, f.FirstFile));
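The file system does not guarantee any particular enumeration order, so "first" in the snippet above depends on what EnumerateFiles happens to return. The sketch below makes that choice explicit by taking the alphabetically first file name in each directory. That ordering rule is an assumption, and FirstPerDirectory and Pick are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class FirstPerDirectory
{
    // Returns one path per directory: the one whose file name sorts first.
    // Directories without any input file simply never show up as a group.
    public static List<string> Pick(IEnumerable<string> paths)
    {
        return paths
            .GroupBy(p => Path.GetDirectoryName(p))
            .Select(g => g
                .OrderBy(p => Path.GetFileName(p), StringComparer.OrdinalIgnoreCase)
                .First())
            .ToList();
    }
}
```

It could be called as FirstPerDirectory.Pick(Directory.EnumerateFiles(inputDir, "*.dcm", SearchOption.AllDirectories)).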

Find new file in two folders with a cross check

I am trying to sort two folders into a patched folder: find which files in the new folder are new and mark them as new, so I can transfer only those files. I don't care about dates or hash changes, just which files are in the new folder but not in the old folder.
Somehow the line
pf.NFile = !( oldPatch.FindAll(s => s.Equals(f)).Count() == 0);
always returns false. Is there something wrong with my cross-checking logic?
List<string> newPatch = DirectorySearch(_newFolder);
List<string> oldPatch = DirectorySearch(_oldFolder);
foreach (string f in newPatch)
{
string filename = Path.GetFileName(f);
string Dir = (Path.GetDirectoryName(f).Replace(_newFolder, "") + @"\");
PatchFile pf = new PatchFile();
pf.Dir = Dir;
pf.FName = filename;
pf.NFile = !( oldPatch.FindAll(s => s.Equals(f)).Count() == 0);
nPatch.Files.Add(pf);
}
foreach (string f in oldPatch)
{
string filename = Path.GetFileName(f);
string Dir = (Path.GetDirectoryName(f).Replace(_oldFolder, "") + @"\");
PatchFile pf = new PatchFile();
pf.Dir = Dir;
pf.FName = filename;
if (!nPatch.Files.Exists(item => item.Dir == pf.Dir &&
item.FName == pf.FName))
{
nPatch.removeFiles.Add(pf);
}
}
I don't have the classes you are using (like DirectorySearch and PatchFile), so I can't compile your code, but IMO the line oldPatch.FindAll(... doesn't find anything because you are comparing the full paths (c:\oldpatch\filea.txt is not c:\newpatch\filea.txt) and not only the file names. IMO your algorithm could be simplified, something like this pseudocode (using Contains instead of List.FindAll):
var _newFolder = "d:\\temp\\xml\\b";
var _oldFolder = "d:\\temp\\xml\\a";
List<FileInfo> missing = new List<FileInfo>();
List<FileInfo> nPatch = new List<FileInfo>();
List<FileInfo> newPatch = new DirectoryInfo(_newFolder).GetFiles().ToList();
List<FileInfo> oldPatch = new DirectoryInfo(_oldFolder).GetFiles().ToList();
// take all files in new patch
foreach (var f in newPatch)
{
nPatch.Add(f);
}
// search for hits in old patch
foreach (var f in oldPatch)
{
if (!nPatch.Select (p => p.Name.ToLower()).Contains(f.Name.ToLower()))
{
missing.Add(f);
}
}
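// missing now holds the files that exist in the old folder but not in the new one;
// swap the two lists to get the files that are new instead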
One possible solution with less code would be to select the file names, put them into lists and use LINQ's predefined Except or, if needed, Intersect methods. This way, the question of which files are in A but not in B can be answered quickly, like this:
var locationA = "d:\\temp\\xml\\a";
var locationB = "d:\\temp\\xml\\b";
// takes file names from A and B and put them into lists
var filesInA = new DirectoryInfo(locationA).GetFiles().Select (n => n.Name).ToList();
var filesInB = new DirectoryInfo(locationB).GetFiles().Select (n => n.Name).ToList();
// Except retrieves all files that are in A but not in B
foreach (var file in filesInA.Except(filesInB).ToList())
{
Console.WriteLine(file);
}
I have 1.xml, 2.xml, 3.xml in A and 1.xml, 3.xml in B. The output is 2.xml - missing in B.
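Comparing bare file names loses track of subfolders, which the original question appears to care about (it stores a Dir per file). The sketch below compares paths relative to each root instead. FolderDiff and NewFiles are made-up names, the case-insensitive comparison is an assumption, and Path.GetRelativePath is only available on .NET Core 2.0 and later, not on .NET Framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class FolderDiff
{
    // Returns the relative paths that exist under newRoot but not under oldRoot.
    public static List<string> NewFiles(string oldRoot, string newRoot)
    {
        // Relative paths of everything in the old folder, for fast lookups.
        var oldRelative = new HashSet<string>(
            Directory.EnumerateFiles(oldRoot, "*", SearchOption.AllDirectories)
                     .Select(f => Path.GetRelativePath(oldRoot, f)),
            StringComparer.OrdinalIgnoreCase);

        return Directory.EnumerateFiles(newRoot, "*", SearchOption.AllDirectories)
                        .Select(f => Path.GetRelativePath(newRoot, f))
                        .Where(relative => !oldRelative.Contains(relative))
                        .ToList();
    }
}
```

Swapping the two arguments would give the files that were removed in the new folder.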

Directory.GetFiles get today's files only

There is a nice function in .NET, Directory.GetFiles; it's simple to use when I need to get all files from a directory.
Directory.GetFiles("c:\\Files")
But how (with what pattern) can I get only the files that were created today, if there are a lot of files with different creation times?
Thanks!
For performance, especially if the directory search is likely to be large, the use of Directory.EnumerateFiles(), which lazily enumerates over the search path, is preferable to Directory.GetFiles(), which eagerly enumerates over the search path, collecting all matches before filtering any:
DateTime today = DateTime.Now.Date ;
FileInfo[] todaysFiles = new DirectoryInfo(@"c:\foo\bar")
.EnumerateFiles()
.Select( x => {
x.Refresh();
return x;
})
.Where( x => x.CreationTime.Date == today || x.LastWriteTime.Date == today )
.ToArray()
;
Note that the properties of FileSystemInfo and its subtypes can be (and are) cached, so they do not necessarily reflect current reality on the ground. Hence the call to Refresh() to ensure the data is correct.
Try this:
var todayFiles = Directory.GetFiles("path_to_directory")
.Where(x => new FileInfo(x).CreationTime.Date == DateTime.Today.Date);
You need to get the DirectoryInfo for the file:
public List<String> getTodaysFiles(String folderPath)
{
List<String> todaysFiles = new List<String>();
foreach (String file in Directory.GetFiles(folderPath))
{
DirectoryInfo di = new DirectoryInfo(file);
if (di.CreationTime.ToShortDateString().Equals(DateTime.Now.ToShortDateString()))
todaysFiles.Add(file);
}
return todaysFiles;
}
You could use this code:
var directory = new DirectoryInfo("C:\\MyDirectory");
var myFile = (from f in directory.GetFiles()
orderby f.LastWriteTime descending
select f).First();
// or...
var myFile = directory.GetFiles()
.OrderByDescending(f => f.LastWriteTime)
.First();
see here: How to find the most recent file in a directory using .NET, and without looping?
using System.Linq;
DirectoryInfo info = new DirectoryInfo("");
FileInfo[] files = info.GetFiles().OrderBy(p => p.CreationTime).ToArray();
foreach (FileInfo file in files)
{
// DO Something...
}
If you want to narrow it down to a specific date, you could try a filter like this:
var files = from c in directoryInfo.GetFiles()
where c.CreationTime >dateFilter
select c;
You should be able to get through this:
var loc = new DirectoryInfo("C:\\");
var fileList = loc.GetFiles().Where(x => x.CreationTime.ToString("dd/MM/yyyy") == currentDate);
foreach (FileInfo fileItem in fileList)
{
//Process the file
}
var directory = new DirectoryInfo(Path.GetDirectoryName(@"--DIR Path--"));
DateTime from_date = DateTime.Now.AddDays(-5);
DateTime to_date = DateTime.Now.AddDays(5);
//For Today
var filesLst = directory.GetFiles().AsEnumerable()
.Where(file => file.CreationTime.Date == DateTime.Now.Date).ToArray();
//For date range + specific file extension
var filesLst = directory.GetFiles().AsEnumerable()
.Where(file => file.CreationTime.Date >= from_date.Date && file.CreationTime.Date <= to_date.Date && file.Extension == ".txt").ToArray();
//To get ReadOnly files from directory
var filesLst = directory.GetFiles().AsEnumerable()
.Where(file => file.IsReadOnly == true).ToArray();
//To get files based on their size
int fileSizeInKB = 100;
var filesLst = directory.GetFiles().AsEnumerable()
.Where(file => (file.Length)/1024 > fileSizeInKB).ToArray();
