I am using Directory.GetDirectories() with a LINQ statement to loop through all directories in a folder that aren't system folders. However, I am discovering a bunch of bad reparse points in the folder, which causes the method to take a long time because it times out on each bad reparse point.
The code I am currently using looks like this:
subdirectories = directory.GetDirectories("*", SearchOption.TopDirectoryOnly)
    .Where(d => ((d.Attributes & FileAttributes.Hidden) != FileAttributes.Hidden)
             && ((d.Attributes & FileAttributes.System) != FileAttributes.System));
I have also tried using code like this for testing, but it also hangs for a full minute or so on the bad folders:
foreach (var item in dir.GetDirectories("*", SearchOption.TopDirectoryOnly))
{
    Console.WriteLine(item.Name);
    Console.WriteLine(item.Attributes);
}
It should be noted that the above bit of code works fine in .NET 4.0, but in .NET 3.5 it will hang for a minute on each bad reparse point.
Trying to open these folders manually in Windows Explorer results in a "Network Path Not Found" error.
Is there another way to loop through good subfolders inside a folder that doesn't use the Attributes property, or that bypasses the bad reparse points?
I have already tried using Directory.Exists(), and that is equally slow.
According to this answer: *FASTEST* directory listing
For the best performance, it is possible to P/Invoke NtQueryDirectoryFile, documented as ZwQueryDirectoryFile
From MSDN: FILE_REPARSE_POINT_INFORMATION structure
This information can be queried in either of the following ways:
Call ZwQueryDirectoryFile, passing FileReparsePointInformation as the value of FileInformationClass and passing a caller-allocated, FILE_REPARSE_POINT_INFORMATION-structured buffer as the value of FileInformation.
Create an IRP with major function code IRP_MJ_DIRECTORY_CONTROL and minor function code IRP_MN_QUERY_DIRECTORY.
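A minimal P/Invoke sketch of that declaration is below. The managed signature mirrors the documented native one, the IO_STATUS_BLOCK layout is approximated with pointer-sized fields, and the FileReparsePointInformation value (33) comes from the DDK's FILE_INFORMATION_CLASS enumeration, so treat this as a starting point rather than production code:
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct IO_STATUS_BLOCK
{
    public IntPtr Status;       // approximation of the native union { NTSTATUS; PVOID; }
    public IntPtr Information;
}

static class NativeMethods
{
    // FILE_INFORMATION_CLASS value for FileReparsePointInformation (from the DDK headers).
    public const int FileReparsePointInformation = 33;

    // NtQueryDirectoryFile lives in ntdll.dll; optional pointers are declared as IntPtr.
    [DllImport("ntdll.dll")]
    public static extern int NtQueryDirectoryFile(
        IntPtr FileHandle,
        IntPtr Event,
        IntPtr ApcRoutine,
        IntPtr ApcContext,
        out IO_STATUS_BLOCK IoStatusBlock,
        IntPtr FileInformation,        // caller-allocated buffer
        uint Length,
        int FileInformationClass,      // pass FileReparsePointInformation here
        [MarshalAs(UnmanagedType.U1)] bool ReturnSingleEntry,
        IntPtr FileName,               // optional name filter; IntPtr.Zero for "all entries"
        [MarshalAs(UnmanagedType.U1)] bool RestartScan);
}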
I am using WPF TextBoxes inside my WinForms application for spell checking. Each time I create one, I load the same file in as a CustomDictionary. All had been fine until recently. Now they take a long time to load, up to a second each. Some forms have 30 or more, meaning delays of nearly half a minute. This seems to be the case on Windows 10 (not Windows 8 as I originally posted). The application runs under .NET 4.0; I have tried 4.5 and 4.6 (not 4.6.1), and all versions are slow.
I have seen sfaust's question Spell check textbox in Win10 - Slow and am7zd's answer. Thanks to these, I looked at the _GLOBAL_ value in HKEY_CURRENT_USER\Software\Microsoft\Spelling\Dictionaries. I have 580 entries (after pruning out entries without matching files) and things are still slow.
At present, every time I create a TextBox and add a custom dictionary to it, a new entry seems to be generated in _GLOBAL_
Is there a better way of doing things than loading the custom dictionary in from file every time?
Is there a way of re-using the same entry in _GLOBAL_ every time instead of creating a new one?
Is there a clean way of clearing previous entries in _GLOBAL_ created by my application, and their matching .dic files, when closing the application (or on restarting it)?
I could clear _GLOBAL_ completely each time I start my application. This brings back the speed I want, but what is the downside?
Any advice gratefully received.
No answers from anyone else, so this is what I have done:
I made sure I use CustomDictionaries.Remove on all textboxes with custom dictionaries before closing the form they are on. This gets rid of new entries in _GLOBAL_ and the related files in AppData\Local\Temp.
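As a rough sketch (the field names here are placeholders for whatever your form actually holds: a collection of the WPF TextBoxes and the Uri that was passed to CustomDictionaries.Add), the cleanup looks something like this:
private void MyForm_FormClosing(object sender, FormClosingEventArgs e)
{
    // _spellCheckedBoxes and _customDictionaryUri are hypothetical fields on this form.
    foreach (System.Windows.Controls.TextBox box in _spellCheckedBoxes)
    {
        box.SpellCheck.CustomDictionaries.Remove(_customDictionaryUri);
    }
}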
But there will be times when things go wrong or the user just ends the task, leaving _GLOBAL_ entries and .dic files in place, so:
I decided to take things a stage further. When I start my application, I not only clean entries in _GLOBAL_ that don't have matching files (as suggested in the answer referenced above), but also remove all entries referring to .dic files in AppData\Local\Temp. My theory is that anyone who has left entries there didn't mean to; otherwise they would probably have saved the .dic file in a different folder (as Microsoft Office does).
try
{
    string[] allDictionaries = (string[])Registry.GetValue(@"HKEY_CURRENT_USER\Software\Microsoft\Spelling\Dictionaries", "_Global_", new string[0]);
    // GetValue returns null if the key itself is missing, so guard against that too.
    if (allDictionaries != null && allDictionaries.Length > 0)
    {
        List<string> realDictionaries = new List<string>();
        bool changedSomething = false;
        foreach (string thisD in allDictionaries)
        {
            if (File.Exists(thisD))
            {
                if (thisD.Contains(@"\AppData\Local\Temp\"))
                {
                    // Assuming that anyone who wants to keep a permanent .dic file will not store it in \AppData\Local\Temp.
                    // So delete the file and don't copy the name of the dictionary into the list of good dictionaries.
                    File.Delete(thisD);
                    changedSomething = true;
                }
                else
                {
                    realDictionaries.Add(thisD);
                }
            }
            else
            {
                // File does not exist, so don't copy the name of the dictionary into the list of good dictionaries.
                changedSomething = true;
            }
        }
        if (changedSomething)
        {
            Registry.SetValue(@"HKEY_CURRENT_USER\Software\Microsoft\Spelling\Dictionaries", "_Global_", realDictionaries.ToArray());
        }
    }
}
catch (Exception ex)
{
    MessageBox.Show(this, "Error clearing up old dictionary files.\n\nFull message:\n\n" + ex.Message, "Unable to delete file", MessageBoxButtons.OK, MessageBoxIcon.Warning);
}
I am still wondering if it is totally safe to clear entries in _GLOBAL_ that refer to files in AppData\Local\Temp. Surely people shouldn't be leaving important stuff in a temp folder... should they?
What would be really nice would be an overload of CustomDictionaries.Add that lets us set the name and folder of the .dic file, allowing all the textboxes in the same application to share the same .dic file and making sure we don't leave a load of redundant entries and files with seemingly random names hanging around in the first place... please, Microsoft.
Question Background:
I have a WebApi controller whose logic relies on reading data contained in a number of XML files. These XML files are included in the App_Data folder of the WebApi project.
The Issue:
I'm trying to use the relative path of the XML files in the following way:
[System.Web.Http.HttpGet]
public string CallerOne()
{
    string docOne = @"~\AppData\DocOne.xml";
    string poll = @"~\AppData\Poll.xml";
    var response = _Caller.CallService(docOne, poll);
    return ConvertXmlToJson(response);
}
When running the WebApi code and calling the Url to the CallerOne method I receive the following error:
An exception of type 'System.IO.DirectoryNotFoundException'
occurred in System.Xml.dll but was not handled in user code
Additional information: Could not find a part of the path
'C:\Program Files (x86)\IIS Express\~\AppData\FPS.xml'.
I also want to eventually publish this to Azure and include these files.
How can I use the relative path to read in the XML files in the App_Data folder?
Ended up finding the answer.
The following is needed to read the relative paths in a WebApi project:
var fullPath = System.Web.Hosting.HostingEnvironment.MapPath(@"~/App_Data/yourXmlFile.xml");
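Applied to the controller from the question, that would look roughly like this (assuming _Caller.CallService takes the two physical paths):
[System.Web.Http.HttpGet]
public string CallerOne()
{
    string docOne = System.Web.Hosting.HostingEnvironment.MapPath(@"~/App_Data/DocOne.xml");
    string poll = System.Web.Hosting.HostingEnvironment.MapPath(@"~/App_Data/Poll.xml");
    var response = _Caller.CallService(docOne, poll);
    return ConvertXmlToJson(response);
}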
As jdweng inferred several months back, Environment.GetEnvironmentVariable("AppData") would seem to be the preferred method. The OP's self-accepted answer and that approach give quite different results. For example, using both of those in my project, I get:
C:\\Projects\\PlatypusReports\\PlatypusReports\\App_Data\\yourXmlFile.xml
...for the OP's long-winded code, namely this:
var fullPath = System.Web.Hosting.HostingEnvironment.MapPath(@"~/App_Data/yourXmlFile.xml");
...and this:
C:\\Users\\cshannon\\AppData\\Roaming
...for jdweng's code, to wit:
string appData = Environment.GetEnvironmentVariable("AppData");
OTOH, this code:
string appDataFolder = HttpContext.Current.Server.MapPath("~/App_Data/");
returns:
C:\\Projects\\PlatypusReports\\PlatypusReports\\App_Data\\
So it's very similar in results (if not methodology) to the first example above. I actually got it from a question I asked almost two years ago, which I had forgotten about.
I'm not positive if jdweng's approach would work as expected once the app is deployed on a server, but I have much more confidence in it than the other approaches.
Can anyone verify?
UPDATE
The accepted answer here has 237 upvotes at time of typing, so seems pretty reliable, albeit 6 years old (42 in dog years, which may be a good sign).
Your approach is fine. You just had a typing error.
You wrote
string docOne = @"~\AppData\DocOne.xml";
But it should have been
string docOne = @"~\App_Data\DocOne.xml";
On a Windows 7 (or server) box, we have a folder on a UNC share (cross machine UNC, not localhost). We rename that folder, and then check for the existence of a file at the new folder location. Even though it exists, it takes almost 5 seconds for File.Exists to return true on it.
Full repro can be found on https://github.com/davidebbo/NpmFolderRenameIssue. Here is the core code:
// This file doesn't exist yet
// Note that the presence of this existence check is what triggers the bug below!!
Console.WriteLine("Exists (should be false): " + File.Exists("test/test2/myfile"));

// Create a directory, with a file in it
Directory.CreateDirectory("test/subdir/test");
File.WriteAllText("test/subdir/test/myfile", "Hello");

// Rename the directory
Directory.Move("test/subdir/test", "test/test2");

var start = DateTime.UtcNow;

// List the files at the new location. Here, our file shows up fine
foreach (var path in Directory.GetFiles("test/test2"))
{
    Console.WriteLine(path);
}

for (; ; )
{
    // Now do a simple existence test. It should also be true, but when
    // running on a (cross machine) UNC share, it takes almost 5 seconds to become true!
    if (File.Exists("test/test2/myfile")) break;

    Console.WriteLine("After {0} milliseconds, test/test2/myfile doesn't show as existing",
        (DateTime.UtcNow - start).TotalMilliseconds);
    Thread.Sleep(100);
}

Console.WriteLine("After {0} milliseconds, test/test2/myfile correctly shows as existing!",
    (DateTime.UtcNow - start).TotalMilliseconds);
So it seems like the initial existence check causes the existence value to be cached, causing this bogus behavior.
Questions: what is the explanation for this? What's the best way to avoid it?
NOTE: this issue initially arose when using npm (Node Package Manager) on Windows. The code I have here is a C# port of the repro. See https://github.com/isaacs/npm/issues/2230 for the original Node/npm issue. The goal is to find a way to address it.
David,
The redirector implements a negative "File Not Found" cache which prevents a client from flooding a server with file not found requests. The default cache time is 5 seconds but you can modify the FileNotFoundCacheLifetime registry value to control the cache or disable it by setting this value to 0.
Details: http://technet.microsoft.com/en-us/library/ff686200(v=WS.10).aspx
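If you do decide to change it, the value described in that article is a DWORD named FileNotFoundCacheLifetime under the LanmanWorkstation parameters key. A hedged sketch of setting it from C# (requires administrative rights; double-check the key path against the article, and the workstation service may need a restart):
using Microsoft.Win32;

// Setting the lifetime to 0 disables the negative "file not found" cache entirely.
Registry.SetValue(
    @"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters",
    "FileNotFoundCacheLifetime",
    0,
    RegistryValueKind.DWord);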
There are multiple levels of caching in networking code, which can delay when a file's existence finally shows up.
A solution would be not to use file shares, but to create a simple client/server architecture where the server reports the file's existence from its local file system. That should really speed up detection times.
My guess is that even while File.Exists says the file doesn't exist yet, trying to open it would succeed, so you could rely on the server-side existence information. If that doesn't work, you can simply add a download option to the client/server architecture.
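A rough sketch of that idea (the UNC path is hypothetical): skip File.Exists and just attempt the open, treating the exception as "not there yet":
try
{
    using (var stream = File.OpenRead(@"\\server\share\test\test2\myfile"))
    {
        // The file really is there; only the existence check is fooled by the cache.
    }
}
catch (FileNotFoundException)
{
    // Genuinely missing (or still hidden by the negative cache); retry or ask the server.
}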
Once I knew about the "File Not Found" cache, I was able to get around the problem by using a FileInfo object, which provides a Refresh() method. Your code could do this instead:
FileInfo testFile = new FileInfo("test/test2/myfile");
Console.WriteLine("Exists (should be false): " + testFile.Exists);
Directory.Move("test/subdir/test", "test/test2");
testFile.Refresh();
// The FileInfo object has now been refreshed, so a second call to Exists will return a valid value.
if (testFile.Exists)
{
    ...
}
When I recurse through some folders and files, I encounter this error:
The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.
Here's my function
private void ProcessDirectory(DirectoryInfo di)
{
    try
    {
        DirectoryInfo[] diArr = di.GetDirectories();
        foreach (DirectoryInfo directoryInfo in diArr)
        {
            if (StopCheck)
                return;
            ProcessDirectory(directoryInfo);
        }
        ProcessFile(di);
    }
    catch (Exception e)
    {
        listBoxError.Items.Add(e.Message);
    }
    TextBoxCurrentFolder.Text = di.ToString();
}
I cannot make the directory names shorter, because I'm not allowed to, so how can I solve this problem?
Added:
Here's the other function:
private void ProcessFile(DirectoryInfo di)
{
    try
    {
        FileInfo[] fileInfo = di.GetFiles();
        if (fileInfo.LongLength != 0)
        {
            foreach (FileInfo info in fileInfo)
            {
                Size += info.Length;
                CountFile++;
            }
        }
    }
    catch (Exception e)
    {
        listBoxError.Items.Add(e.Message);
    }
}
EDIT
Found this question, where Zeta Long Paths is used:
How can I use FileInfo class, avoiding PathTooLongException?
I have implemented it and am now going to let the program run overnight to see if it works.
EDIT
Used Zeta Long Paths yesterday and it worked great! It even went through folders that needed permission access.
EDIT
Instead of Zeta Long Paths, I've used Delimon.Win32.IO.dll, which I think is much better. It has the same interfaces as Win32.
Here's more info about the Delimon library referred to earlier. It's a .NET Framework 4-based library on Microsoft TechNet for overcoming the long filenames problem:
Delimon.Win32.IO Library (V4.0).
It has its own versions of key methods from System.IO. For example, you would replace:
System.IO.Directory.GetFiles
with
Delimon.Win32.IO.Directory.GetFiles
which will let you handle long files and folders.
From the website:
Delimon.Win32.IO replaces basic file functions of System.IO and supports file & folder names of up to 32,767 characters.
The library is written on .NET Framework 4.0 and can be used on both x86 & x64 systems. The standard System.IO namespace is limited to files with 260 characters in a filename and 240 characters in a folder name (MAX_PATH is usually configured as 260 characters), so you typically run into the System.IO.PathTooLongException error with the standard .NET library.
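Assuming the library really does mirror the System.IO signatures as described (I have not verified every overload), usage would look something like this, with a made-up deep path:
foreach (string file in Delimon.Win32.IO.Directory.GetFiles(@"D:\some\very\deeply\nested\folder"))
{
    Console.WriteLine(file);
}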
This is a known limitation in Windows: http://msdn.microsoft.com/en-us/library/aa365247.aspx
I don't believe you'll be able to get around it, so you now have a pretty solid argument for whoever is telling you that you aren't allowed to make the names shorter.
The only real alternative is to move the deep folder somewhere else, maybe right at the root of your drive.
EDIT: Actually there may be a workaround: http://www.codinghorror.com/blog/2006/11/filesystem-paths-how-long-is-too-long.html
You'll have to use P/Invoke and the Unicode version of the Win32 API functions. You'll need FindFirstFile, FindNextFile and FindClose functions.
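A minimal sketch of that approach, using the usual managed declarations for the Unicode find APIs and the \\?\ prefix that lifts the MAX_PATH check (illustrative, not hardened code):
using System;
using System.Runtime.InteropServices;

static class LongPathLister
{
    private static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool FindClose(IntPtr hFindFile);

    // Lists the names of all entries in 'directory'; the \\?\ prefix tells Win32 to skip the
    // MAX_PATH check and hand the (fully qualified) path to the file system as-is.
    public static void List(string directory)
    {
        WIN32_FIND_DATA data;
        IntPtr handle = FindFirstFile(@"\\?\" + directory + @"\*", out data);
        if (handle == INVALID_HANDLE_VALUE)
            return;
        try
        {
            do
            {
                Console.WriteLine(data.cFileName);
            } while (FindNextFile(handle, out data));
        }
        finally
        {
            FindClose(handle);
        }
    }
}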
Also see:
C# deleting a folder that has long paths
DirectoryInfo, FileInfo and very long path
You can use the subst command. It creates a virtual drive starting at whatever folder you pass as parameter.
For example, you can turn the path c:\aaaaaaaaaaaaaaaaaaaaaa\aaaaaaaaaaaaaaaaaaaa\aaaaaaaaaaaaaa into the drive R: and continue exploring its subfolders through R:\ instead.
Do you know what I mean?
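Concretely, assuming R: is a free drive letter:
subst R: "c:\aaaaaaaaaaaaaaaaaaaaaa\aaaaaaaaaaaaaaaaaaaa\aaaaaaaaaaaaaa"
(and subst R: /D removes the mapping again).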
I also recommend reading this three-part blog post from the BCL Team, published in 2007, which relates specifically to the limitations of DirectoryInfo when it comes to deeply nested folders. It covers the history of the MAX_PATH limitation, the newer \\?\ path format, and various .NET-based solutions and workarounds.
Comprehensive, though perhaps a bit dated.
I'm trying to create an application which scans a drive. The tricky part, though, is that my drive contains a set of folders that have folders within folders and contain documents. I'm trying to scan the drive, take a "snapshot" of all documents & folders, and dump it into a .txt file.
The first time I run this app, the output will be a text file with all the folders & files.
The second time I run the application, it will take the two text files (the one produced by the second run and the .txt file from the first run) and compare them, reporting what has been moved/overwritten/deleted.
Does anybody have any code for this? I'm a newbie at this C# stuff and any help would be greatly appreciated.
Thanks in advance.
One thing that we learned in the '80s is that it's really tempting to use recursion for file system walking, but the moment you do, someone will make a file system with nesting levels deep enough to overflow your stack. It's far better to use heap-based walking of the file system.
Here is a class I knocked together which does just that. It's not super pretty, but it does the job quite well:
using System;
using System.IO;
using System.Collections.Generic;

namespace DirectoryWalker
{
    public class DirectoryWalker : IEnumerable<string>
    {
        private string _seedPath;
        Func<string, bool> _directoryFilter, _fileFilter;

        public DirectoryWalker(string seedPath) : this(seedPath, null, null)
        {
        }

        public DirectoryWalker(string seedPath, Func<string, bool> directoryFilter, Func<string, bool> fileFilter)
        {
            if (seedPath == null)
                throw new ArgumentNullException("seedPath");
            _seedPath = seedPath;
            _directoryFilter = directoryFilter;
            _fileFilter = fileFilter;
        }

        public IEnumerator<string> GetEnumerator()
        {
            // Walk breadth-first using queues on the heap instead of the call stack.
            Queue<string> directories = new Queue<string>();
            directories.Enqueue(_seedPath);
            Queue<string> files = new Queue<string>();
            while (files.Count > 0 || directories.Count > 0)
            {
                if (files.Count > 0)
                {
                    yield return files.Dequeue();
                }
                if (directories.Count > 0)
                {
                    string dir = directories.Dequeue();
                    string[] newDirectories = Directory.GetDirectories(dir);
                    string[] newFiles = Directory.GetFiles(dir);
                    foreach (string path in newDirectories)
                    {
                        if (_directoryFilter == null || _directoryFilter(path))
                            directories.Enqueue(path);
                    }
                    foreach (string path in newFiles)
                    {
                        if (_fileFilter == null || _fileFilter(path))
                            files.Enqueue(path);
                    }
                }
            }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}
Typical usage is this:
DirectoryWalker walker = new DirectoryWalker(@"C:\pathToSource\src", null, (x => x.EndsWith(".cs")));
foreach (string s in walker)
{
    Console.WriteLine(s);
}
which recursively lists all files that end in ".cs".
A better approach than your text file comparisons would be to use the FileSystemWatcher Class.
Listens to the file system change notifications and raises events when a directory, or file in a directory, changes.
You could log the changes and then generate your reports as needed from that log.
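A rough sketch of that approach (the root folder and log file path are placeholders):
using System;
using System.IO;

var watcher = new FileSystemWatcher(@"C:\FolderToWatch");
watcher.IncludeSubdirectories = true;
watcher.NotifyFilter = NotifyFilters.FileName | NotifyFilters.DirectoryName | NotifyFilters.LastWrite;

// Append one line per change; the comparison report can be built from this log later.
watcher.Created += (s, e) => File.AppendAllText(@"C:\changes.log", "Created: " + e.FullPath + Environment.NewLine);
watcher.Deleted += (s, e) => File.AppendAllText(@"C:\changes.log", "Deleted: " + e.FullPath + Environment.NewLine);
watcher.Renamed += (s, e) => File.AppendAllText(@"C:\changes.log", "Renamed: " + e.OldFullPath + " -> " + e.FullPath + Environment.NewLine);
watcher.Changed += (s, e) => File.AppendAllText(@"C:\changes.log", "Changed: " + e.FullPath + Environment.NewLine);

watcher.EnableRaisingEvents = true;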
You can easily utilize the DirectoryInfo/FileInfo classes for this.
Basically, instantiate an instance of the DirectoryInfo class pointing at the C:\ folder, then use its objects to walk the folder structure, as sketched below.
http://msdn.microsoft.com/en-us/library/system.io.directoryinfo.aspx has code that could quite easily be translated.
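A small sketch of that idea, dumping every folder and file path into the snapshot file (the paths and names are placeholders; the queue-based answer above avoids the recursion if nesting depth is a worry):
using System.IO;

static void WriteSnapshot(DirectoryInfo dir, StreamWriter output)
{
    foreach (FileInfo file in dir.GetFiles())
        output.WriteLine(file.FullName);

    foreach (DirectoryInfo sub in dir.GetDirectories())
    {
        output.WriteLine(sub.FullName);
        WriteSnapshot(sub, output);
    }
}

// Usage:
// using (var writer = new StreamWriter(@"C:\snapshot.txt"))
//     WriteSnapshot(new DirectoryInfo(@"C:\"), writer);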
Now, the other part of your question is insanity. You can find the differences between the two files relatively easily, but translating that into what has been moved/deleted/etc. will take some fairly advanced logic. After all, if I have two files, both named myfile.dat, and one is found at c:\foo and the other at c:\notfoo, how would the one at c:\notfoo be reported if I deleted the one at c:\foo? Another example: if I have a file myfile2.dat and copy it from c:\bar to c:\notbar, is that considered a move? What happens if I copy it on Tuesday and then on Thursday delete c:\bar\myfile2.dat: is that a move or a delete? And would the answer change if I ran the program every Monday as opposed to daily?
There's a whole host of questions, and corresponding logic structures, which you'd need to think of and code for in order to build that functionality, and even then it would not be 100% correct, because it's not watching the file system as changes occur; there will always be the possibility of a scenario that did not get reported correctly due to timing, logic structure, processing time, when the app runs, or just the sheer perversity of computers.
Additionally, the processing time would grow exponentially with the size of your drive. After all, you'd need to check every file against every other file to determine its state as opposed to its previous state. I'd hate to have to run this against my 600+ GB drive at home, let alone the 40 TB drives I have on servers at work.