I'm working on a piece of code which creates some files and copies them to a SharePoint group (i.e., a synchronised folder, like OneDrive or Dropbox).
An important detail is that the created file has a very long file path (currently 373 characters).
Once the file has been copied to the SharePoint group - and I can see that it synchronises perfectly well (so the path isn't too long for SharePoint) - I then find...
File.Exists(myLongPath) = false
I've tried calling...
Directory.GetFiles(myLongDirectory, "*.*", SearchOption.AllDirectories)
... and the file with the long path is listed, but if I copy-paste that result back into...
File.Exists(copyPastedPath)
... again the result is false.
So Directory.GetFiles can see the file but File.Exists can't.
The documentation for FileInfo.Exists says (my emphasis)...
The Exists property returns false if any error occurs while trying to
determine if the specified file exists. This can occur in situations
that raise exceptions such as passing a file name with invalid
characters or too many characters, a failing or missing disk, or if
the caller does not have permission to read the file.
The first case - "invalid characters" - can't be the cause of my problem because I'm testing with a-z, A-Z and 0-9. So my suspicion falls on that second case - "too many characters".
Could it really be the case that the SharePoint group allows for longer path lengths than FileInfo.Exists?
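Because Exists swallows the underlying error, one way to test the "too many characters" suspicion is to attempt an operation that does throw and look at the exception type. The sketch below is only a diagnostic aid: LongPathProbe, Describe and ToExtendedLength are hypothetical names, and whether the \\?\ extended-length prefix actually helps depends on the runtime version and OS configuration, which the question does not state.

```csharp
using System;
using System.IO;

static class LongPathProbe
{
    // Returns "ok" if the file can be opened, otherwise the name of the
    // exception type that File.Exists would have hidden behind "false".
    public static string Describe(string path)
    {
        try
        {
            using (FileStream fs = File.OpenRead(path))
            {
                return "ok";
            }
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    // Adds the Windows extended-length prefix to a fully qualified
    // drive-letter path (assumption: not a UNC path, which needs \\?\UNC\).
    public static string ToExtendedLength(string fullPath)
    {
        const string prefix = @"\\?\";
        return fullPath.StartsWith(prefix, StringComparison.Ordinal)
            ? fullPath
            : prefix + fullPath;
    }
}
```

If Describe(myLongPath) reports PathTooLongException, the length limit is the likely culprit, and File.Exists(LongPathProbe.ToExtendedLength(myLongPath)) is worth trying next.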
So, I'm trying to create the following directory:
d:\temp\ak\ty\nul
Path is constructed in the loop, starting from: d:\temp and so on, creating non-existent directories along the way, so it first creates:
d:\temp\ak
then:
d:\temp\ak\ty
and... then, when it comes to the last bit, nul, it throws this exception:
So, what's going on? Where did it get \\.\nul from?
The code:
string z_base_path = @"d:\temp\ak\ty";
string z_extra_path = "nul";
string z_full_path = System.IO.Path.Combine(z_base_path, z_extra_path);
System.IO.Directory.CreateDirectory(z_full_path);
In Windows, nul is a reserved file name. No file or directory may be named that. Other reserved names include:
con
prn
aux
com{0-9}
lpt{0-9}
'nul' is a device file meaning that no file/folder can have that name.
instead of
string z_extra_path = "nul";
try
string z_extra_path = "null";
or
string z_extra_path = "";
other ones are
con
aux
com1-9
lpt1-9
prn
Never knew this one until I came up against it - it's worth noting the Windows reserved names for files and directories, and all the rest of the naming rules.
Taken from the article below:
https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
Folder Naming Conventions
The following fundamental rules enable applications to create and process valid names for files and directories, regardless of the file system:
Use a period to separate the base file name from the extension in the name of a directory or file.
Use a backslash (\) to separate the components of a path. The backslash divides the file name from the path to it, and one directory name from another directory name in a path. You cannot use a backslash in the name for the actual file or directory because it is a reserved character that separates the names into components.
Use a backslash as required as part of volume names, for example, the "C:\" in "C:\path\file" or the "\\server\share" in "\\server\share\path\file" for Universal Naming Convention (UNC) names. For more information about UNC names, see the Maximum Path Length Limitation section.
Do not assume case sensitivity. For example, consider the names OSCAR, Oscar, and oscar to be the same, even though some file systems (such as a POSIX-compliant file system) may consider them as different. Note that NTFS supports POSIX semantics for case sensitivity but this is not the default behavior. For more information, see CreateFile.
Volume designators (drive letters) are similarly case-insensitive. For example, "D:" and "d:" refer to the same volume.
Use any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:
The following reserved characters:
< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
Integer value zero, sometimes referred to as the ASCII NUL character.
Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams.
Any other character that the target file system does not allow.
Use a period as a directory component in a path to represent the current directory, for example ".\temp.txt". For more information, see Paths.
Use two consecutive periods (..) as a directory component in a path to represent the parent of the current directory, for example "..\temp.txt". For more information, see Paths.
Do not use the following reserved names for the name of a file:
CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9. Also avoid these names followed immediately by an extension; for example, NUL.txt is not recommended. For more information, see Namespaces.
Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface does not. However, it is acceptable to specify a period as the first character of a name. For example, ".temp".
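The rules quoted above can be folded into a small check for a single path component. This is a minimal sketch, not an official validator: WindowsNames and IsSafeComponent are hypothetical names, the reserved-name list is the one quoted above, and the "any other character that the target file system does not allow" rule is not covered.

```csharp
using System;
using System.Linq;

static class WindowsNames
{
    static readonly char[] ReservedChars =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    static readonly string[] ReservedNames =
    {
        "CON", "PRN", "AUX", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };

    // Checks one file or directory name (not a whole path) against the rules.
    public static bool IsSafeComponent(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;

        // Control characters 0-31 and the reserved punctuation characters.
        if (name.Any(c => c < 32) || name.IndexOfAny(ReservedChars) >= 0) return false;

        // Names must not end with a space or a period (this also rejects "." and "..").
        char last = name[name.Length - 1];
        if (last == ' ' || last == '.') return false;

        // Compare the part before the first period so "NUL.txt" is rejected too.
        string stem = name.Split('.')[0];
        return !ReservedNames.Contains(stem, StringComparer.OrdinalIgnoreCase);
    }
}
```

With this check, "nul" and "NUL.txt" are rejected while "null" and ".temp" pass, so the directory loop from the question could skip or rename offending components before calling CreateDirectory.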
Why does the following code
string filePath = @"C:\test\upload.pdf";
FileStream fs = File.OpenRead(filePath);
raise the following exception?
The filename, directory name, or volume label syntax is incorrect :
'C:\ConsoleApp\bin\Debug\netcoreapp2.1\C:\test\upload.pdf'
Where does the C:\ConsoleApp\bin\Debug\netcoreapp2.1\ directory come from?
Update:
In my case, the File.OpenRead() call lives inside a DLL, and the filePath (C:\test\upload.pdf) is passed in by the application that uses the DLL.
The string starts with an invisible character, so it's not a valid path. This
(int)@"C:\test\upload.pdf"[0]
Returns
8234
Or hex 202A. That's the LEFT-TO-RIGHT EMBEDDING punctuation character
UPDATE
Raymond Chen posted an article Why is there an invisible U+202A at the start of my file name?.
We saw some time ago that you can, as a last resort, insert the character U+202B (RIGHT-TO-LEFT EMBEDDING) to force text to be interpreted as right-to-left. The converse character is U+202A (LEFT-TO-RIGHT EMBEDDING), which forces text to be interpreted as left-to-right.
The Security dialog box inserts that control character in the file name field in order to ensure that the path components are interpreted in the expected manner. Unfortunately, it also means that if you try to copy the text out of the dialog box, the Unicode formatting control character comes along for a ride. Since the character is normally invisible, it can create all sorts of silent confusion.
(We’re lucky that the confusion was quickly detected by Notepad and the command prompt. But imagine if you had pasted the path into the source code to a C program!)
In the 4 years since that article Notepad got UTF8 support so the character isn't replaced by a question mark. Pasting into the current Windows Console with its incomplete UTF8 support still replaces the character.
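Since the string no longer begins with "C:", it is not treated as a rooted path and gets resolved against the current directory, which would explain the bin\Debug\netcoreapp2.1 prefix in the exception. A defensive option is to drop Unicode format characters (the category U+202A belongs to) before using a pasted path. The following is a sketch under that assumption; PathCleaner and StripFormatChars are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class PathCleaner
{
    // Removes characters in Unicode category Cf (Format), such as
    // U+202A LEFT-TO-RIGHT EMBEDDING, which are invisible when pasted.
    public static string StripFormatChars(string path)
    {
        return new string(path
            .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.Format)
            .ToArray());
    }
}
```

For example, PathCleaner.StripFormatChars("\u202A" + @"C:\test\upload.pdf") yields the plain C:\test\upload.pdf, which can then be passed to File.OpenRead. Stripping every format character is a blunt choice; a stricter variant could remove only a leading U+202A.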
"The File.OpenRead() in my case, exists within a dll."
Set CopyLocal = true in the properties of the DLL in which File.OpenRead exists.
I need to be able to extract the full file path out of this string (without whatever is after the file extension):
$/FilePath/FilePath/KeepsGoing/Folder/Script.sql (CS: 123456)
A simple solution such as the following would work for this case; however, it is limited to file extensions of exactly 3 characters:
(\$.*\..{3})
However, I find problems with this when the file contains multiple dots:
$/FilePath/FilePath/File.Setup.Task.exe.config (CS: 123456)
I need to be able to capture the full file path (from $ to the end of whatever the file extension is, which can be any number of things). I need to be able to get this no matter how many dots are in the name of the file. In some cases there are spaces in the name of the file too, so I need to be able to incorporate that.
Edit: The ending (CS....) in this case is not standard. All kinds of stuff can follow the path so I cannot predict what will come after the path, but the path will always be first. Sometimes spaces do exist in the file name.
Any suggestions?
Try this:
(\$.*\.[\w.-]+)
But! it will not properly match files with space or special chars in the file extension. If you need to match files that might have special chars in the file extension you'll need to elaborate on the input (is it quoted? is it escaped?).
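To see how the suggested pattern behaves on the two sample inputs, here is a small sketch using System.Text.RegularExpressions. PathExtractor and Extract are hypothetical names; the pattern is the one from the answer above, unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

static class PathExtractor
{
    // Greedy ".*" runs to the end of the input, then backtracks to the last
    // period that is followed by word characters, dots or hyphens.
    static readonly Regex PathPattern = new Regex(@"(\$.*\.[\w.-]+)");

    // Returns the captured path, or null when the input contains no match.
    public static string Extract(string input)
    {
        Match m = PathPattern.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Extract returns $/FilePath/FilePath/KeepsGoing/Folder/Script.sql for the first sample and $/FilePath/FilePath/File.Setup.Task.exe.config for the second, and spaces inside the file name are kept. One caveat follows from the greedy match: if the trailing text itself contains a period followed by a word character, for example "(v1.2)", the match extends to that last period, so the pattern is only reliable when whatever follows the path has no such dot.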
In my application I build a static string when a user uploads or downloads a file, and the filename in that string is passed in from the frontend. This means a user could send something like ..\..\another file.file to tamper with the path and get data from other users. Therefore I need to filter the filename I receive to prevent this. What are the characters that need to be filtered to prevent tampering? I currently filter the double dot and the backslash and forward slash. Is there anything else I should take into consideration? Is there maybe a standard way to do this in C#?
I would suggest using Path.GetInvalidFileNameChars:
public static bool IsValidFileName(string fileName)
{
return fileName.IndexOfAny(Path.GetInvalidFileNameChars()) == -1;
}
.. is typically only dangerous when preceded and/or succeeded by a \ or /, both of which are included in the array returned by GetInvalidFileNameChars. By itself, .. is harmless (unless you’re specifically resolving directory paths), and you shouldn’t forbid it since people might want to introduce ellipses in their filename (e.g. The A...Z of Programming.pdf).
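As a second line of defence, the character check can be combined with a test that the resolved path still lies inside the intended folder. This is a sketch rather than a complete sanitizer: UploadPaths, ResolveInside and baseDirectory are hypothetical names, and the case-insensitive comparison assumes a Windows-style file system. Note that Path.GetInvalidFileNameChars is platform-dependent; on Unix-like systems it returns far fewer characters (backslash is not among them).

```csharp
using System;
using System.IO;

static class UploadPaths
{
    // Returns the full path for fileName inside baseDirectory, or null if the
    // name contains invalid characters or would resolve outside that folder.
    public static string ResolveInside(string baseDirectory, string fileName)
    {
        if (string.IsNullOrWhiteSpace(fileName) ||
            fileName.IndexOfAny(Path.GetInvalidFileNameChars()) != -1)
        {
            return null;
        }

        // Normalise the base folder and make sure it ends with a separator,
        // so that "C:\data" does not accidentally match "C:\database".
        string root = Path.GetFullPath(baseDirectory);
        string separator = Path.DirectorySeparatorChar.ToString();
        if (!root.EndsWith(separator, StringComparison.Ordinal))
        {
            root += separator;
        }

        string candidate = Path.GetFullPath(Path.Combine(root, fileName));
        return candidate.StartsWith(root, StringComparison.OrdinalIgnoreCase)
            ? candidate
            : null;
    }
}
```

A bare ".." passes the character check but resolves to the parent folder, so the containment test rejects it, while a name such as The A...Z of Programming.pdf is accepted.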
What if different users save a file with the same name? Are you creating a folder for each user?
Most likely what you should be doing is storing the name they provide in a database record, which also contains a pointer to the actual file (which uses a file name which you generate, perhaps a guid). You could also consider using the filestream data type if you'd like to save the document in the database as well.
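The idea of keeping the user's name only as metadata can be sketched as follows. StoredFile and Storage are hypothetical types, and persisting the record to a database is left out.

```csharp
using System;

sealed class StoredFile
{
    // Name shown to the user; never used to build a path on disk.
    public string DisplayName { get; private set; }

    // Server-generated name actually used on disk.
    public string StorageName { get; private set; }

    public StoredFile(string displayName, string storageName)
    {
        DisplayName = displayName;
        StorageName = storageName;
    }
}

static class Storage
{
    // Pairs the user-supplied name with a freshly generated GUID-based name.
    public static StoredFile Create(string userSuppliedName)
    {
        return new StoredFile(userSuppliedName, Guid.NewGuid().ToString("N"));
    }
}
```

Because the on-disk name never contains user input, two users uploading a file with the same name no longer collide, and path tampering through the file name has nothing to act on.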
Nothing good can come from letting your users determine file names on your server :)