Constrain a path to be within a folder - C#

Say you want to store a file within a folder C:\A\B\C and let the user supply the file name.
Just combine them, right?
Wrong.
If the user selects something like \..\..\Ha.txt you might be in for a surprise.
So how do we restrict the result to within C:\A\B\C? It's fine if it's within a subfolder, just not above it.

I've used one of my test projects, it really doesn't matter:
Using C# 10:
internal class Program
{
    static void Main(string[] args)
    {
        string template = @"F:\Projectes\Test\SourceGenerators";
        // A relative path is resolved against the current working directory.
        string folder = @"..\..\..\..\Test1.sln";
        Console.WriteLine(MatchDirectoryStructure(template, folder)
            ? "Match"
            : "Doesn't match");
    }

    // Resolves folder to a full path and checks that it begins with template.
    static bool MatchDirectoryStructure(string template, string folder)
        => new DirectoryInfo(folder).FullName.StartsWith(template);
}
As you can see, new DirectoryInfo(folder).FullName returns the fully resolved name of the directory.
From there you can check whether it matches the desired result.
In this case the returned value is:
Match
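A plain StartsWith(template) also accepts a sibling folder that merely shares the prefix (C:\A\B\CD begins with C:\A\B\C). The following is a minimal sketch of a stricter check, not the answerer's code: the PathGuard class and the IsInside method are hypothetical names, and the case-insensitive comparison on Windows is an assumption about the file system. It requires .NET 5 or later for OperatingSystem.IsWindows.

```csharp
using System;
using System.IO;

public static class PathGuard
{
    // Returns true when candidate, resolved against baseDir, stays inside baseDir.
    public static bool IsInside(string baseDir, string candidate)
    {
        // Normalize the base and make sure it ends with a separator,
        // so that "C:\A\B\C" cannot match "C:\A\B\CD".
        string root = Path.GetFullPath(baseDir);
        if (!root.EndsWith(Path.DirectorySeparatorChar))
            root += Path.DirectorySeparatorChar;

        // GetFullPath collapses ".." segments; a rooted candidate replaces the base entirely.
        string full = Path.GetFullPath(Path.Combine(root, candidate));

        // Assumption: Windows paths compare case-insensitively, others case-sensitively.
        StringComparison comparison = OperatingSystem.IsWindows()
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;
        return full.StartsWith(root, comparison);
    }
}
```

Calling PathGuard.IsInside(@"C:\A\B\C", userInput) before writing the file rejects both ..\..\Ha.txt and an absolute path pointing elsewhere. It does not resolve symbolic links or junctions.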

If you're asking for a file name, then it should be just the name of the file. The more control you give to the user about subdirectories, the more they can mess with you.
The idea here is to split your path by both possible slashes (/ and \) and see if any of the entries in the array is "..":
// Requires: using System.Linq;
string input = @"\..\..\Ha.txt";
bool containsBadSegments = input
    .Split(new [] { '/', '\\' })
    .Any(s => s is "..");
This answer only takes care of detecting \..\ in the path. There are plenty of other ways to input bad values, such as characters not allowed by the OS's file system, or absolute or rooted paths.
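If only a bare file name should be accepted, the other cases mentioned above (separators, rooted paths, characters the OS rejects) can be refused in one place. This is a hedged sketch: FileNameCheck and IsPlainFileName are hypothetical names, and the set of invalid characters comes from Path.GetInvalidFileNameChars(), which differs between operating systems.

```csharp
using System;
using System.IO;

public static class FileNameCheck
{
    // Returns true when input looks like a bare file name with no directory part.
    public static bool IsPlainFileName(string input)
    {
        if (string.IsNullOrWhiteSpace(input)) return false;

        // "." and ".." are directory references, not file names.
        if (input == "." || input == "..") return false;

        // Reject both separator styles regardless of the current OS.
        if (input.IndexOfAny(new[] { '/', '\\' }) >= 0) return false;

        // Reject characters the current OS does not allow in file names.
        return input.IndexOfAny(Path.GetInvalidFileNameChars()) < 0;
    }
}
```

On Windows the colon is among the invalid characters, so drive-qualified input such as C:Ha.txt is refused as well.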

Related

How to combine paths by preserving the original path's directory separator in C#?

If we have /TestDir as an example path, yet we are on a Windows machine, using Path.Join, or Path.Combine with NextDir will yield /TestDir\NextDir.
Is there a way to make the combined path use the same separator (Unix or Windows) as the path I'm appending to? That is:
\TestDir with NextDir to yield \TestDir\NextDir.
/TestDir with NextDir to yield /TestDir/NextDir.
The first directory will always be a rooted path, meaning it will always contain the path separator to use. The only edge-case is network paths, as they always start with \\ but after that they differ in Unix/Windows? Correct me if I'm wrong on this.
EDIT: I've been told that : is the path separator for Classic Mac - is this true? I don't see any .NET APIs that treat this as a directory separator.
This will take the first character (either / or \) that it sees and it will replace all other occurrences of / or \ with the first one that it found.
using System;

public class Example
{
    public static void Main()
    {
        char[] separators = { '\\', '/' };
        string path = "/TestDir\\NextDir\\AndTheNext/AndTheNext/AndTheNext\\AndTheNext";
        // The first separator found decides which one the whole path uses.
        int index = path.IndexOfAny(separators);
        if (index >= 0)
        {
            path = path[index] == '\\' ? path.Replace('/', '\\') : path.Replace('\\', '/');
        }
        Console.WriteLine(path);
    }
}
Check it out running here: https://dotnetfiddle.net/fenzWO
The Path class uses a read-only field named DirectorySeparatorChar whose value depends on the OS (PathSeparator is a different field: it separates entries in environment variables such as PATH). Because it cannot be changed, it is easier to write your own class that performs the same actions as Path but lets you choose the separator.
For more information about Path you may read the docs: https://learn.microsoft.com/en-us/dotnet/api/system.io.path?view=net-5.0
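A small helper along those lines might look like the sketch below. SeparatorPreservingPath and its Join method are hypothetical names, not part of .NET. It assumes the first path contains at least one separator (as the question states) and falls back to Path.DirectorySeparatorChar otherwise. Only the appended part is normalized; the first path is left as given.

```csharp
using System;
using System.IO;

public static class SeparatorPreservingPath
{
    // Appends next to basePath using the first separator that appears in basePath.
    public static string Join(string basePath, string next)
    {
        int i = basePath.IndexOfAny(new[] { '/', '\\' });
        char sep = i >= 0 ? basePath[i] : Path.DirectorySeparatorChar;
        char other = sep == '/' ? '\\' : '/';

        // Drop separators at the joint so exactly one is written between the parts.
        string left = basePath.TrimEnd('/', '\\');
        string right = next.Trim('/', '\\').Replace(other, sep);
        return left + sep + right;
    }
}
```

With this, Join("/TestDir", "NextDir") uses forward slashes and Join(@"\TestDir", "NextDir") uses backslashes, whatever OS the code runs on.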

Get the last part of file name in C#

I need to get the last part, meaning the numeric value (318, 319), of the following text (it will vary):
C:\Uploads\X\X-1\37\Misc_318.pdf
C:\Uploads\X\X-1\37\Misc_ 319.pdf
C:\Uploads\X\C-1\37\Misc _ 320.pdf
Once I have that value I need to search the entire folder. When I find the file names with the matching number, I need to remove all spaces and rename the files in that particular folder.
Here is what I want:
First, get the last part of the file name (the number may vary).
Based on that number, search the folder to get all matching file names.
For each file name found, check for spaces and remove them.
Finding the Number
If the naming follows the convention SOMEPATH\SomeText_[Optional spaces]999.pdf, try
var file = System.IO.Path.GetFileNameWithoutExtension(thePath);
string[] parts = file.Split('_');
// int.Parse tolerates the optional leading spaces before the digits.
int number = int.Parse(parts[1]);
Of course, add error checking as appropriate. You may want to check that there are 2 parts after the split, and perhaps use int.TryParse() instead, depending on your confidence that the file names will follow that pattern and your ability to recover if TryParse() returns false.
Constructing the New File Name
I don't fully understand what you want to do once you have the number. However, have a look at Path.Combine() to build a new path if that's what you need, and you can use Directory.GetFiles() to search for a specific file name, or for files matching a pattern, in the desired directory.
Removing Spaces
If you have a file name with spaces in it, and you want all spaces removed, you can do
string newFilename = oldFilename.Replace(" ", "");
Here's a solution using a regex:
var s = @"C:\Uploads\X\X-1\37\Misc_ 319.pdf";
var match = Regex.Match(s, @"^.*?(\d+)(\.\w+)?$");
int i = int.Parse(match.Groups[1].Value);
// do something with i
It should work with or without an extension of any length (as long as it's a single extension, not like my file 123.tar.gz).
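Putting the three steps together, a rough sketch could look like this. NumberedFiles, TrailingNumber and RemoveSpacesFromMatches are hypothetical names, and the sketch assumes the number is the last thing in the file name before the extension. It skips a rename when the target name already exists rather than overwriting it.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class NumberedFiles
{
    // Returns the number that ends the file name (before the extension), or null if there is none.
    public static int? TrailingNumber(string path)
    {
        string name = Path.GetFileNameWithoutExtension(path);
        Match m = Regex.Match(name, @"(\d+)\s*$");
        return m.Success && int.TryParse(m.Groups[1].Value, out int n) ? n : (int?)null;
    }

    // Renames every file in folder whose trailing number equals number,
    // removing all spaces from its name.
    public static void RemoveSpacesFromMatches(string folder, int number)
    {
        foreach (string file in Directory.GetFiles(folder))
        {
            if (TrailingNumber(file) != number) continue;

            string cleaned = Path.GetFileName(file).Replace(" ", "");
            string target = Path.Combine(folder, cleaned);

            // Skip when the name is already clean or the target is taken.
            if (!File.Exists(target))
                File.Move(file, target);
        }
    }
}
```

For example, TrailingNumber applied to the second path in the question yields 319, and RemoveSpacesFromMatches(folder, 319) would rename "Misc_ 319.pdf" to "Misc_319.pdf".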

Why "test user-doc.doc" ==> TESTUS~1.DOC?

I wrote a C# program and associated it with a file extension such as DOC on a PC without MS Office installed. When I double-click any file whose name contains blank characters, my program is launched to open that file. I use the statement below:
string[] args = Environment.GetCommandLineArgs();
and then args[1] contains the full path of that file, so I can open it. The problem is that if the file name contains blank characters, args[1] contains a file name different from the real one. As in the title, if my file is in e:\tmp3 and the file name is test user-doc.doc, I expect args[1] to contain
"e:\tmp3\test user-doc.doc",
but it actually contains
"E:\tmp3\TESTUS~1.DOC"
Could anyone tell me why and how to resolve it? Thanks.
As already mentioned these are 8.3 file names. If you need to convert from a short name to a full name then you can easily do this with C#.
new FileInfo(@"E:\tmp3\TESTUS~1.DOC").FullName
Going the other way requires a P/Invoke call to GetShortPathName. Be aware that this doesn't work on all NTFS volumes, as short names can be turned off, but they are turned on by default for the volume the OS is on.
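using System.Runtime.InteropServices;
using System.Text;

class Program
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int GetShortPathName(string pathName, StringBuilder shortName, int cchBuffer);

    static void Main(string[] args)
    {
        var fullname = args[0];
        // MAX_PATH-sized buffer; if it is too small the function returns the required size.
        var shortPathBuilder = new StringBuilder(260);
        // Pass the buffer capacity, not Length (which is 0 for an empty builder).
        GetShortPathName(fullname, shortPathBuilder, shortPathBuilder.Capacity);
        var shortname = shortPathBuilder.ToString();
    }
}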
You should put double quote marks around the %1 replacement in the shell\open\command registry key. For example:
"C:\Program Files\MyApp\MyApp.exe" "%1"
rather than
"C:\Program Files\MyApp\MyApp.exe" %1
If you don't include the double quote marks, Windows detects that filenames with spaces (or other parameter separators) are unlikely to work, and substitutes the short file name. This is for compatibility with 16-bit Windows programs (the HKCR\shell key was introduced for Windows 3.1).
They are called 8.3 filenames. Basically, they are an alias for the file in the File Allocation Table that shortens the path to the file.
8.3 refers to "8 characters, then a dot, then 3 characters". The three characters are the file extension.
Also, you'll note that TESTUS~1 is 8 characters in length.
As far as I am aware, there isn't really much you can do to stop Windows from doing this. You could format your disk as NTFS, I think (I don't think NTFS is so aggressive with file "aliasing").
The issue is with the space character (the blank one): because of it, test user-doc.doc is treated as two arguments instead of one, and the second half ends up in args[2]. You can work around this by counting the total number of args and concatenating all of them from args[1] to args[n], where n is the last index of args. This way you can avoid the problem.

.net linq with regex ismatch in where

In the following C# method, I know that Directory.GetFiles() does return the list of files, and I can add the Where with Contains(contact), which works. However, for the life of me I cannot determine why searchPattern.IsMatch() fails to find files. I've tested the pattern in http://regexpal.com/ and it works as expected. The namePattern is "^\d{3}(.*).pdf" and there should be a match.
public static List<string> GetFileNames(string pathName, string namePattern, string contact)
{
    var searchPattern = new Regex(namePattern, RegexOptions.IgnoreCase);
    var files = Directory.GetFiles(pathName).Where(f => searchPattern.IsMatch(f));
    //.Where(f => f.Contains(contact));
    return files.ToList();
}
If this is already answered somewhere please let me know, but I've not been able to locate it. I thought it was pretty simple and straightforward.
Directory.GetFiles will return the full file path, which will be Drive\Directory\File.ext. That's why your pattern doesn't seem to match. You need the file name alone as the subject. Try this:
var files = Directory.GetFiles(pathName)
    .Where(f => searchPattern.IsMatch(Path.GetFileName(f)));
Directory.GetFiles() returns a list of filenames appended to the path supplied as a parameter. Your regular expression is "^\d{3}(.*).pdf", that is a string beginning with three digits. If you supplied a string that's an absolute path, it will start with either "/" on Unix or "C:\" on Windows and if it's a relative path, it will start with a directory name. Your code would work if pathName was just an empty string and you were searching the current directory.
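To see the difference without touching the disk, the filter can be pulled out into a function that takes the paths as input. This is an illustrative sketch: FileFilter and MatchFileNames are hypothetical names, and the pattern in the usage note is a tightened variant of the one in the question (escaped dot, anchored end), not the asker's original.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class FileFilter
{
    // Keeps the paths whose file name (not the full path) matches namePattern.
    public static List<string> MatchFileNames(IEnumerable<string> fullPaths, string namePattern)
    {
        var regex = new Regex(namePattern, RegexOptions.IgnoreCase);
        return fullPaths
            .Where(p => regex.IsMatch(Path.GetFileName(p)))
            .ToList();
    }
}
```

Passing the result of Directory.GetFiles(pathName) together with @"^\d{3}.*\.pdf$" keeps a file such as 123report.pdf in any folder, while a file named report.pdf inside a folder called 123dir is left out because only the file name is tested.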

File paths with non-ascii characters and FileInfo in C#

I get a string that more or less looks like this:
"C:\\bláh\\bleh"
I make a FileInfo with it, but when I check for its existence it returns false:
var file = new FileInfo(path);
bool exists = file.Exists; // false
If I manually rename the path to
"C:\\blah\\bleh"
at debug time and ensure that blah exists with a bleh inside it, then file.Exists starts returning true. So I believe the problem is the non-ascii character.
The actual string is built by my program. One part comes from the AppDomain of the application, which is the part that contains the "á"; the other part comes, in a way, from the user. Both parts are put together by Path.Combine. I confirmed the validity of the resulting string in two ways. First, copying it from the error my program generates, which includes the path, into Explorer opens the file just fine. Second, looking at that string in the debugger, it looks correctly escaped, in that \ is written as \\. The "á" is printed literally by the debugger.
How should I process a string so that even if it has non-ascii characters it turns out to be a valid path?
Here is a method that will handle diacritics in filenames. The success of the File.Exists method depends on how your system stores the filename.
public bool FileExists(string sPath)
{
    // Checking for composed and decomposed is to handle diacritics in filenames.
    var pathComposed = sPath.Normalize(NormalizationForm.FormC);
    if (File.Exists(pathComposed))
        return true;

    // We really need to check both possibilities.
    var pathDecomposed = sPath.Normalize(NormalizationForm.FormD);
    if (File.Exists(pathDecomposed))
        return true;

    return false;
}
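Why two checks are needed becomes clearer when the two forms are compared directly: the same visible "á" can be one code unit (composed) or two (a plus a combining accent, decomposed), and an ordinal comparison treats them as different strings. The sketch below only illustrates that; DiacriticForms, SameText and LengthIn are hypothetical helper names.

```csharp
using System;
using System.Text;

public static class DiacriticForms
{
    // True when the two strings are the same text once both are composed (Form C).
    public static bool SameText(string a, string b) =>
        string.Equals(a.Normalize(NormalizationForm.FormC),
                      b.Normalize(NormalizationForm.FormC),
                      StringComparison.Ordinal);

    // Number of UTF-16 code units the string has in the given normalization form.
    public static int LengthIn(string s, NormalizationForm form) =>
        s.Normalize(form).Length;
}
```

For "bl\u00E1h" (composed) and "bla\u0301h" (decomposed), the == operator reports them as different, while SameText reports them as equal. A path that looks identical on screen can therefore miss a file whose name was stored in the other form.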
Try this:
string sourceFile = @"C:\bláh\bleh";
if (File.Exists(sourceFile))
{
    Console.WriteLine("file exist.");
}
else
{
    Console.WriteLine("file does not exist.");
}
Note : The Exists method should not be used for path validation, this method merely checks if the file specified in path exists. Passing an invalid path to Exists returns false.
For path validation you can use Directory.Exists.
I have just manually created a bláh folder containing a bleh file, and with that in place, this code prints True as expected:
using System;
using System.IO;

namespace ConsoleApplication72
{
    class Program
    {
        static void Main(string[] args)
        {
            string filename = "c:\\bláh\\bleh";
            FileInfo fi = new FileInfo(filename);
            Console.WriteLine(fi.Exists);
            Console.ReadLine();
        }
    }
}
I would suggest checking the source of your string - in particular, although your 3k rep speaks against this being the problem, remember that expressing a backslash as \\ is an artifact of C# syntax, and you want to make sure your string actually contains only single \s.
Referring to @adatapost's reply, the list of invalid file name characters (gleaned from System.IO.Path.GetInvalidFileNameChars()) in fact doesn't contain normal characters with diacritics.
It looks like the question you're really asking is, "How do I remove diacritics from a string (or in this case, file path)?".
Or maybe you aren't asking this question, and you genuinely want to find a file with name:
c:\blòh\bleh
(or something similar). In that case, you then need to try to open a file with the same name, and not c:\bloh\bleh.
It looks like the "bleh" in the path is a directory, not a file. To check if the folder exists, use the Directory.Exists method.
The problem was: the program didn't have enough permissions to access that file. Fixing the permissions fixed the problem. It seems that when I did my experiment I somehow managed to reproduce the permission problem, possibly by creating the folder without the non-ascii character by hand and copying the other one.
Oh... so embarrassing.
