I have a variable in code that can have file path or url as value. Examples:
http://someDomain/someFile.dat
file://c:\files\someFile.dat
c:\files\someFile.dat
So there are two ways to represent a file and I can't ignore any of them.
What is the correct name for such a variable: path, url, location?
I'm using a 3rd party api so I can't change semantics or separate to more variables.
The first two are URLs; the third is a file path. Of course, a file:// URL also just refers to a file.
When using the Uri class, you can use the IsFile and LocalPath properties to handle file:/// URIs, and in that case you should name the variable accordingly (for example, uri).
Personally, I'd call the variable in question "fileName"
In fact, the formal URL form is file:///c:/files/someFile.dat (older software also wrote the drive as c|).
URLs like these start with a scheme followed by ://, then the path and names, with '/' as the separator.
Windows software such as IE sometimes uses '\' in place of '/', but the formal separator is '/'.
Pick one representation to use internally from the start. If you need to support URLs, use URLs internally everywhere, and have every method that can set the variable check whether it received a file path and, if so, coerce it to a URL immediately.
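A minimal sketch of that coercion, assuming System.Uri is the internal representation. The class and method names (LocationNormalizer, ToUri) are hypothetical, and falling back to the current directory for relative paths is an assumption rather than something the answer prescribes.

```csharp
using System;
using System.IO;

public static class LocationNormalizer
{
    // Hypothetical helper: accepts either a URL or a file path and returns a Uri.
    public static Uri ToUri(string location)
    {
        if (string.IsNullOrWhiteSpace(location))
            throw new ArgumentException("Location must not be empty.", nameof(location));

        // Absolute URLs (http://, file://, ...) and rooted paths such as c:\files\x.dat parse directly.
        if (Uri.TryCreate(location, UriKind.Absolute, out Uri uri))
            return uri;

        // Assumption: anything else is a relative file path, resolved against the current directory.
        return new Uri(Path.GetFullPath(location));
    }
}
```

Callers can then branch on the IsFile property of the returned Uri instead of inspecting string prefixes.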
If the values are not opaque to your application you may find it better to model them as a class. Otherwise, whenever you are going to act upon the values you may find yourself writing code like this:
if (variable.StartsWith("http://") || variable.StartsWith("file://")) {
    // Handle URL
}
else {
    // Handle file path
}
You may fold some of the functionality for handling the values into your class, but it is probably better to treat it as an immutable value type.
Use a descriptive name for your class like FileLocation or whatever fits your nomenclature. It will then be very natural to declare FileLocation variables named fileLocation or inputFileLocation or even fl if you are sloppy.
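One possible shape for such a class, as a sketch only. The name FileLocation comes from the suggestion above; wrapping System.Uri, the IsLocalFile property, and the equality members are assumptions made for illustration.

```csharp
using System;

// Hypothetical immutable value type that hides the URL-versus-path distinction.
public sealed class FileLocation : IEquatable<FileLocation>
{
    private readonly Uri _uri;

    public FileLocation(string value)
    {
        // Accepts absolute URLs and rooted file paths; rejects everything else.
        if (!Uri.TryCreate(value, UriKind.Absolute, out Uri parsed))
            throw new ArgumentException("Expected an absolute URL or a rooted file path.", nameof(value));
        _uri = parsed;
    }

    // True for file:// URLs and plain file paths, false for http:// and similar schemes.
    public bool IsLocalFile => _uri.IsFile;

    // Value equality, so two instances built from the same location compare equal.
    public bool Equals(FileLocation other) => other != null && _uri.Equals(other._uri);
    public override bool Equals(object obj) => Equals(obj as FileLocation);
    public override int GetHashCode() => _uri.GetHashCode();
    public override string ToString() => _uri.AbsoluteUri;
}
```

With a type like this, the prefix check shown earlier reduces to a single test of fileLocation.IsLocalFile.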
If the path you are using includes the "file://" scheme, then it is in fact a URL.
This is the simple C# code I wrote:
string file = @"D:\test(2021/02/10).docx";
var fileName = Path.GetFileNameWithoutExtension(file);
Console.WriteLine(fileName);
I expected to get the string "test(2021/02/10)", but I got "10)" instead.
How can I solve such a problem?
I just wonder why you would want such behavior. On Windows, slashes are treated as separators between a directory and a subdirectory (or file).
So, basically, you cannot create a file with such a name.
And since slashes are treated this way, it is natural that the method implementation checks what comes after the last slash and extracts the file name from that.
If you are interested in how the method is implemented, take a look at the source code.
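If the string is really just a label that happens to look like a path, one workaround is to split on the backslash only. This is a sketch under that assumption; WindowsNames and NameWithoutExtension are hypothetical names, and the result is not a valid Windows file name.

```csharp
using System;

public static class WindowsNames
{
    // Treats only '\' as a directory separator, so '/' inside the name is preserved.
    public static string NameWithoutExtension(string path)
    {
        int lastBackslash = path.LastIndexOf('\\');
        string name = path.Substring(lastBackslash + 1);

        // Drop everything from the last dot onward, if there is one.
        int lastDot = name.LastIndexOf('.');
        return lastDot >= 0 ? name.Substring(0, lastDot) : name;
    }
}
```

For the string in the question, WindowsNames.NameWithoutExtension(@"D:\test(2021/02/10).docx") returns "test(2021/02/10)".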
I asked a similar question recently about using regex to retrieve a URL or folder path from a string. I was looking at this comment by Dour High Arch, where he says:
"I recommend you do not use regexes at all; use separate code paths
for URLs, using the Uri class, and file paths, using the FileInfo
class. These classes already handle parsing, matching, extracting
components, and so on."
I never really tried this, but now I am looking into it and can't figure out if what he said actually is useful to what I'm trying to accomplish.
I want to be able to parse a string message that could be something like:
"I placed the files on the server at http://www.thewebsite.com/NewStuff, they can also
be reached on your local network drives at J:\Downloads\NewStuff"
And extract out the two strings http://www.thewebsite.com/ and J:\Downloads\NewStuff. I don't see any methods on the Uri or FileInfo class that parse a Uri or FileInfo object from a string like I think Dour High Arch was implying.
Is there something I'm missing about using the Uri or FileInfo class that will allow this behavior? If not is there some other class in the framework that does this?
I'd say the easiest way is splitting the strings into parts first.
The first delimiter would be spaces, to get each word; the second would be quotes (double and single).
Then use Uri.IsWellFormedUriString on each token.
So something like:
foreach (var part in someRandomText.Split(new char[] { '\'', '"', ' ' }))
{
    if (Uri.IsWellFormedUriString(part, UriKind.RelativeOrAbsolute))
        doSomethingWith(part);
}
I just saw that Uri.IsWellFormedUriString may be a bit too strict to suit your needs.
It returns false if www.Whatever.com is missing the http://
It was not clear from your earlier question that you wanted to extract URL and file path substrings from larger strings. In that case, neither Uri.IsWellFormedUriString nor Regex.Match will do what you want. Indeed, I do not think any simple method can do what you want, because you will have to define rules for ambiguous strings like httX://wasThatAUriScheme/andAre/these part/of/aURL or/are they/separate.strings?andIsThis%20a%20Param?
My suggestion is to define a recursive descent parser and create states for each substring you need to distinguish.
You can use:
(?<type>[^ ]+?:)(?<path>//[^ ]*|\\.+\\[^ ]*)
that will give you 2 groups on each result
type : "http:"
path : //www.thewebsite.com/NewStuff
and
type : "J:"
path : \Downloads\NewStuff
out of the string
"I placed the files on the server at
http://www.thewebsite.com/NewStuff, they can also be reached on your
local network drives at J:\Downloads\NewStuff"
You can use the "type" group to check whether the type is http: or not and act on that.
EDIT
Or use the regex below if you are sure there is no whitespace in your file path:
(?<type>[^ ]+?:)(?<path>//[^ ]*|\\[^ ]*)
Try \w+:\S+ and see how well that fits your purposes.
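A sketch of how that last pattern could be applied, with each candidate then validated through Uri.TryCreate. LocationExtractor and Extract are hypothetical names, trimming trailing punctuation is an assumption about typical prose, and the approach remains a heuristic: it will not find paths that contain spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LocationExtractor
{
    // The pattern suggested above: word characters, a colon, then non-whitespace.
    private static readonly Regex Candidate = new Regex(@"\w+:\S+");

    public static List<string> Extract(string message)
    {
        var results = new List<string>();
        foreach (Match match in Candidate.Matches(message))
        {
            // Assumption: punctuation at the end belongs to the sentence, not the location.
            string candidate = match.Value.TrimEnd(',', '.', ';', ')', '"', '\'');

            // Keep only candidates that parse as an absolute URL or a rooted file path.
            if (Uri.TryCreate(candidate, UriKind.Absolute, out _))
                results.Add(candidate);
        }
        return results;
    }
}
```

For the sample message in the question, this yields http://www.thewebsite.com/NewStuff and J:\Downloads\NewStuff.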
In the file upload function I am working on, one important issue is determining the path where I save the image the user uploaded. In the following code, I already specify a folder inside the web application folder for saving the uploaded files. My instructor told me that I still have a security hole in the following lines in the code-behind of the ASP.NET UploadFile control, and I don't know why!
string path = @"~\Images\";
string comPath = Server.MapPath(path + "\\" + FileUpload1.FileName);
Could you please tell me how to prevent this kind of security hole?
UPDATE:
Could anyone tell me how to avoid this kind of security hole? I am still trying to find something to make these two lines secure.
My instructor told me that I still have a security hole with these following lines in the code-behind of asp.net UploadFile control and I don't know why!!!
Imagine what would happen if the client sends something like ..\..\foo.jpg as the file name.
A fairly simple fix is to make sure the file name has no \ or / characters in it, by stripping away everything up to and including the last such character. (This is a good idea anyway, since browsers vary in their treatment of upload file names: some will send the full path, some just the plain file name, and some may even send a fake path instead.)
The easy way to do this is with a Regex:
string fileName = Regex.Replace( FileUpload1.FileName, @"(?s)^.*[\\/]", "" );
Note that, depending on your OS and file system, there are other characters that could potentially cause problems as well. The safest option, in general, is to strip away all characters except those you know are safe, something like this:
string safeFileName = Regex.Replace( fileName, @"[^A-Za-z0-9\-._() ]", "" );
You may also want to further ensure that the file name has an extension that matches the type you expect the file to have.
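Putting these steps together, here is a sketch that also verifies the final path stays inside the upload folder. UploadPaths and BuildSafePath are hypothetical names, and uploadRoot stands in for whatever Server.MapPath returns for the images folder; the final containment check is an extra precaution not mentioned in the answer.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class UploadPaths
{
    public static string BuildSafePath(string uploadRoot, string clientFileName)
    {
        // Drop any directory part the client sent (everything up to the last \ or /).
        string name = Regex.Replace(clientFileName, @"(?s)^.*[\\/]", "");
        // Keep only characters from the allow-list.
        name = Regex.Replace(name, @"[^A-Za-z0-9\-._() ]", "");

        // Reject names that are empty or consist only of dots and spaces (".", "..").
        if (name.Trim('.', ' ').Length == 0)
            throw new ArgumentException("File name is empty after sanitizing.", nameof(clientFileName));

        // Normalize the root so it ends with exactly one directory separator.
        string root = Path.GetFullPath(uploadRoot).TrimEnd(Path.DirectorySeparatorChar)
                      + Path.DirectorySeparatorChar;
        string fullPath = Path.GetFullPath(Path.Combine(root, name));

        // Extra precaution: the resolved path must still be inside the upload folder.
        if (!fullPath.StartsWith(root, StringComparison.Ordinal))
            throw new InvalidOperationException("Resolved path escapes the upload folder.");

        return fullPath;
    }
}
```

With this helper, a client-supplied name of ..\..\foo.jpg resolves to foo.jpg inside the upload folder rather than two directories above it.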
In my application I build a static string when a user uploads or downloads a file. The file name in that string is passed in from the frontend. This means a user could send something like ..\..\another file.file to tamper with the path and get data from other users. Therefore I need to filter the file name I receive to prevent this. Which characters need to be filtered to prevent tampering? So far I filter the double dot and the back and forward slashes. Is there anything else I should take into consideration? Is there maybe a standard way to do this in C#?
I would suggest using Path.GetInvalidFileNameChars:
public static bool IsValidFileName(string fileName)
{
return fileName.IndexOfAny(Path.GetInvalidFileNameChars()) == -1;
}
.. is typically only dangerous when preceded and/or succeeded by a \ or /, both of which are included in the array returned by GetInvalidFileNameChars. By itself, .. is harmless (unless you’re specifically resolving directory paths), and you shouldn’t forbid it since people might want to introduce ellipses in their filename (e.g. The A...Z of Programming.pdf).
What if different users save a file with the same name? Are you creating a folder for each user?
Most likely what you should be doing is storing the name they provide in a database record, which also contains a pointer to the actual file (which uses a file name which you generate, perhaps a guid). You could also consider using the filestream data type if you'd like to save the document in the database as well.
Nothing good can come from letting your users determine file names on your server :)
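A sketch of that idea: the user's name is kept purely as data, and the name used on disk is generated by the server. StoredFile and its members are hypothetical; how the record is persisted (database row, filestream column) is left open.

```csharp
using System;

// Hypothetical record pairing the user's display name with a server-generated storage name.
public sealed class StoredFile
{
    // The name the user supplied; stored as data only, never used to build a path.
    public string OriginalName { get; }

    // The name used on disk: 32 hex characters from a new GUID, never user-controlled.
    public string StorageName { get; }

    public StoredFile(string originalName)
    {
        OriginalName = originalName;
        StorageName = Guid.NewGuid().ToString("N");
    }
}
```

Because StorageName never contains user input, two users uploading a file with the same name do not collide, and path tampering through the file name is no longer possible.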
I'm downloading files from the Internet inside my application. I'm dealing with multiple file types, so I need to be able to detect what type a file is before my application can continue. The problem I ran into is that some of the URLs the files are downloaded from contain extra parameters.
For example:
http://www.myfaketestsite.com/myaudio.mp3?id=20
Originally I was using String.EndsWith(). Obviously this doesn't work anymore. Any idea on how to detect the file type?
Wrap the URL in a Uri class. It will split it up into different segments that you can use, or you can use the helper methods on the Uri class itself:
var uri = new Uri("http://www.myfaketestsite.com/myaudio.mp3?id=20");
string path = uri.GetLeftPart(UriPartial.Path);
// path = "http://www.myfaketestsite.com/myaudio.mp3"
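To get from there to the file type, the extension can be read from the path portion of the Uri. A minimal sketch, assuming the extension in the URL is what identifies the type; UrlFileTypes and GetExtension are hypothetical names.

```csharp
using System;
using System.IO;

public static class UrlFileTypes
{
    // Returns the lower-cased extension of the URL's path, ignoring any query string.
    public static string GetExtension(string url)
    {
        var uri = new Uri(url);
        // AbsolutePath excludes the query string, e.g. "/myaudio.mp3".
        return Path.GetExtension(uri.AbsolutePath).ToLowerInvariant();
    }
}
```

For the example URL, UrlFileTypes.GetExtension("http://www.myfaketestsite.com/myaudio.mp3?id=20") returns ".mp3".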
Your question is a duplicate of:
Truncating Query String & Returning Clean URL C# ASP.net
Get url without querystring
You could always split on the question mark to eliminate the parameters. e.g.
string s = "http://www.myfaketestsite.com/myaudio.mp3?id=20";
string withoutQueryString = s.Split('?')[0];
If no question mark exists, it won't matter, as you'll still be grabbing the value from the zero index. You can then do your logic on the withoutQueryString string.