Split path by "\\" in C#

How can I split a path by "\\"? It gives me a syntax error if I use
path.split("\\");

You should be using
path.Split(Path.DirectorySeparatorChar);
if you're trying to split a file path based on the native path separator.

Try path.Split('\\'), with single quotes so that the argument is a character literal.
To use a string this works:
path.Split(new[] {"\\"}, StringSplitOptions.None)
To use a string you have to specify an array of strings. I never did get why :)

There's no string.Split overload which takes a single string, at least in .NET Framework (newer .NET versions add one). Also, C# is case-sensitive, so you need Split rather than split. However, you can use:
string[] bits = path.Split('\\');
which will use the overload taking a params char[] parameter. It's equivalent to:
string[] bits = path.Split(new char[] { '\\' });
That's assuming you definitely want to split by backslashes. You may want to split by the directory separator for the operating system you're running on, in which case Path.DirectorySeparatorChar would probably be the right approach... it will be / on Unix and \ on Windows. On the other hand, that wouldn't help you if you were trying to parse a Windows file system path in an ASP.NET page running on Unix. In other words, it depends on your context :)
Another alternative is to use the methods on Path and DirectoryInfo to get information about paths in more file-system-sensitive ways.
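As a rough sketch of that last suggestion, the components of a path can be collected with Path.GetFileName and Path.GetDirectoryName instead of Split. The PathWalker class and its Components method are hypothetical names made up for this illustration, and the root of a rooted path (such as C:\ or /) is simply dropped here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PathWalker
{
    // Hypothetical helper: walks up the path one level at a time and
    // returns the components in their original order.
    public static List<string> Components(string path)
    {
        var parts = new List<string>();
        string current = path;

        // GetDirectoryName returns "" for a bare name and null for a root,
        // so either value ends the loop.
        while (!string.IsNullOrEmpty(current))
        {
            string name = Path.GetFileName(current);
            if (name.Length > 0)
            {
                parts.Add(name); // skip the empty name produced by a trailing separator or a root
            }
            current = Path.GetDirectoryName(current);
        }

        parts.Reverse();
        return parts;
    }
}
```

Because the Path methods decide what counts as a separator, the result depends on the platform the code runs on, which is the point of this approach.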

To be on the safe side, you could use:
path.Split(new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar });

On Windows, forward slashes are also accepted, both by the C# Path functions and on the command line (in Windows 7/XP at least).
For example, both of these produce the same results for me:
dir "C:/Python33/Lib/xml"
dir "C:\Python33\Lib\xml"
(In C:)
dir "Python33/Lib/xml"
dir "Python33\Lib\xml"
On Windows, neither '/' nor '\' is a valid character in a file name. On Linux, '\' is allowed in file names, so be aware of this if you parse paths for both.
So if you wanted to support paths in both forms (like I do) you could do:
path.Split(new char[] {'/', '\\'});
On Linux it would probably be safer to use Path.DirectorySeparatorChar.
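A small sketch of the "support both forms" approach, assuming the input is a Windows-style path that may mix the two slashes. SlashSplitter and SplitEitherSlash are hypothetical names used only for this example.

```csharp
using System;

public static class SlashSplitter
{
    // Both separator styles, allocated once and reused.
    private static readonly char[] Separators = { '/', '\\' };

    // Splits on either slash, regardless of the platform the code runs on.
    public static string[] SplitEitherSlash(string path)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));
        return path.Split(Separators);
    }
}
```

For "C:/Python33\\Lib/xml" this gives C:, Python33, Lib and xml. The Linux caveat above still applies: a file that is really named odd\name.txt would be cut into two parts.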

path.Split(new char[] { '\\' });

Better just use the existing class System.IO.Path, so you don't need to care for any system specifications.
It provides methods to access any part of a file path like GetFileName(string path) etc.

A complete solution could look like this:
private static readonly char[] pathSeps = new char[] {
    Path.DirectorySeparatorChar,
    Path.AltDirectorySeparatorChar,
    Path.VolumeSeparatorChar,
};
///<summary>Split a path according to the file system rules.</summary>
public static string[] SplitPath( string path ) {
    if ( null == path ) return null;
    return path.Split( pathSeps, StringSplitOptions.RemoveEmptyEntries );
}
Some of the other proposed solutions in this article use the syntax:
path.Split(new char[] {'/', '\\'});
Although this will work, it has several disadvantages:
It does not let your application adapt to different target platforms. Currently, .NET applications basically run on UNIX-like and Windows operating systems (Windows, macOS, iOS, Linux variants), so there is a fixed set of path separator characters. But this might change if .NET is ported to other operating systems, so it is best to use the predefined constants.
Performance of the inline syntax is worse. This might not matter for a handful of files, but when working with millions of files the difference is noticeable. Managed memory grows until the next GC. In the generated assembly code you will find "call CORINFO_HELP_NEWARR_1_VC" for each 'new' expression, even in Release mode. This happens whenever you new up an array, because arrays are mutable, so a fresh one has to be created each time. My proposed solution avoids this by declaring the array as static readonly.
Reusability of the inline syntax is also worse, because you might want to use the path separator array in other contexts.
StringSplitOptions.RemoveEmptyEntries should be used to account for UNC paths and possible typos in the incoming path. Paths should not contain duplicate path separators, but there might be a typo from the user or a doubled separator left over from concatenation, for example when joining the path and the file name.
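To make the last point concrete, here is a minimal comparison of the two StringSplitOptions values on a UNC-style path with an accidental doubled separator. UncSplitDemo, KeepEmpty and DropEmpty are hypothetical names for this sketch, and it splits on backslashes only to keep the result identical on every platform.

```csharp
using System;

public static class UncSplitDemo
{
    // Static readonly so the separator array is allocated only once.
    private static readonly char[] Backslash = { '\\' };

    // Every separator produces a boundary, so "\\" yields empty entries.
    public static string[] KeepEmpty(string path)
    {
        return path.Split(Backslash, StringSplitOptions.None);
    }

    // Empty entries from leading or doubled separators are discarded.
    public static string[] DropEmpty(string path)
    {
        return path.Split(Backslash, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For @"\\server\share\\folder", KeepEmpty returns six entries (three of them empty) while DropEmpty returns server, share and folder. Note that dropping the empty entries also discards the information that the path started with \\, so the result alone no longer tells you it was a UNC path.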

Related

How to extract preceding strings in a given directory

The folder names are variable but I have this constant value in the directory - the "distributions" folder.
How can I extract the all the strings before the "distributions" folder?
> /<root>/win/<usr>/distributions/<dbms>/<repository>/<port
> type>/<remote system>/<port>
Currently I'm doing it in lengthy way (e.g. getting the length of the whole directory, finding the location of distributions word in the string, etc...).
I'm looking for a more elegant way. Could this be done using Regex, or a shorter version of my current implementation?
string.Split followed by TakeWhile can help you.
var resultArray = str.Split(new []{@"/"},StringSplitOptions.RemoveEmptyEntries)
    .TakeWhile(x=>!x.Equals("distributions"));
Output
<root>
win
<usr>
Update based on comments
If you need the entire path before "distributions", you can use
var result = str.Split(new []{@"distributions"},StringSplitOptions.RemoveEmptyEntries)
    .First();
Output
/<root>/win/<usr>/
string.Split('/') will put each "component" of the path (or any string) in an array, splitting on the delimiter (/ here). You could then loop through it.
Assuming you want the path up to that point, I would recommend a regex. Here is how I would do it:
Regex regex = new Regex(@".+?(?=distributions)");
Debug.WriteLine(regex.Match("/<root>/win/<usr>/distributions/<dbms>/<repository>").Value);
This outputs
/<root>/win/<usr>/
What is the problem with the good old way?
var s = "/<root>/win/<usr>/distributions/<dbms>/<repository>/<port.....";
var result = s.Substring(0, s.IndexOf("distributions"));
or s.Substring(0, s.IndexOf("/distributions/") + 1) if that text might appear in another form too...
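One thing the "good old way" needs in practice is a guard, because IndexOf returns -1 when the folder is missing and Substring then throws. Here is a hedged sketch; PrefixExtractor and BeforeFolder are hypothetical names, and returning null for "not found" is just one possible convention.

```csharp
using System;

public static class PrefixExtractor
{
    // Returns everything up to and including the slash in front of the
    // given folder, or null when the folder does not occur in the path.
    public static string BeforeFolder(string path, string folderName)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));
        if (folderName == null) throw new ArgumentNullException(nameof(folderName));

        // Searching for "/name/" avoids matching names that merely
        // contain the text, such as "distributions-old".
        int index = path.IndexOf("/" + folderName + "/", StringComparison.Ordinal);
        return index < 0 ? null : path.Substring(0, index + 1);
    }
}
```

With the path from the question this returns /<root>/win/<usr>/. A path that ends in /distributions with no trailing slash is not matched by this version, so add that case if you need it.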

Why is Path.Combine better than "\\"?

Over the years, and again recently, I have heard discussion that everyone should use Path.Combine instead of joining strings together with "\\", for example:
string myFilePath = Path.Combine("c:", "myDoc.txt");
// vs.
string myFilePath = "C:" + "\\myDoc.txt";
I'm failing to see the benefit that the former version provides over the latter and I was hoping someone could explain.
Building a path with Path.Combine is more readable and less error-prone. You don't need to think about directory separator chars(\\ or \ or / on unix, ...) or if the first part of the path does or does not end in \ and whether the second part of the path does or does not start with \.
You can concentrate on the important part, the directories and filenames. It's the same advantage that String.Format has over string concatenation.
When you don't know the first directory (e.g. it comes from user input), you may have C:\Directory or C:\Directory\ and Path.Combine will solve the trailing slash problem for you. It does have quirks with leading slashes for the next arguments though.
Second, while usually not a problem for most applications, with Path.Combine you aren't hard-coding the platform directory separator. For an application that can be deployed to other operating systems than windows this is convenient.
Other platforms can use a different separator, for example / instead of \, so the reason for not using \\ is to be OS-independent.
In this case, it does not really matter, but why don't you just write:
string myFilePath = "C:\\myDoc.txt";
The Path.Combine() method is useful if you are working with path variables and you don't want to check for backslashes (or whatever slashing required, depending on the platform):
string myFilePath = Path.Combine(path, filename);
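To illustrate both the trailing-separator handling and the leading-slash quirk mentioned above, here is a small sketch. SafeCombine and CombineRelative are hypothetical names. Stripping leading separators is an assumption about what you want, and it does not handle a second argument that is rooted in some other way (for example one starting with a drive letter on Windows).

```csharp
using System;
using System.IO;

public static class SafeCombine
{
    // Path.Combine returns the second argument unchanged when it is rooted,
    // so a leading separator on 'relative' would silently discard 'directory'.
    // This helper strips leading separators first.
    public static string CombineRelative(string directory, string relative)
    {
        if (directory == null) throw new ArgumentNullException(nameof(directory));
        if (relative == null) throw new ArgumentNullException(nameof(relative));

        string trimmed = relative.TrimStart(
            Path.DirectorySeparatorChar,
            Path.AltDirectorySeparatorChar);
        return Path.Combine(directory, trimmed);
    }
}
```

Path.Combine("dir", "file.txt") and Path.Combine with a trailing separator on "dir" give the same result, whereas Path.Combine("dir", "/file.txt") on Unix (or "\file.txt" on Windows) returns just the second argument. CombineRelative keeps the directory in that last case.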

Parsing a string to extract a URL or folder path

I asked a similar question recently about using regex to retrieve a URL or folder path from a string. I was looking at this comment by Dour High Arch, where he says:
"I recommend you do not use regexes at all; use separate code paths
for URLs, using the Uri class, and file paths, using the FileInfo
class. These classes already handle parsing, matching, extracting
components, and so on."
I never really tried this, but now I am looking into it and can't figure out if what he said actually is useful to what I'm trying to accomplish.
I want to be able to parse a string message that could be something like:
"I placed the files on the server at http://www.thewebsite.com/NewStuff, they can also
be reached on your local network drives at J:\Downloads\NewStuff"
And extract out the two strings http://www.thewebsite.com/ and J:\Downloads\NewStuff. I don't see any methods on the Uri or FileInfo class that parse a Uri or FileInfo object from a string like I think Dour High Arch was implying.
Is there something I'm missing about using the Uri or FileInfo class that will allow this behavior? If not is there some other class in the framework that does this?
I'd say the easiest way is splitting the strings into parts first.
The first delimiter would be spaces, to get each word; the second would be quotes (double and single).
Then use Uri.IsWellFormedUriString on each token.
So something like:
foreach (var part in someRandomText.Split(new char[] { '\'', '"', ' ' }))
{
    // UriKind.Absolute, because almost any word counts as a well-formed relative URI
    if (Uri.IsWellFormedUriString(part, UriKind.Absolute))
        doSomethingWith(part);
}
I just saw that Uri.IsWellFormedUriString may be a bit too strict to suit your needs.
It returns false if www.Whatever.com is missing the http://
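Building on the tokenizing idea, here is a sketch that classifies each token instead of relying on IsWellFormedUriString alone. It uses Uri.TryCreate for web URLs and a plain character check for drive-letter paths. LinkFinder, TokenKind, Tokens and Classify are hypothetical names, the set of trailing punctuation to trim is an assumption, and file paths containing spaces will be cut at the first space.

```csharp
using System;
using System.Collections.Generic;

public enum TokenKind { Other, WebUrl, DrivePath }

public static class LinkFinder
{
    private static readonly char[] TokenSeparators = { ' ', '\t', '\r', '\n', '"', '\'' };
    private static readonly char[] TrailingPunctuation = { ',', '.', ';', ')' };

    // Splits the message into candidate tokens and trims sentence punctuation.
    public static IEnumerable<string> Tokens(string message)
    {
        foreach (string raw in message.Split(TokenSeparators, StringSplitOptions.RemoveEmptyEntries))
        {
            yield return raw.TrimEnd(TrailingPunctuation);
        }
    }

    public static TokenKind Classify(string token)
    {
        // Drive-letter path such as J:\Downloads (checked by hand so the
        // result does not depend on how Uri treats DOS paths per platform).
        if (token.Length >= 3 && char.IsLetter(token[0]) && token[1] == ':' && token[2] == '\\')
        {
            return TokenKind.DrivePath;
        }

        // Absolute http/https URL.
        if (Uri.TryCreate(token, UriKind.Absolute, out Uri uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps))
        {
            return TokenKind.WebUrl;
        }

        return TokenKind.Other;
    }
}
```

On the message from the question, the only WebUrl token is http://www.thewebsite.com/NewStuff and the only DrivePath token is J:\Downloads\NewStuff.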
It was not clear from your earlier question that you wanted to extract URL and file path substrings from larger strings. In that case, neither Uri.IsWellFormedUriString nor Regex.Match will do what you want. Indeed, I do not think any simple method can do what you want because you will have to define rules for ambiguous strings like httX://wasThatAUriScheme/andAre/these part/of/aURL or/are they/separate.strings?andIsThis%20a%20Param?
My suggestion is to define a recursive descent parser and create states for each substring you need to distinguish.
You can use:
(?<type>[^ ]+?:)(?<path>//[^ ]*|\\.+\\[^ ]*)
that will give you 2 groups on each result
type : "http:"
path : //www.thewebsite.com/NewStuff
and
type : "J:"
path : \Downloads\NewStuff
out of the string
"I placed the files on the server at
http://www.thewebsite.com/NewStuff, they can also be reached on your
local network drives at J:\Downloads\NewStuff"
You can use the "type" group to check whether the type is http: or not and act on that.
EDIT
Or use the regex below if you are sure there is no whitespace in your file path:
(?<type>[^ ]+?:)(?<path>//[^ ]*|\\[^ ]*)
Try \w+:\S+ and see how well that fits your purposes.
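A quick way to "see how well that fits" is to run the pattern over some sample text. In the sketch below, SchemeLikeMatcher and Find are hypothetical names, and trimming trailing punctuation is an assumption added because \S+ also swallows the comma after the URL.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SchemeLikeMatcher
{
    // "word characters, a colon, then anything up to the next whitespace"
    private static readonly Regex Candidate = new Regex(@"\w+:\S+");

    public static List<string> Find(string text)
    {
        var found = new List<string>();
        foreach (Match match in Candidate.Matches(text))
        {
            // \S+ is greedy, so sentence punctuation gets captured too.
            found.Add(match.Value.TrimEnd(',', '.', ';'));
        }
        return found;
    }
}
```

On the message from the question this finds the URL and the J:\ path. It is loose, though: text such as "12:30" matches as well, so the candidates still need checking afterwards.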

what is the best way to split a string

I have file names which look like
Directory\name-secondName-blabla.txt
If I use string.Split, my code needs to know the separator I am using, but if some day I replace the separator, my code will break.
Is there any built-in way to split it to get the following result?
Directory
name
secondName
blabla
txt
Thanks
Edit: My question is more general than just splitting a file name; it is about splitting strings in general.
The best way to split a filename is to use System.IO.Path
You're not clear about what to do with directory1\directory2\ ,
but in general you should use this static class to find the path, name and suffix parts.
After that you will need String.Split() to handle the - separators, you'll just have to make the separator(s) a config setting.
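Roughly, that two-step approach could look like the sketch below: System.IO.Path takes care of the directory and the extension, and only the name separators are passed in (from a config setting, for instance). FileNameParts and SplitAll are hypothetical names, and the directory is returned as a single entry even if it has several levels.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileNameParts
{
    // nameSeparators would typically come from configuration.
    public static List<string> SplitAll(string path, char[] nameSeparators)
    {
        var parts = new List<string>();

        // Directory part, if there is one.
        string directory = Path.GetDirectoryName(path);
        if (!string.IsNullOrEmpty(directory))
        {
            parts.Add(directory);
        }

        // File name without extension, split on the configured separators.
        string name = Path.GetFileNameWithoutExtension(path);
        parts.AddRange(name.Split(nameSeparators, StringSplitOptions.RemoveEmptyEntries));

        // Extension without the leading dot.
        string extension = Path.GetExtension(path);
        if (extension.Length > 0)
        {
            parts.Add(extension.TrimStart('.'));
        }

        return parts;
    }
}
```

Keep in mind that a backslash is only treated as a directory separator by Path on Windows, so on other platforms the example path from the question would have to use the native separator.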
You can make an array with separators:
string value = @"Directory\name-secondName-blabla.txt";
char[] delimiters = new char[] { '\\', '-', '.' };
string[] parts = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
var filepath = @"Directory\name-secondName-blabla.txt";
var tokens = filepath.Split(new[]{'\\', '-'});
If you're worried about your separator token changing in the future, set it as a constant in a settings file so you only have to change it in one place. Or, if you think it is going to change regularly, put it in a config file so you don't have to release new builds every time.
As Henk suggested above, use System.IO.Path and its static methods like GetFileNameWithoutExtension, GetDirectoryName, etc. Have a look at this link:
http://msdn.microsoft.com/en-us/library/system.io.path.aspx

C# - Regex for file paths e.g. C:\test\test.exe

I am currently looking for a regex that can help validate a file path e.g.:
C:\test\test2\test.exe
I decided to post this answer which does use a regular expression.
^(?:[a-zA-Z]\:|\\\\[\w\.]+\\[\w.$]+)\\(?:[\w]+\\)*\w([\w.])+$
Works for these:
\\test\test$\TEST.xls
\\server\share\folder\myfile.txt
\\server\share\myfile.txt
\\123.123.123.123\share\folder\myfile.txt
c:\folder\myfile.txt
c:\folder\myfileWithoutExtension
Edit: Added example usage:
if (Regex.IsMatch (text, @"^(?:[a-zA-Z]\:|\\\\[\w\.]+\\[\w.$]+)\\(?:[\w]+\\)*\w([\w.])+$"))
{
    // Valid
}
Edit: This is an approximation of the paths you could see. If possible, it is probably better to use the Path class or FileInfo class to see if a file or folder exists.
I would recommend using the Path class instead of a Regex if your goal is to work with filenames.
For example, you can call Path.GetFullPath to "verify" a path, as it will raise an ArgumentException if the path contains invalid characters, as well as other exceptions if the path is too long, etc. This will handle all of the rules, which will be difficult to get correct with a Regex.
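A minimal sketch of that idea, wrapping Path.GetFullPath in a try/catch. PathValidator and TryGetFullPath are hypothetical names. Exactly which inputs are rejected is an assumption you should test on your own runtime, since the checks are not the same on .NET Framework and on newer .NET versions.

```csharp
using System;
using System.IO;
using System.Security;

public static class PathValidator
{
    // Returns true and the normalized absolute path when GetFullPath accepts
    // the input, or false when it throws one of the exceptions it documents.
    public static bool TryGetFullPath(string path, out string fullPath)
    {
        try
        {
            fullPath = Path.GetFullPath(path);
            return true;
        }
        catch (Exception ex) when (ex is ArgumentException        // null, empty or invalid characters
                                   || ex is NotSupportedException  // unsupported path format
                                   || ex is PathTooLongException
                                   || ex is SecurityException)
        {
            fullPath = null;
            return false;
        }
    }
}
```

This only checks that the string can be turned into a full path; it says nothing about whether the file or folder exists.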
This is regular expression for Windows paths:
(^([a-z]|[A-Z]):(?=\\(?![\0-\37<>:"/\\|?*])|\/(?![\0-\37<>:"/\\|?*])|$)|^\\(?=[\\\/][^\0-\37<>:"/\\|?*]+)|^(?=(\\|\/)$)|^\.(?=(\\|\/)$)|^\.\.(?=(\\|\/)$)|^(?=(\\|\/)[^\0-\37<>:"/\\|?*]+)|^\.(?=(\\|\/)[^\0-\37<>:"/\\|?*]+)|^\.\.(?=(\\|\/)[^\0-\37<>:"/\\|?*]+))((\\|\/)[^\0-\37<>:"/\\|?*]+|(\\|\/)$)*()$
And this is for UNIX/Linux paths
^\/$|(^(?=\/)|^\.|^\.\.)(\/(?=[^/\0])[^/\0]+)*\/?$
Here are my tests:
Win Regex
Unix Regex
These work with JavaScript.
EDIT
I've added relative paths, (../, ./, ../something)
EDIT 2
I've added paths starting with tilde for unix, (~/, ~, ~/something)
The proposed one is not really good. This one, which I built for XSD, is Windows-specific:
^(?:[a-zA-Z]\:(\\|\/)|file\:\/\/|\\\\|\.(\/|\\))([^\\\/\:\*\?\<\>\"\|]+(\\|\/){0,1})+$
Try this one for Windows and Linux support: ((?:[a-zA-Z]\:){0,1}(?:[\\/][\w.]+){1,})
I use this regex for capturing valid file/folder paths in windows (including UNCs and %variables%), with the exclusion of root paths like "C:\" or "\\serverName"
^(([a-zA-Z]:|\\\\\w[ \w\.]*)(\\\w[ \w\.]*|\\%[ \w\.]+%+)+|%[ \w\.]+%(\\\w[ \w\.]*|\\%[ \w\.]+%+)*)
this regex does not match leading spaces in path elements, so
"C:\program files" is matched
"C:\ pathWithLeadingSpace" is not matched
variables are allowed at any level
"%program files%" is matched
"C:\my path with inner spaces\%my var with inner spaces%" is matched
regex CmdPrompt("^([A-Z]:[^\<\>\:\"\|\?\*]+)");
Basically we look for everything that's not in the list of forbidden Windows Path Characters:
< (less than)
> (greater than)
: (colon)
" (double quote)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
I know this is really old... but expanding on @agent-j's response I've added named groups, and support for period characters.
^(?<ParentPath>(?:[a-zA-Z]\:|\\\\[\w\s\.]+\\[\w\s\.$]+)\\(?:[\w\s\.]+\\)*)(?<BaseName>[\w\s\.]*?)$
I've saved this at Regexr
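For reference, this is roughly how the named groups can be read from C#. WindowsPathSplitter and TrySplit are hypothetical names for this sketch, and the pattern is the one given above, unchanged.

```csharp
using System.Text.RegularExpressions;

public static class WindowsPathSplitter
{
    private static readonly Regex Pattern = new Regex(
        @"^(?<ParentPath>(?:[a-zA-Z]\:|\\\\[\w\s\.]+\\[\w\s\.$]+)\\(?:[\w\s\.]+\\)*)(?<BaseName>[\w\s\.]*?)$");

    // Splits a Windows path into its parent path (with trailing backslash)
    // and its base name; returns false when the pattern does not match.
    public static bool TrySplit(string path, out string parentPath, out string baseName)
    {
        Match match = Pattern.Match(path);
        parentPath = match.Success ? match.Groups["ParentPath"].Value : null;
        baseName = match.Success ? match.Groups["BaseName"].Value : null;
        return match.Success;
    }
}
```

For C:\Program Files\app\tool.exe this gives C:\Program Files\app\ and tool.exe. The character classes only allow word characters, whitespace and periods, so a path such as C:\Program Files (x86)\tool.exe is not matched because of the parentheses.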
I found most of the answers here to be a little hit or miss.
Found a good solution here though:
https://social.msdn.microsoft.com/forums/vstudio/en-US/31d2bc84-c948-4914-8a9d-97b9e788b341/validate-a-network-folder-path
Note* - this is only for network shares - not local files
Answer:
string pattern = @"^\\{2}[\w-]+(\\{1}(([\w-][\w-\s]*[\w-]+[$$]?)|([\w-][$$]?$)))+";
string[] names = { @"\\my-network\somelocation", @"\\my-network\\somelocation",
    @"\\\my-network\somelocation", @"my-network\somelocation",
    @"\\my-network\\somelocation",@"\\my-network\somelocation\aa\dd",
    @"\\my-network\somelocation\",@"\\my-network\\somelocation"};
foreach (string name in names)
{
    if (Regex.IsMatch(name, pattern))
    {
        Console.WriteLine(name);
        // Use Directory.Exists here to check that the folder actually exists
    }
}
Alexander's Answer + Relative Paths
Alexander has the most correct answer thus far since it supports spaces in file names (i.e. C:\Program Files (x86)\ will match)... This aims to include relative paths as well.
For example, you can do cd / or cd \ and it does the same thing.
Furthermore, if you're currently in C:\some\path\to\some\place and you type either of those commands, you end up at C:\
So you should also treat paths that start with '/' as rooted at the current drive.
(?:[a-zA-Z]:(\\|/)|file://|\\\\|\.(/|\\)|/)([^,\\/:*\?\<>\"\|]+(\\|/){0,1})
This is a modified version of Alexander's answer that also accepts relative paths with no leading / or drive letter, as well as paths that start with / but have no drive letter (relative to the current drive as root).
