I have file names that look like
Directory\name-secondName-blabla.txt
If I use string.Split, my code needs to know which separator I am using,
but if I replace the separator some day, my code will break.
Is there any built-in way to split the name and get the following result?
Directory
name
secondName
blabla
txt
Thanks
Edit: My question is more general than just splitting a file name; it is about splitting strings in general.
The best way to split a filename is to use System.IO.Path.
You're not clear about what to do with directory1\directory2\,
but in general you should use this static class to find the path, name and suffix parts.
After that you will need String.Split() to handle the - separators; you'll just have to make the separator(s) a config setting.
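As a minimal sketch of that approach: Path takes care of the directory and extension, and String.Split handles the name. The class name FileNameSplitter and the NameSeparators field are hypothetical, and the field stands in for a value that would be loaded from a config setting.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileNameSplitter
{
    // Hypothetical setting: a real application would load this from configuration.
    public static readonly char[] NameSeparators = { '-' };

    // Returns the directory, the separator-delimited name parts, and the extension.
    public static List<string> SplitFileName(string path)
    {
        var parts = new List<string>();

        // Directory part, if the path has one.
        string directory = Path.GetDirectoryName(path);
        if (!string.IsNullOrEmpty(directory))
            parts.Add(directory);

        // File name without extension, split on the configured separators.
        string name = Path.GetFileNameWithoutExtension(path);
        parts.AddRange(name.Split(NameSeparators, StringSplitOptions.RemoveEmptyEntries));

        // GetExtension includes the leading dot, so strip it.
        string extension = Path.GetExtension(path);
        if (!string.IsNullOrEmpty(extension))
            parts.Add(extension.TrimStart('.'));

        return parts;
    }
}
```

Calling FileNameSplitter.SplitFileName(Path.Combine("Directory", "name-secondName-blabla.txt")) should yield Directory, name, secondName, blabla, txt. Note that Path only treats \ as a directory separator on Windows, which is why the usage builds the path with Path.Combine.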
You can make an array of separators:
string value = @"Directory\name-secondName-blabla.txt";
char[] delimiters = new char[] { '\\', '-', '.' };
string[] parts = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
var filepath = @"Directory\name-secondName-blabla.txt";
var tokens = filepath.Split(new[]{'\\', '-'});
If you're worried about your separator token changing in the future, set it as a constant in a settings file so you only have to change it in one place. Or, if you think it is going to change regularly, put it in a config file so you don't have to release new builds every time.
As Henk suggested above, use System.IO.Path and its static methods like GetFileNameWithoutExtension, GetDirectoryName, etc. Have a look at this link:
http://msdn.microsoft.com/en-us/library/system.io.path.aspx
Related
The folder names are variable but I have this constant value in the directory - the "distributions" folder.
How can I extract all the strings before the "distributions" folder?
> /<root>/win/<usr>/distributions/<dbms>/<repository>/<port
> type>/<remote system>/<port>
Currently I'm doing it in lengthy way (e.g. getting the length of the whole directory, finding the location of distributions word in the string, etc...).
I'm looking for a more elegant way. Could this be done using Regex, or a shorter version of my current implementation?
string.Split followed by TakeWhile can help you
var resultArray = str.Split(new []{@"/"},StringSplitOptions.RemoveEmptyEntries)
.TakeWhile(x=>!x.Equals("distributions"));
Output
<root>
win
<usr>
Update based on comments
If you need the entire path before "distributions", you can use
var result = str.Split(new []{@"distributions"},StringSplitOptions.RemoveEmptyEntries)
.First();
Output
/<root>/win/<usr>/
string.Split('/') will put each component of the path (or any string) into an array, splitting on the delimiter (/ here). You could then loop through it.
Assuming you want the path up to that point, I would recommend a regex. Here is how I would do it:
Regex regex = new Regex(@".+?(?=distributions)");
Debug.WriteLine(regex.Match("/<root>/win/<usr>/distributions/<dbms>/<repository>").Value);
This outputs
/<root>/win/<usr>/
What is the problem with the good old way?
var s = "/<root>/win/<usr>/distributions/<dbms>/<repository>/<port.....";
var result = s.Substring(0, s.IndexOf("distributions"));
or s.Substring(0, s.IndexOf("/distributions/") + 1) if that text might appear in other forms too...
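One caveat with the Substring/IndexOf approach: IndexOf returns -1 when the marker is missing, and Substring then throws. A small guarded sketch follows; the helper name PathPrefix.Before is hypothetical, and returning the input unchanged when the marker is absent is an assumption about the desired behaviour.

```csharp
using System;

public static class PathPrefix
{
    // Returns everything before the first occurrence of marker,
    // or the whole input when the marker does not occur (assumed behaviour).
    public static string Before(string path, string marker)
    {
        int index = path.IndexOf(marker, StringComparison.Ordinal);
        return index < 0 ? path : path.Substring(0, index);
    }
}
```

For example, PathPrefix.Before("/<root>/win/<usr>/distributions/<dbms>", "distributions") gives "/<root>/win/<usr>/".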
From path "//source/project/file.cs#232", I need to match file.cs
Match myMatch = Regex.Match(path, @"(\w+\.\w+)[^/]*$");
This would give file.cs in Groups[1].
But for paths with dots in the file name, this doesn't work.
path "//source/project/file.initial.config.cs#232"
How could I modify this to work to give file.initial.config.cs?
Try this regex -- also into group 1, and assuming the extension can only be letters, numbers or the underscore:
.*/((?:.*?\.)+\w+)
This could be made more robust, if necessary, with knowledge of the allowable characters and suffixes for file naming, as well as details about the text in which (if) this file name is embedded. For example, if spaces were not allowed as part of the name
.*/((?:\S*?\.)+\w+)
or if ONLY letters, digits or the underscore are allowed:
.*/((?:\w*?\.)+\w+)
If we could be assured that there will be no dots or spaces after the last dot in the sequence, and spaces not allowed in the filename, it could be shortened further to:
.*/(\S*\.\w+)
to pick up everything between the last "/" and the last "." as well as any word characters after the last "."
etc
A number of non-'/' before '#':
/([^/]+)#
This should allow you to do what you want, or at least give you a better idea of how to achieve it:
/(\w+)(?:\..*)(\w{2,3})\#
• example: http://regex101.com/r/wQ9jG2
Can you not simply modify your regex from (\w+\.\w+)[^/]*$ to (\w+(\.\w+)+)[^/]*$, to allow multiple occurrences of .words?
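To illustrate the modified pattern, here is a short sketch that wraps it in a helper. The class and method names are hypothetical, and returning null when nothing matches is an assumption.

```csharp
using System.Text.RegularExpressions;

public static class FileNameMatcher
{
    // \w+ followed by one or more ".word" groups, then any trailing non-slash text.
    private static readonly Regex Pattern = new Regex(@"(\w+(\.\w+)+)[^/]*$");

    // Returns the captured file name, or null if the path does not match.
    public static string Extract(string path)
    {
        Match match = Pattern.Match(path);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

With this, FileNameMatcher.Extract("//source/project/file.initial.config.cs#232") returns "file.initial.config.cs", and the original single-dot case "//source/project/file.cs#232" still returns "file.cs".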
Why use regex when you can do it in C#?
I've created a function for you:
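public static class FileNameHelper
{
    // Returns the file name at the end of a '/'-separated path,
    // cutting off anything that follows the given extension.
    public static string GetFileNameFromPath(string path, string extWithoutdot = "cs")
    {
        // Skip everything up to and including the last '/'.
        var startIndex = path.LastIndexOf('/') + 1;
        var stringg = path.Substring(startIndex);
        // Index just past the last ".ext" occurrence.
        var remIndex = stringg.LastIndexOf("." + extWithoutdot) + extWithoutdot.Length + 1;
        return stringg.Remove(remIndex);
    }
}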
How to use it?
string filename = FileNameHelper.GetFileNameFromPath("//source/project/file.initial.config.cs#232", "cs");
Remember to pass the extension without the leading dot.
This has several advantages over regex:
It's not regex!
It's fast and efficient.
It's readable and pure C#.
Note: Don't use regex in C# for trivial things. It's definitely a blow to performance. First think of ways to achieve it in plain C#. Regex should be a last resort. Of course, if performance doesn't matter, use whatever!
By the way, mark it as answer if it helps. I know it'll help :)
If you're not averse to avoiding regular expressions, you could do this with just a small bit of string manipulation:
string mypath = "//source/project/file.initial.config.cs#232";
string filename = GetFileName(mypath);
static string GetFileName(string path)
{
var pathPieces = path.Split('/').Last().Split('#');
var filename = pathPieces.Take(pathPieces.Length - 1);
return String.Join("#", filename);
}
Easier, and works with any arbitrary filename (even those with spaces or # characters).
EDIT: Now works with filenames with # characters in them, although those are highly discouraged in Perforce.
(?<=/)[^/]+(?=#)
Using lookaround, it matches only the filename.
I need to trim a substring from a string, if that substring exists.
Specifically, if the string is "MainGUI.exe", then I need it to become "MainGUI", by trimming ".exe" from the string.
I tried this:
String line = "MainGUI.exe";
char[] exe = {'e', 'x', 'e', '.'};
line.TrimEnd(exe);
This gives me the correct answer for "MainGUI.exe", but for something like "MainGUIe.exe" it doesn't work, giving me "MainGUI" instead of "MainGUIe".
I am using C#. Thanks for the help!
Use the Path static class in the System.IO namespace; it lets you strip extensions and directories from file names easily. You can also use it to get the extension, full path, etc. It's a very handy class and well worth looking into.
var filename = Path.GetFileNameWithoutExtension(line);
This gives you "MainGUI". It assumes, of course, that you want to trim any file extension, or that you know your file is always going to be a .exe file. If you only want to trim the extension off .exe files and leave it on others, you can test first, either with String.EndsWith() or with the Path.GetExtension() method.
I would use Path.GetFileNameWithoutExtension instead of string manipulation to handle this.
string line = "MainGUI.exe";
string fileWithoutExtension = Path.GetFileNameWithoutExtension(line);
If you only want to strip off the extension if it's .exe, you can check for that as well. The following will only strip off extensions of .exe, but leave all other extensions intact:
string ext = Path.GetExtension(line).ToLower();
string fileWithoutExtension = ext == ".exe"
? Path.GetFileNameWithoutExtension(line)
: line;
The Path class has a GetFileNameWithoutExtension.
If you are always trimming ".exe" you can trim the last 4 characters off regardless of the rest of the string.
line.Substring(0, line.Length - ".exe".Length);
string line = "MainGUI.exe";
if (line.EndsWith(".exe"))
line = line.Substring(0, line.Length - 4);
As no file extension has a dot (.) within it, you are safe to use this:
String line = "MainGUI.exe";
line = line.Substring(0, line.LastIndexOf('.'));
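The Substring-based answers above can be folded into one small helper that only trims when the suffix is actually present. This is a sketch; the name TrimSuffix is hypothetical, and the case-insensitive comparison is an assumption (so that ".EXE" is treated like ".exe").

```csharp
using System;

public static class SuffixTrimmer
{
    // Removes suffix from the end of value if present; otherwise returns value unchanged.
    public static string TrimSuffix(string value, string suffix)
    {
        return value.EndsWith(suffix, StringComparison.OrdinalIgnoreCase)
            ? value.Substring(0, value.Length - suffix.Length)
            : value;
    }
}
```

SuffixTrimmer.TrimSuffix("MainGUIe.exe", ".exe") returns "MainGUIe", and a string without the suffix, such as "readme.txt", comes back untouched instead of throwing.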
How can I split a path by "\\"? It gives me a syntax error if I use
path.split("\\");
You should be using
path.Split(Path.DirectorySeparatorChar);
if you're trying to split a file path based on the native path separator.
Try path.Split('\\') --- so single quote (for character)
To use a string this works:
path.Split(new[] {"\\"}, StringSplitOptions.None)
To use a string you have to specify an array of strings. I never did get why :)
There's no string.Split overload which takes a string. (Also, C# is case-sensitive, so you need Split rather than split). However, you can use:
string[] bits = path.Split('\\');
which will use the overload taking a params char[] parameter. It's equivalent to:
string[] bits = path.Split(new char[] { '\\' });
That's assuming you definitely want to split by backslashes. You may want to split by the directory separator for the operating system you're running on, in which case Path.DirectorySeparatorChar would probably be the right approach... it will be / on Unix and \ on Windows. On the other hand, that wouldn't help you if you were trying to parse a Windows file system path in an ASP.NET page running on Unix. In other words, it depends on your context :)
Another alternative is to use the methods on Path and DirectoryInfo to get information about paths in more file-system-sensitive ways.
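As a rough sketch of the DirectoryInfo alternative, the helper below walks up the Parent chain and collects each Name. The class and method names are hypothetical. DirectoryInfo resolves relative paths against the current directory and uses the separators of the operating system it runs on, so the exact segments (the root in particular) depend on the platform.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PathWalker
{
    // Returns the directory names from the root down to the given path.
    public static List<string> GetSegments(string path)
    {
        var segments = new List<string>();

        // Parent is null once the root has been reached.
        for (var dir = new DirectoryInfo(path); dir != null; dir = dir.Parent)
            segments.Add(dir.Name);

        // Segments were collected leaf-first, so reverse them.
        segments.Reverse();
        return segments;
    }
}
```

The directory does not need to exist for this to work, since only the path string is inspected.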
To be on the safe side, you could use:
path.Split(new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar });
On Windows, forward slashes are also accepted, both by the C# Path functions and on the command line, in Windows 7/XP at least.
e.g.:
Both of these produce the same results for me:
dir "C:/Python33/Lib/xml"
dir "C:\Python33\Lib\xml"
(In C:)
dir "Python33/Lib/xml"
dir "Python33\Lib\xml"
On Windows, neither '/' nor '\' is a valid character in a filename. On Linux, '\' is allowed in filenames, so you should be aware of this if parsing for both.
So if you wanted to support paths in both forms (like I do) you could do:
path.Split(new char[] {'/', '\\'});
On Linux it would probably be safer to use Path.DirectorySeparatorChar.
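A small sketch of the both-separators approach, for paths that mix / and \. The class name is hypothetical, and dropping empty entries is an assumption. As noted above, this would mis-split a Linux file name that legitimately contains a backslash.

```csharp
using System;

public static class MixedSeparatorSplitter
{
    // Accept both forward and back slashes as separators.
    private static readonly char[] Separators = { '/', '\\' };

    public static string[] Split(string path)
    {
        return path.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

MixedSeparatorSplitter.Split(@"C:\Python33/Lib\xml") yields C:, Python33, Lib, xml.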
path.Split(new char[] { '\\' });
Better just use the existing class System.IO.Path, so you don't need to care for any system specifications.
It provides methods to access any part of a file path like GetFileName(string path) etc.
A complete solution could look like this:
//
private static readonly char[] pathSeps = new char[] {
Path.DirectorySeparatorChar,
Path.AltDirectorySeparatorChar,
Path.VolumeSeparatorChar,
};
//
///<summary>Split a path according to the file system rules.</summary>
public static string[] SplitPath( string path ) {
if ( null == path ) return null;
return path.Split( pathSeps, StringSplitOptions.RemoveEmptyEntries );
}
Some of the other proposed solutions in this article use the syntax:
path.Split(new char[] {'/', '\\'});
Although this will work, it has various disadvantages:
It does not allow your application to adapt to various target platforms. Currently, our applications basically run on UNIX-like and Windows OSs (Windows, macOS, iOS, Linux variations), so there is a fixed set of path characters. But this might change if .NET is ported to other operating systems, so it is best to use the predefined constants.
Performance of the inline syntax is worse. This might not be of interest for a handful of files, but when working with millions of files there are noticeable differences. The managed memory will go up until next GC. When looking at the generated assembly code you will find "call CORINFO_HELP_NEWARR_1_VC" for each of the 'new' statements, even in Release mode. This happens whenever you new-up any array, because arrays are not immutable. My proposed solution prevents this by declaring the array as readonly and static.
Reusability of the inline syntax also is worse, because you might want to use the path separators array in other contexts.
StringSplitOptions.RemoveEmptyEntries should be used to account for UNC paths and possible typos within the incoming path. The operating systems do not allow duplicate path separators, but there might be a typo from the user or a duplicate concatenation of path separator characters, for example when concatenating the path and filename.
I'm working on a program that will parse chunks of data from a CSV file and populate them into the attributes of an XML document. The data entry I'm working with looks like this: e11*70/157*1999/101*1090*04. I want to break that up, using the asterisks as delimiters, into e11, 70/157, 1999/101, etc., so I can insert those values into the attributes of the XML. Would this be a situation appropriate for RegEx? Or would I be better off using Substring, with an index of *?
Thanks so much for the assistance. I'm new to the programming world, and have found sites such as these to be an extremely valuable resource.
You can use String.Split()
string[] words = @"e11*70/157*1999/101*1090*04".Split('*');
I think this should solve your problem:
string content = @"e11*70/157*1999/101*1090*04";
string[] split = content.Split('*');
You could use the Split method to create a string array like so:
string txt = "e11*70/157*1999/101*1090*04";
foreach (string s in txt.Split('*')){
DoSomething(s);
}
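Since the question is about putting the pieces into XML attributes, here is a minimal sketch of that last step using System.Xml.Linq. The element name "entry" and the attribute names field0, field1, ... are placeholders, not taken from the question; the real names would come from the target XML schema.

```csharp
using System.Xml.Linq;

public static class EntryConverter
{
    // Splits the entry on '*' and stores each piece as an attribute.
    // "entry" and "fieldN" are placeholder names for illustration only.
    public static XElement ToElement(string entry)
    {
        string[] fields = entry.Split('*');
        var element = new XElement("entry");

        for (int i = 0; i < fields.Length; i++)
            element.SetAttributeValue("field" + i, fields[i]);

        return element;
    }
}
```

For "e11*70/157*1999/101*1090*04" this produces one element with five attributes, where field0 is "e11" and field1 is "70/157".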