C# Sanitize File Name

C# Sanitize File Name - c#

I recently have been moving a bunch of MP3s from various locations into a repository. I had been constructing the new file names using the ID3 tags (thanks, TagLib-Sharp!), and I noticed that I was getting a System.NotSupportedException:
"The given path's format is not supported."
This was generated by either File.Copy() or Directory.CreateDirectory().
It didn't take long to realize that my file names needed to be sanitized. So I did the obvious thing:
public static string SanitizePath_(string path, char replaceChar)
{
string dir = Path.GetDirectoryName(path);
foreach (char c in Path.GetInvalidPathChars())
dir = dir.Replace(c, replaceChar);
string name = Path.GetFileName(path);
foreach (char c in Path.GetInvalidFileNameChars())
name = name.Replace(c, replaceChar);
return dir + name;
}
To my surprise, I continued to get exceptions. It turned out that ':' is not in the set of Path.GetInvalidPathChars(), because it is valid in a path root. I suppose that makes sense - but this has to be a pretty common problem. Does anyone have some short code that sanitizes a path? The most thorough I've come up with this, but it feels like it is probably overkill.
// replaces invalid characters with replaceChar
public static string SanitizePath(string path, char replaceChar)
{
// construct a list of characters that can't show up in filenames.
// need to do this because ":" is not in InvalidPathChars
if (_BadChars == null)
{
_BadChars = new List<char>(Path.GetInvalidFileNameChars());
_BadChars.AddRange(Path.GetInvalidPathChars());
_BadChars = Utility.GetUnique<char>(_BadChars);
}
// remove root
string root = Path.GetPathRoot(path);
path = path.Remove(0, root.Length);
// split on the directory separator character. Need to do this
// because the separator is not valid in a filename.
List<string> parts = new List<string>(path.Split(new char[]{Path.DirectorySeparatorChar}));
// check each part to make sure it is valid.
for (int i = 0; i < parts.Count; i++)
{
string part = parts[i];
foreach (char c in _BadChars)
{
part = part.Replace(c, replaceChar);
}
parts[i] = part;
}
return root + Utility.Join(parts, Path.DirectorySeparatorChar.ToString());
}
Any improvements to make this function faster and less baroque would be much appreciated.

To clean up a file name you could do this
private static string MakeValidFileName( string name )
{
string invalidChars = System.Text.RegularExpressions.Regex.Escape( new string( System.IO.Path.GetInvalidFileNameChars() ) );
string invalidRegStr = string.Format( #"([{0}]*\.+$)|([{0}]+)", invalidChars );
return System.Text.RegularExpressions.Regex.Replace( name, invalidRegStr, "_" );
}

A shorter solution:
var invalids = System.IO.Path.GetInvalidFileNameChars();
var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries) ).TrimEnd('.');

Based on Andre's excellent answer but taking into account Spud's comment on reserved words, I made this version:
/// <summary>
/// Strip illegal chars and reserved words from a candidate filename (should not include the directory path)
/// </summary>
/// <remarks>
/// http://stackoverflow.com/questions/309485/c-sharp-sanitize-file-name
/// </remarks>
public static string CoerceValidFileName(string filename)
{
var invalidChars = Regex.Escape(new string(Path.GetInvalidFileNameChars()));
var invalidReStr = string.Format(#"[{0}]+", invalidChars);
var reservedWords = new []
{
"CON", "PRN", "AUX", "CLOCK$", "NUL", "COM0", "COM1", "COM2", "COM3", "COM4",
"COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
"LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
};
var sanitisedNamePart = Regex.Replace(filename, invalidReStr, "_");
foreach (var reservedWord in reservedWords)
{
var reservedWordPattern = string.Format("^{0}\\.", reservedWord);
sanitisedNamePart = Regex.Replace(sanitisedNamePart, reservedWordPattern, "_reservedWord_.", RegexOptions.IgnoreCase);
}
return sanitisedNamePart;
}
And these are my unit tests
[Test]
public void CoerceValidFileName_SimpleValid()
{
var filename = #"thisIsValid.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual(filename, result);
}
[Test]
public void CoerceValidFileName_SimpleInvalid()
{
var filename = #"thisIsNotValid\3\\_3.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("thisIsNotValid_3__3.txt", result);
}
[Test]
public void CoerceValidFileName_InvalidExtension()
{
var filename = #"thisIsNotValid.t\xt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("thisIsNotValid.t_xt", result);
}
[Test]
public void CoerceValidFileName_KeywordInvalid()
{
var filename = "aUx.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("_reservedWord_.txt", result);
}
[Test]
public void CoerceValidFileName_KeywordValid()
{
var filename = "auxillary.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("auxillary.txt", result);
}

string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars()));

there are a lot of working solutions here. just for the sake of completeness, here's an approach that doesn't use regex, but uses LINQ:
var invalids = Path.GetInvalidFileNameChars();
filename = invalids.Aggregate(filename, (current, c) => current.Replace(c, '_'));
Also, it's a very short solution ;)

I'm using the System.IO.Path.GetInvalidFileNameChars() method to check invalid characters and I've got no problems.
I'm using the following code:
foreach( char invalidchar in System.IO.Path.GetInvalidFileNameChars())
{
filename = filename.Replace(invalidchar, '_');
}

I wanted to retain the characters in some way, not just simply replace the character with an underscore.
One way I thought was to replace the characters with similar looking characters which are (in my situation), unlikely to be used as regular characters. So I took the list of invalid characters and found look-a-likes.
The following are functions to encode and decode with the look-a-likes.
This code does not include a complete listing for all System.IO.Path.GetInvalidFileNameChars() characters. So it is up to you to extend or utilize the underscore replacement for any remaining characters.
private static Dictionary<string, string> EncodeMapping()
{
//-- Following characters are invalid for windows file and folder names.
//-- \/:*?"<>|
Dictionary<string, string> dic = new Dictionary<string, string>();
dic.Add(#"\", "Ì"); // U+OOCC
dic.Add("/", "Í"); // U+OOCD
dic.Add(":", "¦"); // U+00A6
dic.Add("*", "¤"); // U+00A4
dic.Add("?", "¿"); // U+00BF
dic.Add(#"""", "ˮ"); // U+02EE
dic.Add("<", "«"); // U+00AB
dic.Add(">", "»"); // U+00BB
dic.Add("|", "│"); // U+2502
return dic;
}
public static string Escape(string name)
{
foreach (KeyValuePair<string, string> replace in EncodeMapping())
{
name = name.Replace(replace.Key, replace.Value);
}
//-- handle dot at the end
if (name.EndsWith(".")) name = name.CropRight(1) + "°";
return name;
}
public static string UnEscape(string name)
{
foreach (KeyValuePair<string, string> replace in EncodeMapping())
{
name = name.Replace(replace.Value, replace.Key);
}
//-- handle dot at the end
if (name.EndsWith("°")) name = name.CropRight(1) + ".";
return name;
}
You can select your own look-a-likes. I used the Character Map app in windows to select mine %windir%\system32\charmap.exe
As I make adjustments through discovery, I will update this code.

I think the problem is that you first call Path.GetDirectoryName on the bad string. If this has non-filename characters in it, .Net can't tell which parts of the string are directories and throws. You have to do string comparisons.
Assuming it's only the filename that is bad, not the entire path, try this:
public static string SanitizePath(string path, char replaceChar)
{
int filenamePos = path.LastIndexOf(Path.DirectorySeparatorChar) + 1;
var sb = new System.Text.StringBuilder();
sb.Append(path.Substring(0, filenamePos));
for (int i = filenamePos; i < path.Length; i++)
{
char filenameChar = path[i];
foreach (char c in Path.GetInvalidFileNameChars())
if (filenameChar.Equals(c))
{
filenameChar = replaceChar;
break;
}
sb.Append(filenameChar);
}
return sb.ToString();
}

I have had success with this in the past.
Nice, short and static :-)
public static string returnSafeString(string s)
{
foreach (char character in Path.GetInvalidFileNameChars())
{
s = s.Replace(character.ToString(),string.Empty);
}
foreach (char character in Path.GetInvalidPathChars())
{
s = s.Replace(character.ToString(), string.Empty);
}
return (s);
}

Here's an efficient lazy loading extension method based on Andre's code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace LT
{
public static class Utility
{
static string invalidRegStr;
public static string MakeValidFileName(this string name)
{
if (invalidRegStr == null)
{
var invalidChars = System.Text.RegularExpressions.Regex.Escape(new string(System.IO.Path.GetInvalidFileNameChars()));
invalidRegStr = string.Format(#"([{0}]*\.+$)|([{0}]+)", invalidChars);
}
return System.Text.RegularExpressions.Regex.Replace(name, invalidRegStr, "_");
}
}
}

Your code would be cleaner if you appended the directory and filename together and sanitized that rather than sanitizing them independently. As for sanitizing away the :, just take the 2nd character in the string. If it is equal to "replacechar", replace it with a colon. Since this app is for your own use, such a solution should be perfectly sufficient.

using System;
using System.IO;
using System.Linq;
using System.Text;
public class Program
{
public static void Main()
{
try
{
var badString = "ABC\\DEF/GHI<JKL>MNO:PQR\"STU\tVWX|YZA*BCD?EFG";
Console.WriteLine(badString);
Console.WriteLine(SanitizeFileName(badString, '.'));
Console.WriteLine(SanitizeFileName(badString));
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
private static string SanitizeFileName(string fileName, char? replacement = null)
{
if (fileName == null) { return null; }
if (fileName.Length == 0) { return ""; }
var sb = new StringBuilder();
var badChars = Path.GetInvalidFileNameChars().ToList();
foreach (var #char in fileName)
{
if (badChars.Contains(#char))
{
if (replacement.HasValue)
{
sb.Append(replacement.Value);
}
continue;
}
sb.Append(#char);
}
return sb.ToString();
}
}

Based #fiat's and #Andre's approach, I'd like to share my solution too.
Main difference:
its an extension method
regex is compiled at first use to save some time with a lot executions
reserved words are preserved
public static class StringPathExtensions
{
private static Regex _invalidPathPartsRegex;
static StringPathExtensions()
{
var invalidReg = System.Text.RegularExpressions.Regex.Escape(new string(Path.GetInvalidFileNameChars()));
_invalidPathPartsRegex = new Regex($"(?<reserved>^(CON|PRN|AUX|CLOCK\\$|NUL|COM0|COM1|COM2|COM3|COM4|COM5|COM6|COM7|COM8|COM9|LPT0|LPT1|LPT2|LPT3|LPT4|LPT5|LPT6|LPT7|LPT8|LPT9))|(?<invalid>[{invalidReg}:]+|\\.$)", RegexOptions.Compiled);
}
public static string SanitizeFileName(this string path)
{
return _invalidPathPartsRegex.Replace(path, m =>
{
if (!string.IsNullOrWhiteSpace(m.Groups["reserved"].Value))
return string.Concat("_", m.Groups["reserved"].Value);
return "_";
});
}
}

Related

Split special string in c#

I want to split the below string with given output.
Can anybody help me to do this.
Examples:
/TEST/TEST123
Output: /Test/
/TEST1/Test/Test/Test/
Output: /Test1/
/Text/12121/1212/
Output: /Text/
/121212121/asdfasdf/
Output: /121212121/
12345
Output: 12345
I have tried string.split function but it is not worked well. Is there any idea or logic that i can implement to achieve this situation.
If the answer in regular expression that would be fine for me.

You simply want the first result of Spiting by /
string output = input.Split('/')[0];
But in case that you have //TEST/ and output should be /TEST you can use regex.
string output = Regex.Matches(input, #"\/?(.+?)\/")[0].Groups[1].Value;
For your 5th case : you have to separate the logic. for example:
public static string Method(string input)
{
var split = input.Split(new[] {'/'}, StringSplitOptions.RemoveEmptyEntries);
if (split.Length == 0) return input;
return split[0];
}
Or using regex.
public static string Method(string input)
{
var matches = Regex.Matches(input, #"\/?(.+?)\/");
if (matches.Count == 0) return input;
return matches[0].Groups[1].Value;
}
Some results using method:
TEST/54/ => TEST
TEST => TEST
/TEST/ => TEST

I think this would work:
string s1 = "/TEST/TEST123";
string s2 = "/TEST1/Test/Test/Test/";
string s3 = "/Text/12121/1212/";
string s4 = "/121212121/asdfasdf/";
string s5 = "12345";
string pattern = #"\/?[a-zA-Z0-9]+\/?";
Console.WriteLine(Regex.Matches(s1, pattern)[0]);
Console.WriteLine(Regex.Matches(s2, pattern)[0]);
Console.WriteLine(Regex.Matches(s3, pattern)[0]);
Console.WriteLine(Regex.Matches(s4, pattern)[0]);
Console.WriteLine(Regex.Matches(s5, pattern)[0]);

class Program
{
static void Main(string[] args)
{
string example = "/TEST/TEST123";
var result = GetFirstItem(example);
Console.WriteLine("First in the list : {0}", result);
}
static string GetFirstItem(string value)
{
var collection = value?.Split(new char[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
var result = collection[0];
return result;
}
}
StringSplitOptions.RemoveEmptyEntries is an enum which tells the Split function that when it has split the string into an array, if there are elements in the array that are empty strings, the function should not include the empty elements in the results. Basically you want the collection to contain only values.

public string functionName(string input)
{
if(input.Contains('/'))
{
string SplitStr = input.Split('/')[1];
return "/"+SplitStr .Substring(0, 1) +SplitStr.Substring(1).ToLower()+"/"
}
return input;
}

output = (output.Contains("/"))? '/' +input.Split('/')[1]+'/':input;

private void button1_Click(object sender, EventArgs e)
{
string test = #"/Text/12121/1212/";
int first = test.IndexOf("/");
int last = test.Substring(first+1).IndexOf("/");
string finall = test.Substring(first, last+2);
}
i try this code with all your examples and get correct output. try this.

The following method may help you.
public string getValue(string st)
{
if (st.IndexOf('/') == -1)
return st;
return "/" + st.Split('/')[1] + "/";
}

C#: Get first directory name of a relative path

How to get the first directory name in a relative path, given that they can be different accepted directory separators?
For example:
foo\bar\abc.txt -> foo
bar/foo/foobar -> bar

Works with both forward and back slash
static string GetRootFolder(string path)
{
while (true)
{
string temp = Path.GetDirectoryName(path);
if (String.IsNullOrEmpty(temp))
break;
path = temp;
}
return path;
}

Seems like you could just use the string.Split() method on the string, then grab the first element.
example (untested):
string str = "foo\bar\abc.txt";
string str2 = "bar/foo/foobar";
string[] items = str.split(new char[] {'/', '\'}, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(items[0]); // prints "foo"
items = str2.split(new char[] {'/', '\'}, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(items[0]); // prints "bar"

The most robust solution would be to use DirectoryInfo and FileInfo. On a Windows NT-based system it should accept either forward or backslashes for separators.
using System;
using System.IO;
internal class Program
{
private static void Main(string[] args)
{
Console.WriteLine(GetTopRelativeFolderName(#"foo\bar\abc.txt")); // prints 'foo'
Console.WriteLine(GetTopRelativeFolderName("bar/foo/foobar")); // prints 'bar'
Console.WriteLine(GetTopRelativeFolderName("C:/full/rooted/path")); // ** throws
}
private static string GetTopRelativeFolderName(string relativePath)
{
if (Path.IsPathRooted(relativePath))
{
throw new ArgumentException("Path is not relative.", "relativePath");
}
FileInfo fileInfo = new FileInfo(relativePath);
DirectoryInfo workingDirectoryInfo = new DirectoryInfo(".");
string topRelativeFolderName = string.Empty;
DirectoryInfo current = fileInfo.Directory;
bool found = false;
while (!found)
{
if (current.FullName == workingDirectoryInfo.FullName)
{
found = true;
}
else
{
topRelativeFolderName = current.Name;
current = current.Parent;
}
}
return topRelativeFolderName;
}
}

Based on the answer provided by Hasan Khan ...
private static string GetRootFolder(string path)
{
var root = Path.GetPathRoot(path);
while (true)
{
var temp = Path.GetDirectoryName(path);
if (temp != null && temp.Equals(root))
break;
path = temp;
}
return path;
}
This will give the the top level folder

Based on the question you ask, the following should work:
public string GetTopLevelDir(string filePath)
{
string temp = Path.GetDirectoryName(filePath);
if(temp.Contains("\\"))
{
temp = temp.Substring(0, temp.IndexOf("\\"));
}
else if (temp.Contains("//"))
{
temp = temp.Substring(0, temp.IndexOf("\\"));
}
return temp;
}
When passed foo\bar\abc.txt it will foo as wanted- same for the / case

Here is another example in case your path if following format:
string path = "c:\foo\bar\abc.txt"; // or c:/foo/bar/abc.txt
string root = Path.GetPathRoot(path); // root == c:\

This should work fine
string str = "foo\bar\abc.txt";
string str2 = "bar/foo/foobar";
str.Replace("/", "\\").Split('\\').First(); // foo
str2.Replace("/", "\\").Split('\\').First(); // bar

Here my example, with no memory footprint (without creating new strings in memory):
var slashIndex = relativePath.IndexOf('/');
var backslashIndex = relativePath.IndexOf('\\');
var firstSlashIndex = (slashIndex > 0) ? (slashIndex < backslashIndex ? slashIndex : (backslashIndex == -1) ? slashIndex : backslashIndex) : backslashIndex;
var rootDirectory = relativePath.Substring(0, firstSlashIndex);

String Matching

I have a string
String mainString="///BUY/SELL///ORDERTIME///RT///QTY///BROKERAGE///NETRATE///AMOUNTRS///RATE///SCNM///";
Now I have another strings
String str1= "RT";
which should be matched only with RT which is substring of string mainString but not with ORDERTIME which is also substring of string mainString.
String str2= "RATE" ;
And RATE(str2) should be matched with RATE which is substring of string mainString but not with NETRATE which is also substring of string mainString.
How can we do that ?

Match against "///RT///" and "///RATE///".

This might give you some clues - no where near real code quality, and only a 5 minute job to come with this shoddy solution but does do what you need. it smells lots be warned.
using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
namespace test {
class Program {
static void Main(string[] args) {
String mainString="//BUY/SELL//ORDERTIME//RT//QTY//BROKERAGE//NETRATE//AMOUNTRS//RATE//SCNM//";
Hashtable ht = createHashTable(mainString);
if (hasValue("RA", ht)) {
Console.WriteLine("Matched RA");
} else {
Console.WriteLine("Didnt Find RA");
}
if (hasValue("RATE", ht)) {
Console.WriteLine("Matched RATE");
}
Console.Read();
}
public static Hashtable createHashTable(string strToSplit) {
Hashtable ht = new Hashtable();
int iCount = 0;
string[] words = strToSplit.Split(new Char[] { '/', '/', '/' });
foreach (string word in words) {
ht.Add(iCount++, word);
}
return ht;
}
public static bool hasValue(string strValuetoSearch, Hashtable ht) {
return ht.ContainsValue(strValuetoSearch);
}
}
}

as far as I understand your question you want to match a string between /// as delimiters.
if you look for str you just have to do
Regex.Match(mainString, "(^|///)" + str + "(///|$)");

I don't know it will work every time or not.But I have tried this and it works right now in this string matching. I want to know whether this is ok or not,please give me suggestion.
str1 = str1.Insert(0, "///");
str1=str1.Insert(str1.Length,"///");
bool Result = mainString.Contains(str1);

What about Linq to Object?
String mainString="///BUY/SELL///ORDERTIME///RT///QTY///BROKERAGE///NETRATE///AMOUNTRS///RATE///SCNM///";
String searchTerm = "RT";
String[] src = mainString.split('///');
var match = from word in src where
word.ToLowerInvariant() == searchTerm.ToLowerInvariant()
select word;
I don't have VS near me so I can't test it, but it should be something similar to this.

Searching Specific Data From a File

I have a File having text and few numbers.I just want to extract numbers from it.How do I go about it ???
I tried using all that split thing but no luck so far.
My File is like this:
AT+CMGL="ALL"
+CMGL: 5566,"REC READ","Ufone"
Dear customer, your DAY_BUCKET subscription will expire on 02/05/09
+CMGL: 5565,"REC READ","+923466666666"
KINDLY TELL ME THE WAY TO EXTRACT NUMBERS LIKE +923466666666 from this File so I can put them into another File or textbox.
Thanks

Here's an example using the String.Split. The "number" contains a '+', so really it should be treated as a string not a number. I'm presuming it's a telephone number with the '+' potentially used for international calls? If it is a telephone number, you need to be careful of dashes, spaces in the number as well as extension numbers added to the end eg "+9234 666-66666 ext 235" and so on...
Anyway - hopefully the example is useful in getting to grips with Split.
The code include unit tests using NUnit v2.4.8
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using NUnit.Framework;
using System.Text.RegularExpressions;
namespace SO.NumberExtractor.Test
{
public class NumberExtracter
{
public List<string> ExtractNumbers(string lines)
{
List<string> numbers = new List<string>();
string[] seperator = { System.Environment.NewLine };
string[] seperatedLines = lines.Split(seperator, StringSplitOptions.RemoveEmptyEntries);
foreach (string line in seperatedLines)
{
string s = ExtractNumber(line);
numbers.Add(s);
}
return numbers;
}
public string ExtractNumber(string line)
{
string s = line.Split(',').Last<string>().Trim('"');
return s;
}
public string ExtractNumberWithoutLinq(string line)
{
string[] fields = line.Split(',');
string s = fields[fields.Length - 1];
s = s.Trim('"');
return s;
}
}
[TestFixture]
public class NumberExtracterTest
{
private readonly string LINE1 = "AT+CMGL=\"ALL\" +CMGL: 5566,\"REC READ\",\"Ufone\" Dear customer, your DAY_BUCKET subscription will expire on 02/05/09 +CMGL: 5565,\"REC READ\",\"+923466666666\"";
private readonly string LINE2 = "AT+CMGL=\"ALL\" +CMGL: 5566,\"REC READ\",\"Ufone\" Dear customer, your DAY_BUCKET subscription will expire on 02/05/09 +CMGL: 5565,\"REC READ\",\"+923466666667\"";
private readonly string LINE3 = "AT+CMGL=\"ALL\" +CMGL: 5566,\"REC READ\",\"Ufone\" Dear customer, your DAY_BUCKET subscription will expire on 02/05/09 +CMGL: 5565,\"REC READ\",\"+923466666668\"";
[Test]
public void ExtractOneLineWithoutLinq()
{
string expected = "+923466666666";
NumberExtracter c = new NumberExtracter();
string result = c.ExtractNumberWithoutLinq(LINE1);
Assert.AreEqual(expected, result);
}
[Test]
public void ExtractOneLineUsingLinq()
{
string expected = "+923466666666";
NumberExtracter c = new NumberExtracter();
string result = c.ExtractNumber(LINE1);
Assert.AreEqual(expected, result);
}
[Test]
public void ExtractMultipleLines()
{
StringBuilder sb = new StringBuilder();
sb.AppendLine(LINE1);
sb.AppendLine(LINE2);
sb.AppendLine(LINE3);
NumberExtracter ne = new NumberExtracter();
List<string> extractedNumbers = ne.ExtractNumbers(sb.ToString());
string expectedFirst = "+923466666666";
string expectedSecond = "+923466666667";
string expectedThird = "+923466666668";
Assert.AreEqual(expectedFirst, extractedNumbers[0]);
Assert.AreEqual(expectedSecond, extractedNumbers[1]);
Assert.AreEqual(expectedThird, extractedNumbers[2]);
}
}
}

If the numbers are all at the end of the lines then you can use code like the following
foreach ( string line in File.ReadAllLines(#"c:\path\to\file.txt") ) {
Match result = Regex.Match(line, #"\+(\d+)""$");
if ( result.Success ) {
var number = result.Groups[1].Value;
// do what you want with the number
}
}

How large is the file? If the file is under a few megabytes in size I would recommend loading the file contents into a string and using a compiled regular expression to extract matches.
Here's a quick example:
Regex NumberExtractor = new Regex("[0-9]{7,16}",RegexOptions.Compiled);
/// <summary>
/// Extracts numbers between seven and sixteen digits long from the target file.
/// Example number to be extracted: +923466666666
/// </summary>
/// <param name="TargetFilePath"></param>
/// <returns>List of the matching numbers</returns>
private IEnumerable<ulong> ExtractLongNumbersFromFile(string TargetFilePath)
{
if (String.IsNullOrEmpty(TargetFilePath))
throw new ArgumentException("TargetFilePath is null or empty.", "TargetFilePath");
if (File.Exists(TargetFilePath) == false)
throw new Exception("Target file does not exist!");
FileStream TargetFileStream = null;
StreamReader TargetFileStreamReader = null;
string FileContents = "";
List<ulong> ReturnList = new List<ulong>();
try
{
TargetFileStream = new FileStream(TargetFilePath, FileMode.Open);
TargetFileStreamReader = new StreamReader(TargetFileStream);
FileContents = TargetFileStreamReader.ReadToEnd();
MatchCollection Matches = NumberExtractor.Matches(FileContents);
foreach (Match CurrentMatch in Matches) {
ReturnList.Add(System.Convert.ToUInt64(CurrentMatch.Value));
}
}
catch (Exception ex)
{
//Your logging, etc...
}
finally
{
if (TargetFileStream != null) {
TargetFileStream.Close();
TargetFileStream.Dispose();
}
if (TargetFileStreamReader != null)
{
TargetFileStreamReader.Dispose();
}
}
return (IEnumerable<ulong>)ReturnList;
}
Sample Usage:
List<ulong> Numbers = (List<ulong>)ExtractLongNumbersFromFile(#"v:\TestExtract.txt");

Path.Combine for URLs?

Path.Combine is handy, but is there a similar function in the .NET framework for URLs?
I'm looking for syntax like this:
Url.Combine("http://MyUrl.com/", "/Images/Image.jpg")
which would return:
"http://MyUrl.com/Images/Image.jpg"

Uri has a constructor that should do this for you: new Uri(Uri baseUri, string relativeUri)
Here's an example:
Uri baseUri = new Uri("http://www.contoso.com");
Uri myUri = new Uri(baseUri, "catalog/shownew.htm");
Note from editor: Beware, this method does not work as expected. It can cut part of baseUri in some cases. See comments and other answers.

This may be a suitably simple solution:
public static string Combine(string uri1, string uri2)
{
uri1 = uri1.TrimEnd('/');
uri2 = uri2.TrimStart('/');
return string.Format("{0}/{1}", uri1, uri2);
}

There's already some great answers here. Based on mdsharpe suggestion, here's an extension method that can easily be used when you want to deal with Uri instances:
using System;
using System.Linq;
public static class UriExtensions
{
public static Uri Append(this Uri uri, params string[] paths)
{
return new Uri(paths.Aggregate(uri.AbsoluteUri, (current, path) => string.Format("{0}/{1}", current.TrimEnd('/'), path.TrimStart('/'))));
}
}
And usage example:
var url = new Uri("http://example.com/subpath/").Append("/part1/", "part2").AbsoluteUri;
This will produce http://example.com/subpath/part1/part2
If you want to work with strings instead of Uris then the following will also produce the same result, simply adapt it to suit your needs:
public string JoinUriSegments(string uri, params string[] segments)
{
if (string.IsNullOrWhiteSpace(uri))
return null;
if (segments == null || segments.Length == 0)
return uri;
return segments.Aggregate(uri, (current, segment) => $"{current.TrimEnd('/')}/{segment.TrimStart('/')}");
}
var uri = JoinUriSegements("http://example.com/subpath/", "/part1/", "part2");

You use Uri.TryCreate( ... ) :
Uri result = null;
if (Uri.TryCreate(new Uri("http://msdn.microsoft.com/en-us/library/"), "/en-us/library/system.uri.trycreate.aspx", out result))
{
Console.WriteLine(result);
}
Will return:
http://msdn.microsoft.com/en-us/library/system.uri.trycreate.aspx

There is a Todd Menier's comment above that Flurl includes a Url.Combine.
More details:
Url.Combine is basically a Path.Combine for URLs, ensuring one
and only one separator character between parts:
var url = Url.Combine(
"http://MyUrl.com/",
"/too/", "/many/", "/slashes/",
"too", "few?",
"x=1", "y=2"
// result: "http://www.MyUrl.com/too/many/slashes/too/few?x=1&y=2"
Get Flurl.Http on NuGet:
PM> Install-Package Flurl.Http
Or get the stand-alone URL builder without the HTTP features:
PM> Install-Package Flurl

Ryan Cook's answer is close to what I'm after and may be more appropriate for other developers. However, it adds http:// to the beginning of the string and in general it does a bit more formatting than I'm after.
Also, for my use cases, resolving relative paths is not important.
mdsharp's answer also contains the seed of a good idea, although that actual implementation needed a few more details to be complete. This is an attempt to flesh it out (and I'm using this in production):
C#
public string UrlCombine(string url1, string url2)
{
if (url1.Length == 0) {
return url2;
}
if (url2.Length == 0) {
return url1;
}
url1 = url1.TrimEnd('/', '\\');
url2 = url2.TrimStart('/', '\\');
return string.Format("{0}/{1}", url1, url2);
}
VB.NET
Public Function UrlCombine(ByVal url1 As String, ByVal url2 As String) As String
If url1.Length = 0 Then
Return url2
End If
If url2.Length = 0 Then
Return url1
End If
url1 = url1.TrimEnd("/"c, "\"c)
url2 = url2.TrimStart("/"c, "\"c)
Return String.Format("{0}/{1}", url1, url2)
End Function
This code passes the following test, which happens to be in VB:
<TestMethod()> Public Sub UrlCombineTest()
Dim target As StringHelpers = New StringHelpers()
Assert.IsTrue(target.UrlCombine("test1", "test2") = "test1/test2")
Assert.IsTrue(target.UrlCombine("test1/", "test2") = "test1/test2")
Assert.IsTrue(target.UrlCombine("test1", "/test2") = "test1/test2")
Assert.IsTrue(target.UrlCombine("test1/", "/test2") = "test1/test2")
Assert.IsTrue(target.UrlCombine("/test1/", "/test2/") = "/test1/test2/")
Assert.IsTrue(target.UrlCombine("", "/test2/") = "/test2/")
Assert.IsTrue(target.UrlCombine("/test1/", "") = "/test1/")
End Sub

Path.Combine does not work for me because there can be characters like "|" in QueryString arguments and therefore the URL, which will result in an ArgumentException.
I first tried the new Uri(Uri baseUri, string relativeUri) approach, which failed for me because of URIs like http://www.mediawiki.org/wiki/Special:SpecialPages:
new Uri(new Uri("http://www.mediawiki.org/wiki/"), "Special:SpecialPages")
will result in Special:SpecialPages, because of the colon after Special that denotes a scheme.
So I finally had to take mdsharpe/Brian MacKays route and developed it a bit further to work with multiple URI parts:
public static string CombineUri(params string[] uriParts)
{
string uri = string.Empty;
if (uriParts != null && uriParts.Length > 0)
{
char[] trims = new char[] { '\\', '/' };
uri = (uriParts[0] ?? string.Empty).TrimEnd(trims);
for (int i = 1; i < uriParts.Length; i++)
{
uri = string.Format("{0}/{1}", uri.TrimEnd(trims), (uriParts[i] ?? string.Empty).TrimStart(trims));
}
}
return uri;
}
Usage: CombineUri("http://www.mediawiki.org/", "wiki", "Special:SpecialPages")

Based on the sample URL you provided, I'm going to assume you want to combine URLs that are relative to your site.
Based on this assumption I'll propose this solution as the most appropriate response to your question which was: "Path.Combine is handy, is there a similar function in the framework for URLs?"
Since there the is a similar function in the framework for URLs I propose the correct is: "VirtualPathUtility.Combine" method.
Here's the MSDN reference link: VirtualPathUtility.Combine Method
There is one caveat: I believe this only works for URLs relative to your site (that is, you cannot use it to generate links to another web site. For example, var url = VirtualPathUtility.Combine("www.google.com", "accounts/widgets");).

Path.Combine("Http://MyUrl.com/", "/Images/Image.jpg").Replace("\\", "/")

I just put together a small extension method:
public static string UriCombine (this string val, string append)
{
if (String.IsNullOrEmpty(val)) return append;
if (String.IsNullOrEmpty(append)) return val;
return val.TrimEnd('/') + "/" + append.TrimStart('/');
}
It can be used like this:
"www.example.com/".UriCombine("/images").UriCombine("first.jpeg");

Witty example, Ryan, to end with a link to the function. Well done.
One recommendation Brian: if you wrap this code in a function, you may want to use a UriBuilder to wrap the base URL prior to the TryCreate call.
Otherwise, the base URL MUST include the scheme (where the UriBuilder will assume http://). Just a thought:
public string CombineUrl(string baseUrl, string relativeUrl) {
UriBuilder baseUri = new UriBuilder(baseUrl);
Uri newUri;
if (Uri.TryCreate(baseUri.Uri, relativeUrl, out newUri))
return newUri.ToString();
else
throw new ArgumentException("Unable to combine specified url values");
}

An easy way to combine them and ensure it's always correct is:
string.Format("{0}/{1}", Url1.Trim('/'), Url2);

I think this should give you more flexibility as you can deal with as many path segments as you want:
public static string UrlCombine(this string baseUrl, params string[] segments)
=> string.Join("/", new[] { baseUrl.TrimEnd('/') }.Concat(segments.Select(s => s.Trim('/'))));

Combining multiple parts of a URL could be a little bit tricky. You can use the two-parameter constructor Uri(baseUri, relativeUri), or you can use the Uri.TryCreate() utility function.
In either case, you might end up returning an incorrect result because these methods keep on truncating the relative parts off of the first parameter baseUri, i.e. from something like http://google.com/some/thing to http://google.com.
To be able to combine multiple parts into a final URL, you can copy the two functions below:
public static string Combine(params string[] parts)
{
if (parts == null || parts.Length == 0) return string.Empty;
var urlBuilder = new StringBuilder();
foreach (var part in parts)
{
var tempUrl = tryCreateRelativeOrAbsolute(part);
urlBuilder.Append(tempUrl);
}
return VirtualPathUtility.RemoveTrailingSlash(urlBuilder.ToString());
}
private static string tryCreateRelativeOrAbsolute(string s)
{
System.Uri uri;
System.Uri.TryCreate(s, UriKind.RelativeOrAbsolute, out uri);
string tempUrl = VirtualPathUtility.AppendTrailingSlash(uri.ToString());
return tempUrl;
}
Full code with unit tests to demonstrate usage can be found at https://uricombine.codeplex.com/SourceControl/latest#UriCombine/Uri.cs
I have unit tests to cover the three most common cases:

As found in other answers, either new Uri() or TryCreate() can do the tick.
However, the base Uri has to end with / and the relative has to NOT begin with /; otherwise it will remove the trailing part of the base Url
I think this is best done as an extension method, i.e.
public static Uri Append(this Uri uri, string relativePath)
{
var baseUri = uri.AbsoluteUri.EndsWith('/') ? uri : new Uri(uri.AbsoluteUri + '/');
var relative = relativePath.StartsWith('/') ? relativePath.Substring(1) : relativePath;
return new Uri(baseUri, relative);
}
and to use it:
var baseUri = new Uri("http://test.com/test/");
var combinedUri = baseUri.Append("/Do/Something");
In terms of performance, this consumes more resources than it needs, because of the Uri class which does a lot of parsing and validation; a very rough profiling (Debug) did a million operations in about 2 seconds.
This will work for most scenarios, however to be more efficient, it's better to manipulate everything as strings, this takes 125 milliseconds for 1 million operations.
I.e.
public static string Append(this Uri uri, string relativePath)
{
//avoid the use of Uri as it's not needed, and adds a bit of overhead.
var absoluteUri = uri.AbsoluteUri; //a calculated property, better cache it
var baseUri = absoluteUri.EndsWith('/') ? absoluteUri : absoluteUri + '/';
var relative = relativePath.StartsWith('/') ? relativePath.Substring(1) : relativePath;
return baseUri + relative;
}
And if you still want to return a URI, it takes around 600 milliseconds for 1 million operations.
public static Uri AppendUri(this Uri uri, string relativePath)
{
//avoid the use of Uri as it's not needed, and adds a bit of overhead.
var absoluteUri = uri.AbsoluteUri; //a calculated property, better cache it
var baseUri = absoluteUri.EndsWith('/') ? absoluteUri : absoluteUri + '/';
var relative = relativePath.StartsWith('/') ? relativePath.Substring(1) : relativePath;
return new Uri(baseUri + relative);
}
I hope this helps.

I found UriBuilder worked really well for this sort of thing:
UriBuilder urlb = new UriBuilder("http", _serverAddress, _webPort, _filePath);
Uri url = urlb.Uri;
return url.AbsoluteUri;
See UriBuilder Class - MSDN for more constructors and documentation.

If you don't want to have a dependency like Flurl, you can use its source code:
/// <summary>
/// Basically a Path.Combine for URLs. Ensures exactly one '/' separates each segment,
/// and exactly on '&' separates each query parameter.
/// URL-encodes illegal characters but not reserved characters.
/// </summary>
/// <param name="parts">URL parts to combine.</param>
public static string Combine(params string[] parts) {
if (parts == null)
throw new ArgumentNullException(nameof(parts));
string result = "";
bool inQuery = false, inFragment = false;
string CombineEnsureSingleSeparator(string a, string b, char separator) {
if (string.IsNullOrEmpty(a)) return b;
if (string.IsNullOrEmpty(b)) return a;
return a.TrimEnd(separator) + separator + b.TrimStart(separator);
}
foreach (var part in parts) {
if (string.IsNullOrEmpty(part))
continue;
if (result.EndsWith("?") || part.StartsWith("?"))
result = CombineEnsureSingleSeparator(result, part, '?');
else if (result.EndsWith("#") || part.StartsWith("#"))
result = CombineEnsureSingleSeparator(result, part, '#');
else if (inFragment)
result += part;
else if (inQuery)
result = CombineEnsureSingleSeparator(result, part, '&');
else
result = CombineEnsureSingleSeparator(result, part, '/');
if (part.Contains("#")) {
inQuery = false;
inFragment = true;
}
else if (!inFragment && part.Contains("?")) {
inQuery = true;
}
}
return EncodeIllegalCharacters(result);
}
/// <summary>
/// URL-encodes characters in a string that are neither reserved nor unreserved. Avoids encoding reserved characters such as '/' and '?'. Avoids encoding '%' if it begins a %-hex-hex sequence (i.e. avoids double-encoding).
/// </summary>
/// <param name="s">The string to encode.</param>
/// <param name="encodeSpaceAsPlus">If true, spaces will be encoded as + signs. Otherwise, they'll be encoded as %20.</param>
/// <returns>The encoded URL.</returns>
public static string EncodeIllegalCharacters(string s, bool encodeSpaceAsPlus = false) {
if (string.IsNullOrEmpty(s))
return s;
if (encodeSpaceAsPlus)
s = s.Replace(" ", "+");
// Uri.EscapeUriString mostly does what we want - encodes illegal characters only - but it has a quirk
// in that % isn't illegal if it's the start of a %-encoded sequence https://stackoverflow.com/a/47636037/62600
// no % characters, so avoid the regex overhead
if (!s.Contains("%"))
return Uri.EscapeUriString(s);
// pick out all %-hex-hex matches and avoid double-encoding
return Regex.Replace(s, "(.*?)((%[0-9A-Fa-f]{2})|$)", c => {
var a = c.Groups[1].Value; // group 1 is a sequence with no %-encoding - encode illegal characters
var b = c.Groups[2].Value; // group 2 is a valid 3-character %-encoded sequence - leave it alone!
return Uri.EscapeUriString(a) + b;
});
}

I find the following useful and has the following features :
Throws on null or white space
Takes multiple params parameter for multiple Url segments
throws on null or empty
Class
public static class UrlPath
{
private static string InternalCombine(string source, string dest)
{
if (string.IsNullOrWhiteSpace(source))
throw new ArgumentException("Cannot be null or white space", nameof(source));
if (string.IsNullOrWhiteSpace(dest))
throw new ArgumentException("Cannot be null or white space", nameof(dest));
return $"{source.TrimEnd('/', '\\')}/{dest.TrimStart('/', '\\')}";
}
public static string Combine(string source, params string[] args)
=> args.Aggregate(source, InternalCombine);
}
Tests
UrlPath.Combine("test1", "test2");
UrlPath.Combine("test1//", "test2");
UrlPath.Combine("test1", "/test2");
// Result = test1/test2
UrlPath.Combine(#"test1\/\/\/", #"\/\/\\\\\//test2", #"\/\/\\\\\//test3\") ;
// Result = test1/test2/test3
UrlPath.Combine("/test1/", "/test2/", null);
UrlPath.Combine("", "/test2/");
UrlPath.Combine("/test1/", null);
// Throws an ArgumentException

So I have another approach, similar to everyone who used UriBuilder.
I did not want to split my BaseUrl (which can contain a part of the path - e.g. http://mybaseurl.com/dev/) as javajavajavajavajava did.
The following snippet shows the code + Tests.
Beware: This solution lowercases the host and appends a port. If this is not desired, one can write a string representation by e.g. leveraging the Uri Property of UriBuilder.
public class Tests
{
public static string CombineUrl (string baseUrl, string path)
{
var uriBuilder = new UriBuilder (baseUrl);
uriBuilder.Path = Path.Combine (uriBuilder.Path, path);
return uriBuilder.ToString();
}
[TestCase("http://MyUrl.com/", "/Images/Image.jpg", "http://myurl.com:80/Images/Image.jpg")]
[TestCase("http://MyUrl.com/basePath", "/Images/Image.jpg", "http://myurl.com:80/Images/Image.jpg")]
[TestCase("http://MyUrl.com/basePath", "Images/Image.jpg", "http://myurl.com:80/basePath/Images/Image.jpg")]
[TestCase("http://MyUrl.com/basePath/", "Images/Image.jpg", "http://myurl.com:80/basePath/Images/Image.jpg")]
public void Test1 (string baseUrl, string path, string expected)
{
var result = CombineUrl (baseUrl, path);
Assert.That (result, Is.EqualTo (expected));
}
}
Tested with .NET Core 2.1 on Windows 10.
Why does this work?
Even though Path.Combine will return Backslashes (on Windows atleast), the UriBuilder handles this case in the Setter of Path.
Taken from https://github.com/dotnet/corefx/blob/master/src/System.Private.Uri/src/System/UriBuilder.cs (mind the call to string.Replace)
[AllowNull]
public string Path
{
get
{
return _path;
}
set
{
if ((value == null) || (value.Length == 0))
{
value = "/";
}
_path = Uri.InternalEscapeString(value.Replace('\\', '/'));
_changed = true;
}
}
Is this the best approach?
Certainly this solution is pretty self describing (at least in my opinion). But you are relying on undocumented (at least I found nothing with a quick google search) "feature" from the .NET API. This may change with a future release so please cover the Method with Tests.
There are tests in https://github.com/dotnet/corefx/blob/master/src/System.Private.Uri/tests/FunctionalTests/UriBuilderTests.cs (Path_Get_Set) which check, if the \ is correctly transformed.
Side Note: One could also work with the UriBuilder.Uri property directly, if the uri will be used for a System.Uri ctor.

For anyone who is looking for a one-liner and simply wants to join parts of a path without creating a new method or referencing a new library or construct a URI value and convert that to a string, then...
string urlToImage = String.Join("/", "websiteUrl", "folder1", "folder2", "folder3", "item");
It's pretty basic, but I don't see what more you need. If you're afraid of doubled '/' then you can simply do a .Replace("//", "/") afterward. If you're afraid of replacing the doubled '//' in 'https://', then instead do one join, replace the doubled '/', then join the website url (however I'm pretty sure most browsers will automatically convert anything with 'https:' in the front of it to read in the correct format). This would look like:
string urlToImage = String.Join("/","websiteUrl", String.Join("/", "folder1", "folder2", "folder3", "item").Replace("//","/"));
There are plenty of answers here that will handle all the above, but in my case, I only needed it once in one location and won't need to heavily rely on it. Also, it's really easy to see what is going on here.
See: https://learn.microsoft.com/en-us/dotnet/api/system.string.join?view=netframework-4.8

My generic solution:
public static string Combine(params string[] uriParts)
{
string uri = string.Empty;
if (uriParts != null && uriParts.Any())
{
char[] trims = new char[] { '\\', '/' };
uri = (uriParts[0] ?? string.Empty).TrimEnd(trims);
for (int i = 1; i < uriParts.Length; i++)
{
uri = string.Format("{0}/{1}", uri.TrimEnd(trims), (uriParts[i] ?? string.Empty).TrimStart(trims));
}
}
return uri;
}

Here's Microsoft's (OfficeDev PnP) method UrlUtility.Combine:
const char PATH_DELIMITER = '/';
/// <summary>
/// Combines a path and a relative path.
/// </summary>
/// <param name="path"></param>
/// <param name="relative"></param>
/// <returns></returns>
public static string Combine(string path, string relative)
{
if(relative == null)
relative = String.Empty;
if(path == null)
path = String.Empty;
if(relative.Length == 0 && path.Length == 0)
return String.Empty;
if(relative.Length == 0)
return path;
if(path.Length == 0)
return relative;
path = path.Replace('\\', PATH_DELIMITER);
relative = relative.Replace('\\', PATH_DELIMITER);
return path.TrimEnd(PATH_DELIMITER) + PATH_DELIMITER + relative.TrimStart(PATH_DELIMITER);
}
Source: GitHub

// Read all above samples and as result created my self:
static string UrlCombine(params string[] items)
{
if (items?.Any() != true)
{
return string.Empty;
}
return string.Join("/", items.Where(u => !string.IsNullOrWhiteSpace(u)).Select(u => u.Trim('/', '\\')));
}
// usage
UrlCombine("https://microsoft.com","en-us")

I have an allocation-free string creation version that I've been using with great success.
NOTE:
For the first string: it trims the separator using TrimEnd(separator) - so only from the end of the string.
For the remainders: it trims the separator using Trim(separator) - so both start and end of paths
It does not append a trailing slash/separator. Though a simple modification can be done to add this ability.
Hope you find this useful!
/// <summary>
/// This implements an allocation-free string creation to construct the path.
/// This uses 3.5x LESS memory and is 2x faster than some alternate methods (StringBuilder, interpolation, string.Concat, etc.).
/// </summary>
/// <param name="str"></param>
/// <param name="paths"></param>
/// <returns></returns>
public static string ConcatPath(this string str, params string[] paths)
{
const char separator = '/';
if (str == null) throw new ArgumentNullException(nameof(str));
var list = new List<ReadOnlyMemory<char>>();
var first = str.AsMemory().TrimEnd(separator);
// get length for intial string after it's trimmed
var length = first.Length;
list.Add(first);
foreach (var path in paths)
{
var newPath = path.AsMemory().Trim(separator);
length += newPath.Length + 1;
list.Add(newPath);
}
var newString = string.Create(length, list, (chars, state) =>
{
// NOTE: We don't access the 'list' variable in this delegate since
// it would cause a closure and allocation. Instead we access the state parameter.
// track our position within the string data we are populating
var position = 0;
// copy the first string data to index 0 of the Span<char>
state[0].Span.CopyTo(chars);
// update the position to the new length
position += state[0].Span.Length;
// start at index 1 when slicing
for (var i = 1; i < state.Count; i++)
{
// add a separator in the current position and increment position by 1
chars[position++] = separator;
// copy each path string to a slice at current position
state[i].Span.CopyTo(chars.Slice(position));
// update the position to the new length
position += state[i].Length;
}
});
return newString;
}
with Benchmark DotNet output:
| Method | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Allocated |
|---------------------- |---------:|---------:|---------:|---------:|------:|--------:|-------:|----------:|
| ConcatPathWithBuilder | 404.1 ns | 27.35 ns | 78.48 ns | 380.3 ns | 1.00 | 0.00 | 0.3347 | 1,400 B |
| ConcatPath | 187.2 ns | 5.93 ns | 16.44 ns | 183.2 ns | 0.48 | 0.10 | 0.0956 | 400 B |

A simple one liner:
public static string Combine(this string uri1, string uri2) => $"{uri1.TrimEnd('/')}/{uri2.TrimStart('/')}";
Inspired by #Matt Sharpe's answer.

Here is my approach and I will use it for myself too:
public static string UrlCombine(string part1, string part2)
{
string newPart1 = string.Empty;
string newPart2 = string.Empty;
string seperator = "/";
// If either part1 or part 2 is empty,
// we don't need to combine with seperator
if (string.IsNullOrEmpty(part1) || string.IsNullOrEmpty(part2))
{
seperator = string.Empty;
}
// If part1 is not empty,
// remove '/' at last
if (!string.IsNullOrEmpty(part1))
{
newPart1 = part1.TrimEnd('/');
}
// If part2 is not empty,
// remove '/' at first
if (!string.IsNullOrEmpty(part2))
{
newPart2 = part2.TrimStart('/');
}
// Now finally combine
return string.Format("{0}{1}{2}", newPart1, seperator, newPart2);
}

I created this function that will make your life easier:
/// <summary>
/// The ultimate Path combiner of all time
/// </summary>
/// <param name="IsURL">
/// true - if the paths are Internet URLs, false - if the paths are local URLs, this is very important as this will be used to decide which separator will be used.
/// </param>
/// <param name="IsRelative">Just adds the separator at the beginning</param>
/// <param name="IsFixInternal">Fix the paths from within (by removing duplicate separators and correcting the separators)</param>
/// <param name="parts">The paths to combine</param>
/// <returns>the combined path</returns>
public static string PathCombine(bool IsURL , bool IsRelative , bool IsFixInternal , params string[] parts)
{
if (parts == null || parts.Length == 0) return string.Empty;
char separator = IsURL ? '/' : '\\';
if (parts.Length == 1 && IsFixInternal)
{
string validsingle;
if (IsURL)
{
validsingle = parts[0].Replace('\\' , '/');
}
else
{
validsingle = parts[0].Replace('/' , '\\');
}
validsingle = validsingle.Trim(separator);
return (IsRelative ? separator.ToString() : string.Empty) + validsingle;
}
string final = parts
.Aggregate
(
(string first , string second) =>
{
string validfirst;
string validsecond;
if (IsURL)
{
validfirst = first.Replace('\\' , '/');
validsecond = second.Replace('\\' , '/');
}
else
{
validfirst = first.Replace('/' , '\\');
validsecond = second.Replace('/' , '\\');
}
var prefix = string.Empty;
if (IsFixInternal)
{
if (IsURL)
{
if (validfirst.Contains("://"))
{
var tofix = validfirst.Substring(validfirst.IndexOf("://") + 3);
prefix = validfirst.Replace(tofix , string.Empty).TrimStart(separator);
var tofixlist = tofix.Split(new[] { separator } , StringSplitOptions.RemoveEmptyEntries);
validfirst = separator + string.Join(separator.ToString() , tofixlist);
}
else
{
var firstlist = validfirst.Split(new[] { separator } , StringSplitOptions.RemoveEmptyEntries);
validfirst = string.Join(separator.ToString() , firstlist);
}
var secondlist = validsecond.Split(new[] { separator } , StringSplitOptions.RemoveEmptyEntries);
validsecond = string.Join(separator.ToString() , secondlist);
}
else
{
var firstlist = validfirst.Split(new[] { separator } , StringSplitOptions.RemoveEmptyEntries);
var secondlist = validsecond.Split(new[] { separator } , StringSplitOptions.RemoveEmptyEntries);
validfirst = string.Join(separator.ToString() , firstlist);
validsecond = string.Join(separator.ToString() , secondlist);
}
}
return prefix + validfirst.Trim(separator) + separator + validsecond.Trim(separator);
}
);
return (IsRelative ? separator.ToString() : string.Empty) + final;
}
It works for URLs as well as normal paths.
Usage:
// Fixes internal paths
Console.WriteLine(PathCombine(true , true , true , #"\/\/folder 1\/\/\/\\/\folder2\///folder3\\/" , #"/\somefile.ext\/\//\"));
// Result: /folder 1/folder2/folder3/somefile.ext
// Doesn't fix internal paths
Console.WriteLine(PathCombine(true , true , false , #"\/\/folder 1\/\/\/\\/\folder2\///folder3\\/" , #"/\somefile.ext\/\//\"));
//result : /folder 1//////////folder2////folder3/somefile.ext
// Don't worry about URL prefixes when fixing internal paths
Console.WriteLine(PathCombine(true , false , true , #"/\/\/https:/\/\/\lul.com\/\/\/\\/\folder2\///folder3\\/" , #"/\somefile.ext\/\//\"));
// Result: https://lul.com/folder2/folder3/somefile.ext
Console.WriteLine(PathCombine(false , true , true , #"../../../\\..\...\./../somepath" , #"anotherpath"));
// Result: \..\..\..\..\...\.\..\somepath\anotherpath

I found that the Uri constructor flips '\' into '/'. So you can also use Path.Combine, with the Uri constructor.
Uri baseUri = new Uri("http://MyUrl.com");
string path = Path.Combine("Images", "Image.jpg");
Uri myUri = new Uri(baseUri, path);

Why not just use the following.
System.IO.Path.Combine(rootUrl, subPath).Replace(#"\", "/")

For what it's worth, here a couple of extension methods. The first one will combine paths and the second one adds parameters to the URL.
public static string CombineUrl(this string root, string path, params string[] paths)
{
if (string.IsNullOrWhiteSpace(path))
{
return root;
}
Uri baseUri = new Uri(root);
Uri combinedPaths = new Uri(baseUri, path);
foreach (string extendedPath in paths)
{
combinedPaths = new Uri(combinedPaths, extendedPath);
}
return combinedPaths.AbsoluteUri;
}
public static string AddUrlParams(this string url, Dictionary<string, string> parameters)
{
if (parameters == null || !parameters.Keys.Any())
{
return url;
}
var tempUrl = new StringBuilder($"{url}?");
int count = 0;
foreach (KeyValuePair<string, string> parameter in parameters)
{
if (count > 0)
{
tempUrl.Append("&");
}
tempUrl.Append($"{WebUtility.UrlEncode(parameter.Key)}={WebUtility.UrlEncode(parameter.Value)}");
count++;
}
return tempUrl.ToString();
}

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

C# Sanitize File Name - c#

A shorter solution: var invalids = System.IO.Path.GetInvalidFileNameChars(); var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries) ).TrimEnd('.');

string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars()));

there are a lot of working solutions here. just for the sake of completeness, here's an approach that doesn't use regex, but uses LINQ: var invalids = Path.GetInvalidFileNameChars(); filename = invalids.Aggregate(filename, (current, c) => current.Replace(c, '_')); Also, it's a very short solution ;)

I'm using the System.IO.Path.GetInvalidFileNameChars() method to check invalid characters and I've got no problems. I'm using the following code: foreach( char invalidchar in System.IO.Path.GetInvalidFileNameChars()) { filename = filename.Replace(invalidchar, '_'); }

Related

Split special string in c#

C#: Get first directory name of a relative path

String Matching

Searching Specific Data From a File

Path.Combine for URLs?

Categories

Resources