I have done some research, but my app downloads MP3 files, and every once in a while I get a weird filename, which doesn't hurt until I try to burn the files to CD. Below is a good example.
The Animals - House of the Rising Sun (1964) + clip compilation ♫♥ 50 YEARS - counting.mp3
I have some code that tries to catch illegal characters, but it doesn't stop this filename. Is there a better way to catch the weird stuff? The code I currently use is:
public static string RemoveIllegalFileNameChars(string input, string replacement = "")
{
if (input.Contains("?"))
{
input = input.Replace('?', char.Parse(" "));
}
if (input.Contains("&"))
{
input = input.Replace('&', char.Parse("-"));
}
var regexSearch = new string(Path.GetInvalidFileNameChars()) +
new string(Path.GetInvalidPathChars());
var r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
return r.Replace(input, replacement);
}
The CD file system is different to the OS file system, so those Path.GetInvalidX functions don't really apply to CDs.
I'm not sure, but possibly the standard you are looking at is ISO 9660
https://en.wikipedia.org/wiki/ISO_9660
Which has an extremely limited character set in filenames.
I think that Joliet extension to that standard must be in play:
https://en.wikipedia.org/wiki/Joliet_(file_system)
I think that maybe you are running into the filename length problem more than anything: "The specification only allows filenames to be up to 64 Unicode characters in length". Your filename is 90 characters long.
The following code will turn non-ASCII characters into '?':
string sOut = Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(s));
Then you can use a sOut.Replace("?", "") call to take them out. Does this seem like it would work for you?
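If the 64-character Joliet limit quoted above is the real culprit, stripping characters alone will not be enough. Here is a minimal sketch, assuming that the 64-character figure applies to your burning software and that dropping (rather than transliterating) non-ASCII characters is acceptable. The names CdFileNames and ToCdSafeName are made up for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

static class CdFileNames
{
    // Assumption: the 64-character Joliet limit quoted above.
    private const int MaxLength = 64;

    public static string ToCdSafeName(string fileName)
    {
        char[] invalid = Path.GetInvalidFileNameChars();

        // Keep printable ASCII only and drop characters the OS rejects.
        string cleaned = new string(fileName
            .Where(c => c >= ' ' && c <= '~' && Array.IndexOf(invalid, c) < 0)
            .ToArray());

        if (cleaned.Length <= MaxLength)
        {
            return cleaned;
        }

        // Shorten the base name so that the extension survives.
        string ext = Path.GetExtension(cleaned);
        string stem = Path.GetFileNameWithoutExtension(cleaned);
        int keep = Math.Max(0, MaxLength - ext.Length);
        return stem.Substring(0, Math.Min(stem.Length, keep)).TrimEnd() + ext;
    }
}
```

Truncation can make two different titles collide, so you may still want a uniqueness check before writing the file.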
Although in this case your file name is valid, to catch invalid file names it is suggested to use the GetInvalidFileNameChars() method.
string fileName = "The Animals - House of the Rising Sun ? (1964) + clip compilation ♫♥ 50 YEARS - counting.mp3";
byte[] bytes = Encoding.ASCII.GetBytes(fileName);
char[] characters = Encoding.ASCII.GetChars(bytes);
string name = new string(characters);
StringBuilder fileN = new StringBuilder(name);
foreach (char c in Path.GetInvalidFileNameChars())
{
fileN.Replace(c, '_');
}
string validFileName = fileN.ToString();
https://learn.microsoft.com/en-us/dotnet/api/system.io.path.getinvalidfilenamechars?view=netframework-4.7.2
Thanks for all your help. The final working code is listed below.
public static string RemoveIllegalFileNameChars(string input, string replacement = "")
{
if (input.Contains("?"))
{
input = input.Replace('?', char.Parse(" "));
}
if (input.Contains("&"))
{
input = input.Replace('&', char.Parse("-"));
}
var regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
var r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
// check for non-ASCII characters
byte[] bytes = Encoding.ASCII.GetBytes(input);
char[] chars = Encoding.ASCII.GetChars(bytes);
string line = new String(chars);
line = line.Replace("?", "");
//MessageBox.Show(line);
return r.Replace(line, replacement);
}
I am assigning images[] an array that holds image file names with the full path of a given directory.
string[] images = DirLoad.FileNamesArray(
IO.Loaders.PathType.full,
IO.Loaders.FileExtension.jpg
);
...now that images[] stores all the file names I need, I had to use the full path to get it done,
using Directory.GetFiles().
The next action requires each one as a local file name
(each is then passed as a string parameter to another method),
so my question is:
How can I omit the first part, HttpRuntime.AppDomainAppPath, if it's the same in every element of the array?
This is a usage example; the string I need to trim from each element in images[] is currentDir.
public class IO
{
public class Loaders
{
readonly string currentDir = HttpRuntime.AppDomainAppPath;
public string selecedDirName { get; set; }
/// <summary>
/// assign The Loaders.selectedDir First before calling
/// </summary>
/// <param name="foldertoLoad"></param>
/// <returns></returns>
public enum PathType
{
full, local
}
public enum FileExtension
{
jpg,png,txt,xml,htm,js,aspx,css
}
public string[] FileNamesArray(PathType SelectedPathMode, FileExtension selectedfileType)
{
string thisFolder = "";
string thatFileType= string.Format("*.{0}",selectedfileType.ToString());
switch (SelectedPathMode)
{
case PathType.full:
thisFolder = Path.Combine(currentDir, selecedDirName);
break;
case PathType.local:
thisFolder = selecedDirName;
break;
default:
break;
}
string[] foundArr = Directory.GetFiles(thisFolder, thatFileType);
return foundArr;
}
}
}
Update: this is what I've tried
string fileName;
string[] images = DirLoad.FileNamesArray(IO.Loaders.PathType.full, IO.Loaders.FileExtension.jpg);
foreach (var currImage in images)
{
int startingAt = DirLoad.currentDir.Length ;
int finalPoint = currImage.Length - startingAt;
fileName = new String(currImage.ToCharArray(startingAt, finalPoint));
baseStyle.Add(string.Format("{0}url({1}) {2}", BackGroundCssProp, fileName, imageProps));
}
return baseStyle.ToArray();
I still fail to understand what you're trying to accomplish from beginning to end, but if you have an array of full paths and need only the file names from those paths, you can do the following:
The files may actually contain random, completely different paths, but going by what I gathered from the question, let it be:
var files = Directory.GetFiles(@"path");
Then you may use Path.GetFileName Method to retrieve only filename from these paths, through a simple Enumerable.Select LINQ-statement:
var fileNamesOnly = files.Select(f => Path.GetFileName(f));
I am not entirely sure what you exactly need. For your sentence:
the string is currentDir i need to trim from each element in images[]
You can try the following using LINQ:
string currDir = "SomeString";
string[] images = new string[] { "SomeStringabc1.jpg", "SomeStringabc2.jpg", "SomeStringabc3.jpg", "abc.jpg" };
string[] newImages = images.Select(r => r.StartsWith(currDir)
? r.Replace(currDir, "") : r)
.ToArray();
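Or using string.TrimStart (note that TrimStart removes any leading characters found in the given set, not the exact prefix, so it can strip too much when the file name itself starts with one of those characters):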
string[] newImages = images.Select(r => r.TrimStart(currDir.ToCharArray())).ToArray();
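A sketch that removes only an exact leading prefix and leaves the rest of each path untouched. StripPrefix is a hypothetical helper name, and the case-insensitive comparison is an assumption that suits Windows paths.

```csharp
using System;
using System.Linq;

static class PathPrefix
{
    // Removes 'prefix' from the start of each path; other paths are returned unchanged.
    public static string[] StripPrefix(string[] paths, string prefix)
    {
        return paths
            .Select(p => p.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
                ? p.Substring(prefix.Length)
                : p)
            .ToArray();
    }
}
```

With currDir and images as defined above, PathPrefix.StripPrefix(images, currDir) keeps a later occurrence of the prefix text inside a file name intact.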
Sorry, but it is not clear to me. If you want only the file name from the whole path, you can simply use Split: split the whole path on the separator character and use the last array element.
Once you have all the paths in your "images" array, you can try the code below.
For example:
for (int i = 0; i < images.Length; i++)
{
string[] cutToFileName = images[i].Split('\\');
string fileName = cutToFileName[cutToFileName.Length - 1];
}
I have a .txt file with a list of 174 different strings. Each string has an unique identifier.
For example:
123|this data is variable|
456|this data is variable|
789|so is this|
etc..
I wish to write a program in C# that will read the .txt file and display only one of the 174 strings when I specify the ID of the string I want. This is because all the data in the file is variable, so only the ID can be used to pull the string. So instead of ending up with the example above, I get just one line.
eg just
123|this data is variable|
I seem to be able to write a program that pulls just the ID from the .txt file and not the entire string, or a program that merely reads the whole file and displays it, but I have yet to write one that does exactly what I need. HELP!
Well, the actual string I get out of the txt file has no '|'; they were just in the example. An example of the real string would be: 0111111(0010101), where the data in the brackets is variable. The brackets don't exist in the real string either.
namespace String_reader
{
class Program
{
static void Main(string[] args)
{
String filepath = @"C:\my file name here";
string line;
if(File.Exists(filepath))
{
StreamReader file = null;
try
{
file = new StreamReader(filepath);
while ((line = file.ReadLine()) !=null)
{
string regMatch = "ID number here"; //this is where it all falls apart.
Regex.IsMatch (line, regMatch);
Console.WriteLine (line);// When program is run it just displays the whole .txt file
}
}
finally{
if (file !=null)
file.Close();
}
}
Console.ReadLine();
}
}
}
Use a Regex. Something along the lines of Regex.Match("|"+inputString+"|",@"\|[ ]*\d+\|(.+?)\|").Groups[1].Value
Oh, I almost forgot; you'll need to substitute the \d+ for the actual index you want. Right now, that'll just get you the first one.
The "|" before and after the input string makes sure both the index and the value are enclosed in a | for all elements, including the first and last. There's ways of doing a Regex without it, but IMHO they just make your regex more complicated, and less readable.
Assuming you have path and id.
Console.WriteLine(File.ReadAllLines(path).Where(l => l.StartsWith(id + "|")).FirstOrDefault());
Use File.ReadAllLines to get a string array of lines, then split each line on the |
You could use Regex.Split method
FileInfo info = new FileInfo("filename.txt");
String[] lines = info.OpenText().ReadToEnd().Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
foreach(String line in lines)
{
int id = Convert.ToInt32(line.Split('|')[0]);
string text = line.Split('|')[1];
}
Read the data into a string
Split the string on "|"
Read the items 2 by 2: key:value,key:value,...
Add them to a dictionary
Now you can easily find your string with dictionary[key].
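The steps above can be sketched as follows, assuming every record has the form id|value| as in the question's example and that IDs are kept as strings. IdLookup and Parse are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

static class IdLookup
{
    // Splits "id|value|id|value|..." and pairs the items up two by two.
    public static Dictionary<string, string> Parse(string data)
    {
        string[] items = data.Split('|');
        var result = new Dictionary<string, string>();

        for (int i = 0; i + 1 < items.Length; i += 2)
        {
            // Trim removes the line break left in front of every ID after the first.
            result[items[i].Trim()] = items[i + 1];
        }

        return result;
    }
}
```

A call such as IdLookup.Parse(File.ReadAllText(path))["123"] then returns the text stored for ID 123; use TryGetValue instead of the indexer if the ID might be missing.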
First load the whole file into a string,
then try this:
string s = "123|this data is variable| 456|this data is also variable| 789|so is this|";
int index = s.IndexOf("123", 0);
string temp = s.Substring(index,s.Length-index);
string[] splitStr = temp.Split('|');
Console.WriteLine(splitStr[1]);
hope this is what you are looking for.
private static IEnumerable<string> ReadLines(string fspec)
{
using (var reader = new StreamReader(new FileStream(fspec, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
}
var dict = ReadLines("input.txt")
.Select(s =>
{
var split = s.Split("|".ToArray(), 2);
return new {Id = Int32.Parse(split[0]), Text = split[1]};
})
.ToDictionary(kv => kv.Id, kv => kv.Text);
Please note that with .NET 4.0 you don't need the ReadLines function, because there is File.ReadLines.
You can now work with that as any dictionary:
Console.WriteLine(dict[12]);
Console.WriteLine(dict[999]);
No error handling here, please add your own
You can use the Split method to divide the entire text into parts separated by '|'. Then all even-indexed elements will correspond to numbers, and the odd-indexed elements to strings.
StreamReader sr = new StreamReader(filename);
string text = sr.ReadToEnd();
string[] data = text.Split('|');
Then convert the data elements to numbers and strings, i.e. a List<int> IDs and a List<string> Strs. Find the index of the given ID with idx = IDs.IndexOf(ID), and the corresponding string will be Strs[idx].
List<int> IDs = new List<int>();
List<string> Strs = new List<string>();
for (int i = 0; i < data.Length - 1; i += 2)
{
IDs.Add(int.Parse(data[i]));
Strs.Add(data[i + 1]);
}
int idx = IDs.IndexOf(ID); // we get ID from input
string answer = Strs[idx];
I have implemented an algorithm that generates unique names for files that will be saved on the hard drive. I'm appending the DateTime (hours, minutes, seconds and milliseconds), but it still generates duplicate file names because I'm uploading multiple files at a time.
What is the best solution for generating unique names for files stored on the hard drive, so that no two files are the same?
If readability doesn't matter, use GUIDs.
E.g.:
var myUniqueFileName = string.Format(@"{0}.txt", Guid.NewGuid());
or shorter:
var myUniqueFileName = $@"{Guid.NewGuid()}.txt";
In my programs, I sometimes try e.g. 10 times to generate a readable name ("Image1.png"…"Image10.png") and if that fails (because the file already exists), I fall back to GUIDs.
Update:
Recently, I've also used DateTime.Now.Ticks instead of GUIDs:
var myUniqueFileName = string.Format(@"{0}.txt", DateTime.Now.Ticks);
or
var myUniqueFileName = $@"{DateTime.Now.Ticks}.txt";
The benefit to me is that this generates a shorter and "nicer looking" filename, compared to GUIDs.
Please note that in some cases (e.g. when generating a lot of random names in a very short time), this might make non-unique values.
Stick to GUIDs if you want to make really sure that the file names are unique, even when transferring them to other computers.
Use
Path.GetTempFileName()
or use Guid.NewGuid().
Path.GetTempFileName() on MSDN.
System.IO.Path.GetRandomFileName()
Path.GetRandomFileName() on MSDN.
If the readability of the file name isn't important, then the GUID, as suggested by many will do. However, I find that looking into a directory with 1000 GUID file names is very daunting to sort through. So I usually use a combination of a static string which gives the file name some context information, a timestamp, and GUID.
For example:
public string GenerateFileName(string context)
{
return context + "_" + DateTime.Now.ToString("yyyyMMddHHmmssfff") + "_" + Guid.NewGuid().ToString("N");
}
filename1 = GenerateFileName("MeasurementData");
filename2 = GenerateFileName("Image");
This way, when I sort by filename, it will automatically group the files by the context string and sort by timestamp.
Note that the filename limit in Windows is 255 characters.
Here's an algorithm that returns a unique readable filename based on the original supplied. If the original file exists, it incrementally tries to append an index to the filename until it finds one that doesn't exist. It reads the existing filenames into a HashSet to check for collisions so it's pretty quick (a few hundred filenames per second on my machine), it's thread safe too, and doesn't suffer from race conditions.
For example, if you pass it test.txt, it will attempt to create files in this order:
test.txt
test (1).txt
test (2).txt
etc. You can specify the maximum attempts or just leave it at the default.
Here's a complete example:
class Program
{
static FileStream CreateFileWithUniqueName(string folder, string fileName,
int maxAttempts = 1024)
{
// get filename base and extension
var fileBase = Path.GetFileNameWithoutExtension(fileName);
var ext = Path.GetExtension(fileName);
// build hash set of filenames for performance
var files = new HashSet<string>(Directory.GetFiles(folder));
for (var index = 0; index < maxAttempts; index++)
{
// first try with the original filename, else try incrementally adding an index
var name = (index == 0)
? fileName
: String.Format("{0} ({1}){2}", fileBase, index, ext);
// check if exists
var fullPath = Path.Combine(folder, name);
if(files.Contains(fullPath))
continue;
// try to create the file
try
{
return new FileStream(fullPath, FileMode.CreateNew, FileAccess.Write);
}
catch (DirectoryNotFoundException) { throw; }
catch (DriveNotFoundException) { throw; }
catch (IOException)
{
// Will occur if another thread created a file with this
// name since we created the HashSet. Ignore this and just
// try with the next filename.
}
}
throw new Exception("Could not create unique filename in " + maxAttempts + " attempts");
}
static void Main(string[] args)
{
for (var i = 0; i < 500; i++)
{
using (var stream = CreateFileWithUniqueName(@"c:\temp\", "test.txt"))
{
Console.WriteLine("Created \"" + stream.Name + "\"");
}
}
Console.ReadKey();
}
}
I use GetRandomFileName:
The GetRandomFileName method returns a cryptographically strong, random string that can be used as either a folder name or a file name. Unlike GetTempFileName, GetRandomFileName does not create a file. When the security of your file system is paramount, this method should be used instead of GetTempFileName.
Example:
public static string GenerateFileName(string extension="")
{
return string.Concat(Path.GetRandomFileName().Replace(".", ""),
(!string.IsNullOrEmpty(extension)) ? (extension.StartsWith(".") ? extension : string.Concat(".", extension)) : "");
}
You can have a unique file name automatically generated for you without any custom methods. Just use the following with the StorageFolder Class or the StorageFile Class. The key here is: CreationCollisionOption.GenerateUniqueName and NameCollisionOption.GenerateUniqueName
To create a new file with a unique filename:
var myFile = await ApplicationData.Current.LocalFolder.CreateFileAsync("myfile.txt", CreationCollisionOption.GenerateUniqueName);
To copy a file to a location with a unique filename:
var myFile2 = await myFile1.CopyAsync(ApplicationData.Current.LocalFolder, myFile1.Name, NameCollisionOption.GenerateUniqueName);
To move a file with a unique filename in the destination location:
await myFile.MoveAsync(ApplicationData.Current.LocalFolder, myFile.Name, NameCollisionOption.GenerateUniqueName);
To rename a file with a unique filename in the destination location:
await myFile.RenameAsync(myFile.Name, NameCollisionOption.GenerateUniqueName);
1. Create your timestamped filename following your normal process
2. Check to see if the filename exists
   False - save the file
   True - append an additional character to the filename, perhaps a counter
3. Go to step 2
Do you need the date time stamp in the filename?
You could make the filename a GUID.
I have been using the following code and it's working fine. I hope this might help you.
I begin with a unique file name using a timestamp -
"context_" + DateTime.Now.ToString("yyyyMMddHHmmssffff")
C# code -
public static string CreateUniqueFile(string logFilePath, string logFileName, string fileExt)
{
try
{
int fileNumber = 1;
//prefix with . if not already provided
fileExt = (!fileExt.StartsWith(".")) ? "." + fileExt : fileExt;
//Generate new name
while (File.Exists(Path.Combine(logFilePath, logFileName + "-" + fileNumber.ToString() + fileExt)))
fileNumber++;
//Create empty file, retry until one is created
while (!CreateNewLogfile(logFilePath, logFileName + "-" + fileNumber.ToString() + fileExt))
fileNumber++;
return logFileName + "-" + fileNumber.ToString() + fileExt;
}
catch (Exception)
{
throw;
}
}
private static bool CreateNewLogfile(string logFilePath, string logFile)
{
try
{
FileStream fs = new FileStream(Path.Combine(logFilePath, logFile), FileMode.CreateNew);
fs.Close();
return true;
}
catch (IOException) //File exists, can not create new
{
return false;
}
catch (Exception) //Exception occured
{
throw;
}
}
Why not make a unique ID as below?
We can combine DateTime.Now.Ticks and Guid.NewGuid().ToString() to make a unique ID.
Because DateTime.Now.Ticks is included, we can find out the date and time at which the unique ID was created.
Please see the code.
var ticks = DateTime.Now.Ticks;
var guid = Guid.NewGuid().ToString();
var uniqueSessionId = ticks.ToString() +'-'+ guid; //guid created by combining ticks and guid
var datetime = new DateTime(ticks);//for checking purpose
var datetimenow = DateTime.Now; //both these date times are different.
We can even extract the ticks part of the unique ID later to check the date and time for future reference.
You can attach the unique ID to the filename, or use it to create a unique session ID for user login/logout in an application or website.
How about using Guid.NewGuid() to create a GUID and use that as the filename (or part of the filename together with your time stamp if you like).
I've written a simple recursive function that generates file names like Windows does, by appending a sequence number prior to the file extension.
Given a desired file path of C:\MyDir\MyFile.txt, and the file already exists, it returns a final file path of C:\MyDir\MyFile_1.txt.
It is called like this:
var desiredPath = @"C:\MyDir\MyFile.txt";
var finalPath = UniqueFileName(desiredPath);
private static string UniqueFileName(string path, int count = 0)
{
if (count == 0)
{
if (!File.Exists(path))
{
return path;
}
}
else
{
var candidatePath = string.Format(
@"{0}\{1}_{2}{3}",
Path.GetDirectoryName(path),
Path.GetFileNameWithoutExtension(path),
count,
Path.GetExtension(path));
if (!File.Exists(candidatePath))
{
return candidatePath;
}
}
count++;
return UniqueFileName(path, count);
}
DateTime.Now.Ticks is not safe, Guid.NewGuid() is too ugly, if you need something clean and almost safe (it's not 100% safe for example if you call it 1,000,000 times in 1ms), try:
Math.Abs(Guid.NewGuid().GetHashCode())
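By safe I mean safe to be unique when you call it many times in a very short period (a few ms). Note that the hash is only 32 bits wide, so collisions become likely once tens of thousands of names have been generated, and Math.Abs throws an OverflowException if the hash happens to be int.MinValue.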
If you would like to have the date, hours, minutes etc. in the name, you can use a static variable as a counter. Append the value of this variable to the filename. You can start the counter at 0 and increment it each time you create a file. This way the filename will surely be unique, since you also have the seconds in the file name.
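A sketch of that counter idea, assuming all files are created by a single process (a static counter is not shared between processes and starts over after a restart). CountedNames and Next are illustrative names. Interlocked.Increment is used so that uploads handled on different threads never receive the same number.

```csharp
using System;
using System.Threading;

static class CountedNames
{
    // Starts at -1 so that the first name gets counter value 0.
    private static int _counter = -1;

    // Returns e.g. "20240131235959_0.jpg"; 'extension' should include the dot.
    public static string Next(string extension)
    {
        int n = Interlocked.Increment(ref _counter);
        return DateTime.Now.ToString("yyyyMMddHHmmss") + "_" + n + extension;
    }
}
```

Because the counter never repeats within a process, two names generated in the same second still differ.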
I usually do something along these lines:
start with a stem file name (work.dat for instance)
try to create it with CreateNew
if that works, you've got the file, otherwise...
mix the current date/time into the filename (work.2011-01-15T112357.dat for instance)
try to create the file
if that worked, you've got the file, otherwise...
Mix a monotonic counter into the filename (work.2011-01-15T112357.0001.dat for instance). (I dislike GUIDs. I prefer order/predictability.)
try to create the file. Keep ticking up the counter and retrying until a file gets created for you.
Here's a sample class:
static class DirectoryInfoHelpers
{
public static FileStream CreateFileWithUniqueName( this DirectoryInfo dir , string rootName )
{
FileStream fs = dir.TryCreateFile( rootName ) ; // try the simple name first
// if that didn't work, try mixing in the date/time
if ( fs == null )
{
string date = DateTime.Now.ToString( "yyyy-MM-ddTHHmmss" ) ;
string stem = Path.GetFileNameWithoutExtension(rootName) ;
string ext = Path.GetExtension(rootName) ; // returns "" (not null) when there is no extension
ext = string.IsNullOrEmpty(ext) ? "dat" : ext.Substring(1) ;
string fn = string.Format( "{0}.{1}.{2}" , stem , date , ext ) ;
fs = dir.TryCreateFile( fn ) ;
// if mixing in the date/time didn't work, try a sequential search
if ( fs == null )
{
int seq = 0 ;
do
{
fn = string.Format( "{0}.{1}.{2:0000}.{3}" , stem , date , ++seq , ext ) ;
fs = dir.TryCreateFile( fn ) ;
} while ( fs == null ) ;
}
}
return fs ;
}
private static FileStream TryCreateFile(this DirectoryInfo dir , string fileName )
{
FileStream fs = null ;
try
{
string fqn = Path.Combine( dir.FullName , fileName ) ;
fs = new FileStream( fqn , FileMode.CreateNew , FileAccess.ReadWrite , FileShare.None ) ;
}
catch ( Exception )
{
fs = null ;
}
return fs ;
}
}
You might want to tweak the algorithm (always use all the possible components to the file name for instance). Depends on the context -- If I was creating log files for instance, that I might want to rotate out of existence, you'd want them all to share the same pattern to the name.
The code isn't perfect (no checks on the data passed in for instance). And the algorithm's not perfect (if you fill up the hard drive or encounter permissions, actual I/O errors or other file system errors, for instance, this will hang, as it stands, in an infinite loop).
I ended up concatenating a GUID with a day-month-year-second-millisecond string, and I think this solution is quite good for my scenario.
You can also use Random.Next() to generate a random number. See the MSDN link: http://msdn.microsoft.com/en-us/library/9b3ta19y.aspx
I wrote a class specifically for doing this. It's initialized with a "base" part (defaults to a minute-accurate timestamp) and after that appends letters to make unique names. So, if the first stamp generated is 1907101215a, the second would be 1907101215b, then 1907101215c, et cetera.
If I need more than 25 unique stamps then I use unary 'z's to count 25's. So, it goes 1907101215y, 1907101215za, 1907101215zb, ... 1907101215zy, 1907101215zza, 1907101215zzb, and so forth. This guarantees that the stamps will always sort alphanumerically in the order they were generated (as long as the next character after the stamp isn't a letter).
It isn't thread-safe, doesn't automatically update the time, and quickly bloats if you need hundreds of stamps, but I find it sufficient for my needs.
/// <summary>
/// Class for generating unique stamps (for filenames, etc.)
/// </summary>
/// <remarks>
/// Each time ToString() is called, a unique stamp is generated.
/// Stamps are guaranteed to sort alphanumerically in order of generation.
/// </remarks>
public class StampGenerator
{
/// <summary>
/// All the characters which could be the last character in the stamp.
/// </summary>
private static readonly char[] _trailingChars =
{
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
'u', 'v', 'w', 'x', 'y'
};
/// <summary>
/// How many valid trailing characters there are.
/// </summary>
/// <remarks>Should always equal _trailingChars.Length</remarks>
public const int TRAILING_RANGE = 25;
/// <summary>
/// Maximum length of the stamp. Hard-coded for laziness.
/// </summary>
public const int MAX_LENGTH_STAMP = 28;
/// <summary>
/// Base portion of the stamp. Will be constant between calls.
/// </summary>
/// <remarks>
/// This is intended to uniquely distinguish between instances.
/// Default behavior is to generate a minute-accurate timestamp.
/// </remarks>
public string StampBase { get; }
/// <summary>
/// Number of times this instance has been called.
/// </summary>
public int CalledTimes { get; private set; }
/// <summary>
/// Maximum number of stamps that can be generated with a given base.
/// </summary>
public int MaxCalls { get; }
/// <summary>
/// Number of stamps remaining for this instance.
/// </summary>
public int RemainingCalls { get { return MaxCalls - CalledTimes; } }
/// <summary>
/// Instantiate a StampGenerator with a specific base.
/// </summary>
/// <param name="stampBase">Base of stamp.</param>
/// <param name="calledTimes">
/// Number of times this base has already been used.
/// </param>
public StampGenerator(string stampBase, int calledTimes = 0)
{
if (stampBase == null)
{
throw new ArgumentNullException("stampBase");
}
else if (Regex.IsMatch(stampBase, "[^a-zA-Z_0-9 \\-]"))
{
throw new ArgumentException("Invalid characters in Stamp Base.",
"stampBase");
}
else if (stampBase.Length >= MAX_LENGTH_STAMP - 1)
{
throw new ArgumentException(
string.Format("Stamp Base too long. (Length {0} out of {1})",
stampBase.Length, MAX_LENGTH_STAMP - 1), "stampBase");
}
else if (calledTimes < 0)
{
throw new ArgumentOutOfRangeException(
"calledTimes", calledTimes, "calledTimes cannot be negative.");
}
else
{
int maxCalls = TRAILING_RANGE * (MAX_LENGTH_STAMP - stampBase.Length);
if (calledTimes >= maxCalls)
{
throw new ArgumentOutOfRangeException(
"calledTimes", calledTimes, string.Format(
"Called Times too large; max for stem of length {0} is {1}.",
stampBase.Length, maxCalls));
}
else
{
StampBase = stampBase;
CalledTimes = calledTimes;
MaxCalls = maxCalls;
}
}
}
/// <summary>
/// Instantiate a StampGenerator with default base string based on time.
/// </summary>
public StampGenerator() : this(DateTime.Now.ToString("yMMddHHmm")) { }
/// <summary>
/// Generate a unique stamp.
/// </summary>
/// <remarks>
/// Stamp values are ordered like this:
/// a, b, ... x, y, za, zb, ... zx, zy, zza, zzb, ...
/// </remarks>
/// <returns>A unique stamp.</returns>
public override string ToString()
{
int zCount = CalledTimes / TRAILING_RANGE;
int trailing = CalledTimes % TRAILING_RANGE;
int length = StampBase.Length + zCount + 1;
if (length > MAX_LENGTH_STAMP)
{
throw new InvalidOperationException(
"Stamp length overflown! Cannot generate new stamps.");
}
else
{
CalledTimes = CalledTimes + 1;
var builder = new StringBuilder(StampBase, length);
builder.Append('z', zCount);
builder.Append(_trailingChars[trailing]);
return builder.ToString();
}
}
}
Old question, I know, but here's what works for me. If multiple threads download files, assign each thread a unique number and prepend it to the filename, e.g. 01_202107210938xxxx
If you want to generate a file name based on some text, such as a DateTime and maybe a GUID, I have made a NuGet package that lets you do this. If you count the number of filenames, you can use that as the seed so that it is truly random. I tried to make it as straightforward and easy to use as possible; here's some code that you can use to generate it:
First install the NuGet package https://www.nuget.org/packages/uniqueit/
Then import it.
Finally, enter the code below:
List<string> list = new List<string>();
list.Add(DateTime.Now.ToString());
list.Add("Some filename or GUID");
int amountoffiles = 5000;
string final_filename = vuniqueit.Identity.GenerateUUID(list, amountoffiles);
I want to include a batch file rename functionality in my application. A user can type a destination filename pattern and (after replacing some wildcards in the pattern) I need to check if it's going to be a legal filename under Windows. I've tried to use regular expression like [a-zA-Z0-9_]+ but it doesn't include many national-specific characters from various languages (e.g. umlauts and so on). What is the best way to do such a check?
From MSDN's "Naming a File or Directory," here are the general conventions for what a legal file name is under Windows:
You may use any character in the current code page (Unicode/ANSI above 127), except:
< > : " / \ | ? *
Characters whose integer representations are 0-31 (less than ASCII space)
Any other character that the target file system does not allow (say, trailing periods or spaces)
Any of the DOS names: CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 (and avoid AUX.txt, etc)
The file name is all periods
Some optional things to check:
File paths (including the file name) that don't use the \\?\ prefix may not have more than 260 characters
Unicode file paths (including the file name) may not have more than 32,000 characters when using \\?\ (note that the prefix may expand directory components and cause the path to overflow the 32,000 limit)
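The character and name rules listed above can be sketched as a single check. This is only an illustration under those stated rules: it validates a bare file name (not a path), it does not check the length limits, and the names WindowsFileNameRules and IsLegalFileName are invented for the example. Non-ASCII letters such as umlauts pass, because only the listed characters are rejected.

```csharp
using System;
using System.Collections.Generic;

static class WindowsFileNameRules
{
    private static readonly char[] Forbidden =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    private static readonly HashSet<string> Reserved = BuildReserved();

    private static HashSet<string> BuildReserved()
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
            { "CON", "PRN", "AUX", "NUL" };
        for (int i = 0; i <= 9; i++)
        {
            set.Add("COM" + i);
            set.Add("LPT" + i);
        }
        return set;
    }

    public static bool IsLegalFileName(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;

        // Control characters (0-31) and the listed punctuation are not allowed.
        foreach (char c in name)
        {
            if (c < 32 || Array.IndexOf(Forbidden, c) >= 0) return false;
        }

        // Trailing period or space; this also rejects names made only of periods.
        char last = name[name.Length - 1];
        if (last == '.' || last == ' ') return false;

        // Reserved DOS device names, with or without an extension (AUX, AUX.txt).
        int dot = name.IndexOf('.');
        string stem = (dot < 0 ? name : name.Substring(0, dot)).TrimEnd(' ');
        return !Reserved.Contains(stem);
    }
}
```

For the batch rename case, you would run each expanded pattern through IsLegalFileName before attempting the rename, and still handle a failure from the file system itself.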
You can get a list of invalid characters from Path.GetInvalidPathChars and GetInvalidFileNameChars.
UPD: See Steve Cooper's suggestion on how to use these in a regular expression.
UPD2: Note that according to the Remarks section in MSDN "The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names." The answer provided by sixlettervaliables goes into more details.
For .Net Frameworks prior to 3.5 this should work:
Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.InvalidPathChars constant;
bool IsValidFilename(string testName)
{
Regex containsABadCharacter = new Regex("["
+ Regex.Escape(new string(System.IO.Path.InvalidPathChars)) + "]");
if (containsABadCharacter.IsMatch(testName)) { return false; };
// other checks for UNC, drive-path format, etc
return true;
}
For .Net Frameworks after 3.0 this should work:
http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidpathchars(v=vs.90).aspx
Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.GetInvalidPathChars() method;
bool IsValidFilename(string testName)
{
Regex containsABadCharacter = new Regex("["
+ Regex.Escape(new string(System.IO.Path.GetInvalidPathChars())) + "]");
if (containsABadCharacter.IsMatch(testName)) { return false; };
// other checks for UNC, drive-path format, etc
return true;
}
Once you know that, you should also check for different formats, eg c:\my\drive and \\server\share\dir\file.ext
Try to use it, and trap for the error. The allowed set may change across file systems, or across different versions of Windows. In other words, if you want know if Windows likes the name, hand it the name and let it tell you.
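One way to sketch this "ask the file system" approach is to try creating a throwaway file with the candidate name. This assumes you have a writable directory on the target file system to probe in. FileNameProbe and CanCreate are hypothetical names. The method also returns false when a file of that name already exists, and it does not stop a name containing directory separators from resolving outside the probe directory.

```csharp
using System;
using System.IO;

static class FileNameProbe
{
    // Returns true if a new file called 'fileName' could be created in 'directory'.
    public static bool CanCreate(string directory, string fileName)
    {
        try
        {
            string path = Path.Combine(directory, fileName);

            // CreateNew fails if the file exists; DeleteOnClose removes the probe file.
            using (new FileStream(path, FileMode.CreateNew, FileAccess.Write,
                                  FileShare.None, 4096, FileOptions.DeleteOnClose))
            {
            }
            return true;
        }
        catch (Exception e) when (e is IOException || e is ArgumentException ||
                                  e is NotSupportedException ||
                                  e is UnauthorizedAccessException)
        {
            return false;
        }
    }
}
```

The result reflects whatever file system the probe directory lives on, which is the point of this approach, so probe in the folder where the renamed files will actually end up.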
This class cleans filenames and paths; use it like
var myCleanPath = PathSanitizer.SanitizeFilename(myBadPath, ' ');
Here's the code;
/// <summary>
/// Cleans paths of invalid characters.
/// </summary>
public static class PathSanitizer
{
/// <summary>
/// The set of invalid filename characters, kept sorted for fast binary search
/// </summary>
private readonly static char[] invalidFilenameChars;
/// <summary>
/// The set of invalid path characters, kept sorted for fast binary search
/// </summary>
private readonly static char[] invalidPathChars;
static PathSanitizer()
{
// set up the two arrays -- sorted once for speed.
invalidFilenameChars = System.IO.Path.GetInvalidFileNameChars();
invalidPathChars = System.IO.Path.GetInvalidPathChars();
Array.Sort(invalidFilenameChars);
Array.Sort(invalidPathChars);
}
/// <summary>
/// Cleans a filename of invalid characters
/// </summary>
/// <param name="input">the string to clean</param>
/// <param name="errorChar">the character which replaces bad characters</param>
/// <returns></returns>
public static string SanitizeFilename(string input, char errorChar)
{
return Sanitize(input, invalidFilenameChars, errorChar);
}
/// <summary>
/// Cleans a path of invalid characters
/// </summary>
/// <param name="input">the string to clean</param>
/// <param name="errorChar">the character which replaces bad characters</param>
/// <returns></returns>
public static string SanitizePath(string input, char errorChar)
{
return Sanitize(input, invalidPathChars, errorChar);
}
/// <summary>
/// Cleans a string of invalid characters.
/// </summary>
/// <param name="input"></param>
/// <param name="invalidChars"></param>
/// <param name="errorChar"></param>
/// <returns></returns>
private static string Sanitize(string input, char[] invalidChars, char errorChar)
{
// null always sanitizes to null
if (input == null) { return null; }
StringBuilder result = new StringBuilder();
foreach (var characterToTest in input)
{
// we binary search for the character in the invalid set. This should be lightning fast.
if (Array.BinarySearch(invalidChars, characterToTest) >= 0)
{
// we found the character in the array of invalid characters, so replace it.
result.Append(errorChar);
}
else
{
// the character was not found in invalid, so it is valid.
result.Append(characterToTest);
}
}
// we're done.
return result.ToString();
}
}
This is what I use:
public static bool IsValidFileName(this string expression, bool platformIndependent)
{
string sPattern = @"^(?!^(PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d|\..*)(\..+)?$)[^\x00-\x1f\\?*:\"";|/]+$";
if (platformIndependent)
{
sPattern = @"^(([a-zA-Z]:|\\)\\)?(((\.)|(\.\.)|([^\\/:\*\?""\|<>\. ](([^\\/:\*\?""\|<>\. ])|([^\\/:\*\?""\|<>]*[^\\/:\*\?""\|<>\. ]))?))\\)*[^\\/:\*\?""\|<>\. ](([^\\/:\*\?""\|<>\. ])|([^\\/:\*\?""\|<>]*[^\\/:\*\?""\|<>\. ]))?$";
}
return (Regex.IsMatch(expression, sPattern, RegexOptions.CultureInvariant));
}
The first pattern creates a regular expression containing the invalid/illegal file names and characters for Windows platforms only. The second one does the same but ensures that the name is legal for any platform.
One corner case to keep in mind, which surprised me when I first found out about it: Windows allows leading space characters in file names! For example, the following are all legal, and distinct, file names on Windows (minus the quotes):
"file.txt"
" file.txt"
"  file.txt"
One takeaway from this: Use caution when writing code that trims leading/trailing whitespace from a filename string.
Simplifying Eugene Katz's answer:
bool IsFileNameCorrect(string fileName){
return !fileName.Any(f=>Path.GetInvalidFileNameChars().Contains(f));
}
Or
bool IsFileNameCorrect(string fileName){
return fileName.All(f=>!Path.GetInvalidFileNameChars().Contains(f));
}
Microsoft Windows: Windows kernel forbids the use of characters in range 1-31 (i.e., 0x01-0x1F) and characters " * : < > ? \ |. Although NTFS allows each path component (directory or filename) to be 255 characters long and paths up to about 32767 characters long, the Windows kernel only supports paths up to 259 characters long. Additionally, Windows forbids the use of the MS-DOS device names AUX, CLOCK$, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, CON, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9, NUL and PRN, as well as these names with any extension (for example, AUX.txt), except when using Long UNC paths (ex. \\.\C:\nul.txt or \\?\D:\aux\con). (In fact, CLOCK$ may be used if an extension is provided.) These restrictions only apply to Windows - Linux, for example, allows use of " * : < > ? \ | even in NTFS.
Source: http://en.wikipedia.org/wiki/Filename
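The device-name rule quoted above can be checked separately from the character rules. Below is a minimal sketch, not a definitive implementation: the class and method names (ReservedNameCheck, IsReservedDeviceName) are hypothetical, and the name list is copied from the quote. It is deliberately conservative, so it also flags CLOCK$ with an extension even though the quote says that case is allowed.

```csharp
using System;
using System.Collections.Generic;

public static class ReservedNameCheck
{
    // Device names copied from the quoted list; matched case-insensitively.
    private static readonly HashSet<string> ReservedNames =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "AUX", "CLOCK$", "CON", "NUL", "PRN",
            "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
        };

    // Returns true when the file name (no directory part) is a reserved
    // device name, with or without an extension.
    public static bool IsReservedDeviceName(string fileName)
    {
        if (string.IsNullOrEmpty(fileName)) { return false; }
        // "AUX.txt" is reserved too, so compare only the part before the first dot.
        int dot = fileName.IndexOf('.');
        string baseName = dot < 0 ? fileName : fileName.Substring(0, dot);
        return ReservedNames.Contains(baseName);
    }
}
```

A caller would typically run this after a character check, for example rejecting a name when ReservedNameCheck.IsReservedDeviceName(name) returns true.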
Rather than explicitly include all possible characters, you could do a regex to check for the presence of illegal characters, and report an error then. Ideally your application should name the files exactly as the user wishes, and only cry foul if it stumbles across an error.
The question is: are you trying to determine whether a path name is a legal Windows path, or whether it's legal on the system where the code is running? I think the latter is more important, so personally, I'd probably decompose the full path and try to use _mkdir to create the directory the file belongs in, then try to create the file.
This way you know not only if the path contains only valid windows characters, but if it actually represents a path that can be written by this process.
I use this to get rid of invalid characters in filenames without throwing exceptions:
private static readonly Regex InvalidFileRegex = new Regex(
string.Format("[{0}]", Regex.Escape(@"<>:""/\|?*")));
public static string SanitizeFileName(string fileName)
{
return InvalidFileRegex.Replace(fileName, string.Empty);
}
Also CON, PRN, AUX, NUL, COM# and a few others are never legal filenames in any directory with any extension.
To complement the other answers, here are a couple of additional edge cases that you might want to consider.
Excel can have problems if you save a workbook in a file whose name contains the '[' or ']' characters. See http://support.microsoft.com/kb/215205 for details.
Sharepoint has a whole additional set of restrictions. See http://support.microsoft.com/kb/905231 for details.
From MSDN, here's a list of characters that aren't allowed:
Use almost any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:
The following reserved characters are not allowed:
< > : " / \ | ? *
Characters whose integer representations are in the range from zero through 31 are not allowed.
Any other character that the target file system does not allow.
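If you want the check to follow this list regardless of what Path.GetInvalidFileNameChars() returns on the current platform, the two explicit rules can be hard-coded. This is only a sketch under that assumption; the names ReservedCharCheck and ContainsReservedOrControlChar are made up for illustration, and it does not cover the last rule (characters rejected by the target file system).

```csharp
using System;

public static class ReservedCharCheck
{
    // The reserved characters from the list above: < > : " / \ | ? *
    private static readonly char[] ReservedChars =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    // Returns true if the name contains a reserved character or a
    // character whose integer value is in the range 0 through 31.
    public static bool ContainsReservedOrControlChar(string name)
    {
        if (name == null) { throw new ArgumentNullException(nameof(name)); }
        foreach (char c in name)
        {
            if (c < 32 || Array.IndexOf(ReservedChars, c) >= 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

Note that characters such as ♫ and ♥ pass this check, which matches the list: they are not reserved on Windows, even if another target (a CD file system, for example) may reject them.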
This is an already answered question, but just for the sake of "Other options", here's a non-ideal one:
(non-ideal because using Exceptions as flow control is a "Bad Thing", generally)
public static bool IsLegalFilename(string name)
{
try
{
var fileInfo = new FileInfo(name);
return true;
}
catch
{
return false;
}
}
Also the destination file system is important.
Under NTFS, some files cannot be created in specific directories.
E.G. $Boot in root
Regular expressions are overkill for this situation. You can use the String.IndexOfAny() method in combination with Path.GetInvalidPathChars() and Path.GetInvalidFileNameChars().
Also note that both Path.GetInvalidXXX() methods clone an internal array and return the clone. So if you're going to be doing this a lot (thousands and thousands of times) you can cache a copy of the invalid chars array for reuse.
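A minimal sketch of that suggestion, assuming a hypothetical helper class named CachedInvalidChars: the arrays are fetched once into static fields and reused by String.IndexOfAny().

```csharp
using System.IO;

public static class CachedInvalidChars
{
    // Fetched once; calling the Path methods repeatedly would allocate a new array each time.
    private static readonly char[] InvalidFileNameChars = Path.GetInvalidFileNameChars();
    private static readonly char[] InvalidPathChars = Path.GetInvalidPathChars();

    // True if the file name contains any character the current platform reports as invalid.
    public static bool HasInvalidFileNameChars(string fileName)
    {
        return fileName.IndexOfAny(InvalidFileNameChars) >= 0;
    }

    // True if the path contains any character the current platform reports as invalid.
    public static bool HasInvalidPathChars(string path)
    {
        return path.IndexOfAny(InvalidPathChars) >= 0;
    }
}
```

The result depends on the platform the code runs on, since the Path methods return a different set on Windows than on Unix-like systems.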
Many of these answers will not work if the filename is too long and the code is running on a pre-Windows 10 environment. Similarly, think about what you want to do with periods: a leading or trailing period is technically valid, but can cause problems if you do not want the file to be difficult to see or delete, respectively.
This is a validation attribute I created to check for a valid filename.
public class ValidFileNameAttribute : ValidationAttribute
{
public ValidFileNameAttribute()
{
RequireExtension = true;
ErrorMessage = "{0} is an Invalid Filename";
MaxLength = 255; // superseded in modern Windows environments
}
public override bool IsValid(object value)
{
//http://stackoverflow.com/questions/422090/in-c-sharp-check-that-filename-is-possibly-valid-not-that-it-exists
var fileName = (string)value;
if (string.IsNullOrEmpty(fileName)) { return true; }
if (fileName.IndexOfAny(Path.GetInvalidFileNameChars()) > -1 ||
(!AllowHidden && fileName[0] == '.') ||
fileName[fileName.Length - 1]== '.' ||
fileName.Length > MaxLength)
{
return false;
}
string extension = Path.GetExtension(fileName);
return (!RequireExtension || extension != string.Empty)
&& (ExtensionList==null || ExtensionList.Contains(extension));
}
private const string _sepChar = ",";
private IEnumerable<string> ExtensionList { get; set; }
public bool AllowHidden { get; set; }
public bool RequireExtension { get; set; }
public int MaxLength { get; set; }
public string AllowedExtensions {
get { return string.Join(_sepChar, ExtensionList); }
set {
if (string.IsNullOrEmpty(value))
{ ExtensionList = null; }
else {
ExtensionList = value.Split(new char[] { _sepChar[0] })
.Select(s => s[0] == '.' ? s : ('.' + s))
.ToList();
}
} }
public override bool RequiresValidationContext => false;
}
and the tests
[TestMethod]
public void TestFilenameAttribute()
{
var rxa = new ValidFileNameAttribute();
Assert.IsFalse(rxa.IsValid("pptx."));
Assert.IsFalse(rxa.IsValid("pp.tx."));
Assert.IsFalse(rxa.IsValid("."));
Assert.IsFalse(rxa.IsValid(".pp.tx"));
Assert.IsFalse(rxa.IsValid(".pptx"));
Assert.IsFalse(rxa.IsValid("pptx"));
Assert.IsFalse(rxa.IsValid("a/abc.pptx"));
Assert.IsFalse(rxa.IsValid("a\\abc.pptx"));
Assert.IsFalse(rxa.IsValid("c:abc.pptx"));
Assert.IsFalse(rxa.IsValid("c<abc.pptx"));
Assert.IsTrue(rxa.IsValid("abc.pptx"));
rxa = new ValidFileNameAttribute { AllowedExtensions = ".pptx" };
Assert.IsFalse(rxa.IsValid("abc.docx"));
Assert.IsTrue(rxa.IsValid("abc.pptx"));
}
If you're only trying to check if a string holding your file name/path has any invalid characters, the fastest method I've found is to use Split() to break up the file name into an array of parts wherever there's an invalid character. If the result is only an array of 1, there are no invalid characters. :-)
var nameToTest = "Best file name \"ever\".txt";
bool isInvalidName = nameToTest.Split(System.IO.Path.GetInvalidFileNameChars()).Length > 1;
var pathToTest = "C:\\My Folder <secrets>\\";
bool isInvalidPath = pathToTest.Split(System.IO.Path.GetInvalidPathChars()).Length > 1;
I tried running this and other methods mentioned above on a file/path name 1,000,000 times in LinqPad.
Using Split() is only ~850ms.
Using Regex("[" + Regex.Escape(new string(System.IO.Path.GetInvalidPathChars())) + "]") is around 6 seconds.
The more complicated regular expressions fare MUCH worse, as do some of the other options, like using the various methods on the Path class to get the file name and letting their internal validation do the job (most likely due to the overhead of exception handling).
Granted, it's not very often that you need to validate 1 million file names, so a single iteration is fine for most of these methods anyway. But it's still pretty efficient and effective if you're only looking for invalid characters.
I got this idea from someone, though I don't know who. Let the OS do the heavy lifting.
public bool IsPathFileNameGood(string fname)
{
bool rc = Constants.Fail;
try
{
this._stream = new StreamWriter(fname, true);
rc = Constants.Pass;
}
catch (Exception ex)
{
MessageBox.Show(ex.Message, "Problem opening file");
rc = Constants.Fail;
}
return rc;
}
Windows filenames are pretty unrestrictive, so really it might not even be that much of an issue. The characters that are disallowed by Windows are:
\ / : * ? " < > |
You could easily write an expression to check if those characters are present. A better solution though would be to try and name the files as the user wants, and alert them when a filename doesn't stick.
I suggest just using Path.GetFullPath():
string targetFileFullNameToBeChecked; // assign the path to validate before the call
try
{
Path.GetFullPath(targetFileFullNameToBeChecked);
}
catch (ArgumentException ex)
{
// invalid chars found
}
My attempt:
using System.IO;
static class PathUtils
{
public static string IsValidFullPath([NotNull] string fullPath)
{
if (string.IsNullOrWhiteSpace(fullPath))
return "Path is null, empty or white space.";
bool pathContainsInvalidChars = fullPath.IndexOfAny(Path.GetInvalidPathChars()) != -1;
if (pathContainsInvalidChars)
return "Path contains invalid characters.";
string fileName = Path.GetFileName(fullPath);
if (fileName == "")
return "Path must contain a file name.";
bool fileNameContainsInvalidChars = fileName.IndexOfAny(Path.GetInvalidFileNameChars()) != -1;
if (fileNameContainsInvalidChars)
return "File name contains invalid characters.";
if (!Path.IsPathRooted(fullPath))
return "The path must be absolute.";
return "";
}
}
This is not perfect because Path.GetInvalidPathChars does not return the complete set of characters that are invalid in file and directory names and of course there's plenty more subtleties.
So I use this method as a complement:
public static bool TestIfFileCanBeCreated([NotNull] string fullPath)
{
if (string.IsNullOrWhiteSpace(fullPath))
throw new ArgumentException("Value cannot be null or whitespace.", "fullPath");
string directoryName = Path.GetDirectoryName(fullPath);
if (directoryName != null) Directory.CreateDirectory(directoryName);
try
{
using (new FileStream(fullPath, FileMode.CreateNew)) { }
File.Delete(fullPath);
return true;
}
catch (IOException)
{
return false;
}
}
It tries to create the file and returns false if there is an exception. Of course, I need to create the file, but I think it's the safest way to do that. Please also note that I am not deleting directories that have been created.
You can also use the first method to do basic validation, and then handle carefully the exceptions when the path is used.
This check
static bool IsValidFileName(string name)
{
return
!string.IsNullOrWhiteSpace(name) &&
name.IndexOfAny(Path.GetInvalidFileNameChars()) < 0 &&
!Path.GetFullPath(name).StartsWith(@"\\.\");
}
filters out names with invalid chars (<>:"/\|?* and ASCII 0-31), as well as reserved DOS devices (CON, NUL, COMx). It allows leading spaces and all-dot-names, consistent with Path.GetFullPath. (Creating file with leading spaces succeeds on my system).
Used .NET Framework 4.7.1, tested on Windows 7.
A one-liner for checking a string for illegal chars:
public static bool IsValidFilename(string testName) => !Regex.IsMatch(testName, "[" + Regex.Escape(new string(System.IO.Path.InvalidPathChars)) + "]");
In my opinion, the only proper answer to this question is to try to use the path and let the OS and filesystem validate it. Otherwise you are just reimplementing (and probably poorly) all the validation rules that the OS and filesystem already use and if those rules are changed in the future you will have to change your code to match them.