How to use .Contains in LINQ using C# - c#

I am trying to find a substring of states in a file. I open the file and load each line a string at a time. I would then like to check if each string contains one of the states in my substring. It is not working as intended as it keeps returning "Could not find substring" even though I know that the states are in the string. What am I doing wrong?
EDIT: I realise now what the error, this line was completely wrong:
if (lines.Any(stringToCheck.Contains))
It should be like this:
if (stringToCheck.Any(s.Contains))
Thanks for the help guys.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace ConsoleApplication2
{
class Program
{
static void Main (string[] args)
{
string[] stringToCheck = {"Alabama","Alaska","Arizona","Arkansas","California","Colorado"};
string[] lines = File.ReadAllLines(#"C:\C# Project\sampledata.dat");
foreach (string s in lines)
{
if (lines.Any(stringToCheck.Contains))
{
Console.WriteLine("Found substring");
Console.WriteLine(s);
Console.ReadLine();
}
else
Console.WriteLine("Could not find substring");
Console.WriteLine(s);
Console.ReadLine() ;
}
}
}
}

You could use Any over the list of states to check if there is any string for each state which contains on the line. For sample:
if (stringToCheck.Any(x = > s.Contains(x))
{
// ...
}

You could do something like this:
string[] stringToCheck = { "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado" };
//Some test lines
string[] lines = { "sadnaskjd Alabama", "sadasd Arizona", "asdasdaer" };
//A bool telling me if I found anything,
//you could skip the bool and just use else in the foreach :)
bool contains = false;
foreach ( var check in stringToCheck )
{
if ( lines.Any( l => l.Contains( check ) ) )
{
Console.WriteLine( "Found substring" );
Console.WriteLine( s );
contains = true;
}
}
if ( contains == false )
{
Console.WriteLine("Could not find substring");
Console.ReadLine();
}

Try this idea:
if (line.Any(x=>CheckKeyWordsInLine(line, stringToCheck)){
//The line contains one of the keywords
}else{
//The line does not contain any of the keywords
}
Then define the CheckKeyWordsInLine function as:
bool CheckKeyWordsInLine(string line, string[] keywords){
foreach(var key in keywords)
{
if(line.Contains(key))
{
return true;
}
}
return false;
}

you could do something like this:
if(stringToCheck.Any(e => s.Contains(e)){
//
}

Related

How do I parse and find a string in a C# script?

I've been trying to parse and search for a specific word in a big string, but I can't seem to be able to figure it out. I have created a script that connects a Twitch Channel's chat into unity.
An example of a message would be:
"#badge-info=subscriber/4;badges=moderator/1,subscriber/3,bits/1;bits=1;color=;display-name=TwitchUser1234;emotes=;flags=;id=da6ec4c6-af61-4346-abc-123456789;mod=1;room-id=12345678;subscriber=1;tmi-sent-ts=160987654321;turbo=0;user-id=123456789;user-type=mod :TwitchUser1234#TwitchUser1234.tmi.twitch.tv PRIVMSG #thechannelyouarewatching :PogChamp1 Another Test Bit"
I tried parsing and searching for the string 'bits' the message by doing:
private void GameInputs(string ChatInputs)
{
string Search;
Search = ChatInputs.Split(";", "=");
if(string "bits" in Search)
{
print("I made it here.");
}
}
I'm at a complete loss and have no idea how to do this. Any help is appreciated.
If my full code is needed it is:
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using System;
using System.ComponentModel;
using System.Net.Sockets;
using System.IO;
public class TwitchChat : MonoBehaviour
{
private TcpClient twitchClient;
private StreamReader reader;
private StreamWriter writer;
public string username, password, channelName; // http://twitchapps.com/tmi
// Start is called before the first frame update
void Start()
{
Connect();
}
void Update()
{
if(!twitchClient.Connected)
{
Connect();
}
ReadChat();
}
private void Connect()
{
twitchClient = new TcpClient("irc.chat.twitch.tv", 6667);
reader = new StreamReader(twitchClient.GetStream());
writer = new StreamWriter(twitchClient.GetStream());
writer.WriteLine("PASS " + password);
writer.WriteLine("NICK " + username);
writer.WriteLine("USER " + username + " 8 * :" + username);
writer.WriteLine("JOIN #" + channelName);
writer.WriteLine("CAP REQ :twitch.tv/tags");
writer.Flush();
}
private void ReadChat()
{
if (twitchClient.Available > 0)
{
var message = reader.ReadLine();
print(message);
GameInputs(message);
}
}
private void GameInputs(string ChatInputs)
{
string Search;
Search = ChatInputs.Split(";", "=");
if(string "bits" in Search)
{
print("I made it here.");
}
}
}
If you want to pull the value of "bits=xx" out, this would do it:
var b = value.Split(';').FirstOrDefault(s => s.StartsWith("bits="))?[5..];
b will be null if "bits=" is not present
If you're going to parse a lot of values out of this string consider turning it into a dictionary:
var c = new []{'='};
var d = value.Split(';').ToDictionary(s => s.Split(c,2)[0], s => s.Split(c,2)[1]);
It's slightly inefficient to split twice, if it bothers you, you can sub string:
value.Split(';').ToDictionary(s => s[..s.IndexOf('=')], s => s[s.IndexOf('=')+1..]);
This gives a dictionary of string, so you can do like:
if(d.ContainsKey("bits")){
var bits = int.Parse(d["bits"]);
...
String has a method Contains(string) that does the job:
if (ChantInputs.Contains("bits")
{
print("I made it here.");
}
You can try below.
private void GameInputs(string ChatInputs)
{
string[] Search = ChatInputs.Split(new char[] { ';', '=' });
foreach(string s in Search)
{
if(s == "bits")
{
print("I made it here.");
}
}
}
Below is the working code.
using System;
namespace ConsoleApp3
{
class Program
{
static void Main(string[] args)
{
string str = "one;two;test;three;test+test";
string[] strs = str.Split(new char[] { ';', '+' });
foreach(string s in strs)
{
if(s == "test")
{
Console.WriteLine(s);
}
}
Console.ReadLine();
}
}
}

C# WINWORD.exe won't exit - Sort Directory Files assending

I'm not so expert on C# but i tried to do my best ,
this is a small console application that should get 2 arg2 ( by passing args to the exe or by console input) i have 2 issue with it and i can't find any other solution
Merging the files is not in the correct order, if the files name has
letters included. ex.
(1.docx , 2.docx , 3.docx ) it work => result.docx(1,2,3)
(1test.docx , 2rice.docx , 3john.docx ) => result.docx(3,1,2)
Can't get WINWORD.exe to close after its the appliaction is
completed
PS: This exe is being called by PHP line CMD.EXE
i tried all possible commands to release com objects then close application ,
what is my mistake here? how to optimize the code correctly?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Office.Interop.Word;
using System.Reflection;
using System.Runtime.InteropServices;
using System.IO;
namespace DocEngine
{
public class Parameter
{
public string Name;
public string Value;
// Note we need to give a default constructor when override it
public Parameter()
{
}
}
class Program
{
public static void MergeDocuments(string fileName, List<string> documentFiles)
{
_Application oWord = new Microsoft.Office.Interop.Word.Application();
_Document oDoc = oWord.Documents.Add();
Selection oSelection = oWord.Selection;
foreach (string documentFile in documentFiles)
{
_Document oCurrentDocument = oWord.Documents.Add(documentFile);
CopyPageSetup(oCurrentDocument.PageSetup, oDoc.Sections.Last.PageSetup);
oCurrentDocument.Range().Copy();
oSelection.PasteAndFormat(WdRecoveryType.wdFormatOriginalFormatting);
if (!Object.ReferenceEquals(documentFile, documentFiles.Last()))
oSelection.InsertBreak(WdBreakType.wdSectionBreakNextPage);
}
oDoc.SaveAs(fileName, WdSaveFormat.wdFormatDocumentDefault);
oDoc.Close();
//TODO: release objects, close word application
//Marshal.ReleaseComObject(oSelection);
//Marshal.ReleaseComObject(oDoc);
//Marshal.ReleaseComObject(oWord);
Marshal.FinalReleaseComObject(oSelection);
Marshal.FinalReleaseComObject(oDoc);
Marshal.FinalReleaseComObject(oWord);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(oWord);
System.Runtime.InteropServices.Marshal.ReleaseComObject(oSelection);
System.Runtime.InteropServices.Marshal.ReleaseComObject(oDoc);
System.Runtime.InteropServices.Marshal.ReleaseComObject(oWord);
oSelection = null;
oDoc = null;
oWord = null;
// A good idea depending on who you talk to...
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();
}
public static void CopyPageSetup(PageSetup source, PageSetup target)
{
target.PaperSize = source.PaperSize;
//target.Orientation = source.Orientation; //not working in word 2003, so here is another way
if (!source.Orientation.Equals(target.Orientation))
target.TogglePortrait();
target.TopMargin = source.TopMargin;
target.BottomMargin = source.BottomMargin;
target.RightMargin = source.RightMargin;
target.LeftMargin = source.LeftMargin;
target.FooterDistance = source.FooterDistance;
target.HeaderDistance = source.HeaderDistance;
target.LayoutMode = source.LayoutMode;
}
static void Main(string[] args)
{
if (args == null || args.Length == 0)
{
Console.WriteLine("Enter Path:");
Parameter parameter1 = new Parameter
{
Name = "dir",
Value = Console.ReadLine()
};
Console.WriteLine("FileName without ext:");
Parameter parameter2 = new Parameter
{
Name = "fileName",
Value = Console.ReadLine()
};
Console.WriteLine("Thank you! ");
Console.WriteLine("Test parameter1: [{0}] = [{1}]", parameter1.Name, parameter1.Value);
Console.WriteLine("Test parameter2: [{0}] = [{1}]", parameter2.Name, parameter2.Value);
try
{
List<string> result = Directory.EnumerateFiles(parameter1.Value, "*.doc", SearchOption.AllDirectories).Union(Directory.EnumerateFiles(parameter1.Value, "*.docx", SearchOption.AllDirectories)).ToList();
var filename = Path.Combine(parameter1.Value, parameter2.Value);
MergeDocuments(filename, result);
}
catch (UnauthorizedAccessException UAEx)
{
Console.WriteLine(UAEx.Message);
}
catch (PathTooLongException PathEx)
{
Console.WriteLine(PathEx.Message);
}
}
else
{
//args = new string[2];
string sourceDirectory = args[0];
string filename1 = args[1];
try
{
List<string> result = Directory.EnumerateFiles(sourceDirectory, "*.doc", SearchOption.AllDirectories).Union(Directory.EnumerateFiles(sourceDirectory, "*.docx", SearchOption.AllDirectories)).ToList();
var filename = Path.Combine(sourceDirectory, filename1);
MergeDocuments(filename, result);
}
catch (UnauthorizedAccessException UAEx)
{
Console.WriteLine(UAEx.Message);
}
catch (PathTooLongException PathEx)
{
Console.WriteLine(PathEx.Message);
}
}
}
}
}
You don't need to make any of those COM calls or explicit GC calls, and you don't need to explicitly set things to null. You just need to call Application.Quit.

Fortify shows critical vulnerability File.Delete() operation C#

The following code always shows path manipulation problem. How to resolve it ?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;
namespace PathManipulation
{
class Program
{
public string dir = null;
public void someFunction(string fileName)
{
// File.Delete(Regex.Replace(dir + fileName, #"\..\", String.Empty));
if (!(dir.IndexOf("//") >= 0) || !Regex.IsMatch(dir, "System32"))
{
String p = Regex.Replace(dir, #"..\", string.Empty);
DirectoryInfo di = new DirectoryInfo(p);
FileInfo[] fi = di.GetFiles();
if (fi.Length > 0)
{
for (int i = 0; i < fi.Length; i++)
{
if (fi[i].ToString().Equals(fileName))
{
Console.WriteLine(fi[i].ToString());
fi[i].Delete();
}
}
File.Delete(dir + fileName);
}
}
else
{
return;
}
}
static void Main(string[] args)
{
Program p = new Program();
p.dir = args[0];
p.someFunction(args[1]);
}
}
}
Yes, you break the flow of data so that the end user is not able to specify the file to be deleted.
For instance:
public void someFunction(int fileIndex){
...
if (fileIndex == 0){
File.Delete( "puppies.txt" );
}
else if (fileIndex == 1){
File.Delete( "kittens.txt" );
}
else {
throw new IllegalArgumentException( "Invalid delete index" );
}
}
That's an extreme way to solve the problem, but it does not allow the end user to delete ANYTHING the developer didn't intend.
Your data validation check:
if (!(dir.IndexOf("//") >= 0) || !Regex.IsMatch(dir, "System32"))
is weak. This is referred to as "blacklisting" and the attacker only has to figure out a pattern that is missed by your checks. So, #"C:\My Documents" for instance.
Instead, you should consider a "whitelisting" approach. Take a look at https://www.owasp.org/index.php/Data_Validation#Accept_known_good for a pretty thorough example. It doesn't address path injection directly. You simply have to think hard about what files/directories you expect to receive. Throw an error if the input deviates from that. With a little testing you will create a good whitelist.

String Matching

I have a string
String mainString="///BUY/SELL///ORDERTIME///RT///QTY///BROKERAGE///NETRATE///AMOUNTRS///RATE///SCNM///";
Now I have another strings
String str1= "RT";
which should be matched only with RT which is substring of string mainString but not with ORDERTIME which is also substring of string mainString.
String str2= "RATE" ;
And RATE(str2) should be matched with RATE which is substring of string mainString but not with NETRATE which is also substring of string mainString.
How can we do that ?
Match against "///RT///" and "///RATE///".
This might give you some clues - no where near real code quality, and only a 5 minute job to come with this shoddy solution but does do what you need. it smells lots be warned.
using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
namespace test {
class Program {
static void Main(string[] args) {
String mainString="//BUY/SELL//ORDERTIME//RT//QTY//BROKERAGE//NETRATE//AMOUNTRS//RATE//SCNM//";
Hashtable ht = createHashTable(mainString);
if (hasValue("RA", ht)) {
Console.WriteLine("Matched RA");
} else {
Console.WriteLine("Didnt Find RA");
}
if (hasValue("RATE", ht)) {
Console.WriteLine("Matched RATE");
}
Console.Read();
}
public static Hashtable createHashTable(string strToSplit) {
Hashtable ht = new Hashtable();
int iCount = 0;
string[] words = strToSplit.Split(new Char[] { '/', '/', '/' });
foreach (string word in words) {
ht.Add(iCount++, word);
}
return ht;
}
public static bool hasValue(string strValuetoSearch, Hashtable ht) {
return ht.ContainsValue(strValuetoSearch);
}
}
}
as far as I understand your question you want to match a string between /// as delimiters.
if you look for str you just have to do
Regex.Match(mainString, "(^|///)" + str + "(///|$)");
I don't know it will work every time or not.But I have tried this and it works right now in this string matching. I want to know whether this is ok or not,please give me suggestion.
str1 = str1.Insert(0, "///");
str1=str1.Insert(str1.Length,"///");
bool Result = mainString.Contains(str1);
What about Linq to Object?
String mainString="///BUY/SELL///ORDERTIME///RT///QTY///BROKERAGE///NETRATE///AMOUNTRS///RATE///SCNM///";
String searchTerm = "RT";
String[] src = mainString.split('///');
var match = from word in src where
word.ToLowerInvariant() == searchTerm.ToLowerInvariant()
select word;
I don't have VS near me so I can't test it, but it should be something similar to this.

C# Sanitize File Name

I recently have been moving a bunch of MP3s from various locations into a repository. I had been constructing the new file names using the ID3 tags (thanks, TagLib-Sharp!), and I noticed that I was getting a System.NotSupportedException:
"The given path's format is not supported."
This was generated by either File.Copy() or Directory.CreateDirectory().
It didn't take long to realize that my file names needed to be sanitized. So I did the obvious thing:
public static string SanitizePath_(string path, char replaceChar)
{
string dir = Path.GetDirectoryName(path);
foreach (char c in Path.GetInvalidPathChars())
dir = dir.Replace(c, replaceChar);
string name = Path.GetFileName(path);
foreach (char c in Path.GetInvalidFileNameChars())
name = name.Replace(c, replaceChar);
return dir + name;
}
To my surprise, I continued to get exceptions. It turned out that ':' is not in the set of Path.GetInvalidPathChars(), because it is valid in a path root. I suppose that makes sense - but this has to be a pretty common problem. Does anyone have some short code that sanitizes a path? The most thorough I've come up with this, but it feels like it is probably overkill.
// replaces invalid characters with replaceChar
public static string SanitizePath(string path, char replaceChar)
{
// construct a list of characters that can't show up in filenames.
// need to do this because ":" is not in InvalidPathChars
if (_BadChars == null)
{
_BadChars = new List<char>(Path.GetInvalidFileNameChars());
_BadChars.AddRange(Path.GetInvalidPathChars());
_BadChars = Utility.GetUnique<char>(_BadChars);
}
// remove root
string root = Path.GetPathRoot(path);
path = path.Remove(0, root.Length);
// split on the directory separator character. Need to do this
// because the separator is not valid in a filename.
List<string> parts = new List<string>(path.Split(new char[]{Path.DirectorySeparatorChar}));
// check each part to make sure it is valid.
for (int i = 0; i < parts.Count; i++)
{
string part = parts[i];
foreach (char c in _BadChars)
{
part = part.Replace(c, replaceChar);
}
parts[i] = part;
}
return root + Utility.Join(parts, Path.DirectorySeparatorChar.ToString());
}
Any improvements to make this function faster and less baroque would be much appreciated.
To clean up a file name you could do this
private static string MakeValidFileName( string name )
{
string invalidChars = System.Text.RegularExpressions.Regex.Escape( new string( System.IO.Path.GetInvalidFileNameChars() ) );
string invalidRegStr = string.Format( #"([{0}]*\.+$)|([{0}]+)", invalidChars );
return System.Text.RegularExpressions.Regex.Replace( name, invalidRegStr, "_" );
}
A shorter solution:
var invalids = System.IO.Path.GetInvalidFileNameChars();
var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries) ).TrimEnd('.');
Based on Andre's excellent answer but taking into account Spud's comment on reserved words, I made this version:
/// <summary>
/// Strip illegal chars and reserved words from a candidate filename (should not include the directory path)
/// </summary>
/// <remarks>
/// http://stackoverflow.com/questions/309485/c-sharp-sanitize-file-name
/// </remarks>
public static string CoerceValidFileName(string filename)
{
var invalidChars = Regex.Escape(new string(Path.GetInvalidFileNameChars()));
var invalidReStr = string.Format(#"[{0}]+", invalidChars);
var reservedWords = new []
{
"CON", "PRN", "AUX", "CLOCK$", "NUL", "COM0", "COM1", "COM2", "COM3", "COM4",
"COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
"LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
};
var sanitisedNamePart = Regex.Replace(filename, invalidReStr, "_");
foreach (var reservedWord in reservedWords)
{
var reservedWordPattern = string.Format("^{0}\\.", reservedWord);
sanitisedNamePart = Regex.Replace(sanitisedNamePart, reservedWordPattern, "_reservedWord_.", RegexOptions.IgnoreCase);
}
return sanitisedNamePart;
}
And these are my unit tests
[Test]
public void CoerceValidFileName_SimpleValid()
{
var filename = #"thisIsValid.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual(filename, result);
}
[Test]
public void CoerceValidFileName_SimpleInvalid()
{
var filename = #"thisIsNotValid\3\\_3.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("thisIsNotValid_3__3.txt", result);
}
[Test]
public void CoerceValidFileName_InvalidExtension()
{
var filename = #"thisIsNotValid.t\xt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("thisIsNotValid.t_xt", result);
}
[Test]
public void CoerceValidFileName_KeywordInvalid()
{
var filename = "aUx.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("_reservedWord_.txt", result);
}
[Test]
public void CoerceValidFileName_KeywordValid()
{
var filename = "auxillary.txt";
var result = PathHelper.CoerceValidFileName(filename);
Assert.AreEqual("auxillary.txt", result);
}
string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars()));
there are a lot of working solutions here. just for the sake of completeness, here's an approach that doesn't use regex, but uses LINQ:
var invalids = Path.GetInvalidFileNameChars();
filename = invalids.Aggregate(filename, (current, c) => current.Replace(c, '_'));
Also, it's a very short solution ;)
I'm using the System.IO.Path.GetInvalidFileNameChars() method to check invalid characters and I've got no problems.
I'm using the following code:
foreach( char invalidchar in System.IO.Path.GetInvalidFileNameChars())
{
filename = filename.Replace(invalidchar, '_');
}
I wanted to retain the characters in some way, not just simply replace the character with an underscore.
One way I thought was to replace the characters with similar looking characters which are (in my situation), unlikely to be used as regular characters. So I took the list of invalid characters and found look-a-likes.
The following are functions to encode and decode with the look-a-likes.
This code does not include a complete listing for all System.IO.Path.GetInvalidFileNameChars() characters. So it is up to you to extend or utilize the underscore replacement for any remaining characters.
private static Dictionary<string, string> EncodeMapping()
{
//-- Following characters are invalid for windows file and folder names.
//-- \/:*?"<>|
Dictionary<string, string> dic = new Dictionary<string, string>();
dic.Add(#"\", "Ì"); // U+OOCC
dic.Add("/", "Í"); // U+OOCD
dic.Add(":", "¦"); // U+00A6
dic.Add("*", "¤"); // U+00A4
dic.Add("?", "¿"); // U+00BF
dic.Add(#"""", "ˮ"); // U+02EE
dic.Add("<", "«"); // U+00AB
dic.Add(">", "»"); // U+00BB
dic.Add("|", "│"); // U+2502
return dic;
}
public static string Escape(string name)
{
foreach (KeyValuePair<string, string> replace in EncodeMapping())
{
name = name.Replace(replace.Key, replace.Value);
}
//-- handle dot at the end
if (name.EndsWith(".")) name = name.CropRight(1) + "°";
return name;
}
public static string UnEscape(string name)
{
foreach (KeyValuePair<string, string> replace in EncodeMapping())
{
name = name.Replace(replace.Value, replace.Key);
}
//-- handle dot at the end
if (name.EndsWith("°")) name = name.CropRight(1) + ".";
return name;
}
You can select your own look-a-likes. I used the Character Map app in windows to select mine %windir%\system32\charmap.exe
As I make adjustments through discovery, I will update this code.
I think the problem is that you first call Path.GetDirectoryName on the bad string. If this has non-filename characters in it, .Net can't tell which parts of the string are directories and throws. You have to do string comparisons.
Assuming it's only the filename that is bad, not the entire path, try this:
public static string SanitizePath(string path, char replaceChar)
{
int filenamePos = path.LastIndexOf(Path.DirectorySeparatorChar) + 1;
var sb = new System.Text.StringBuilder();
sb.Append(path.Substring(0, filenamePos));
for (int i = filenamePos; i < path.Length; i++)
{
char filenameChar = path[i];
foreach (char c in Path.GetInvalidFileNameChars())
if (filenameChar.Equals(c))
{
filenameChar = replaceChar;
break;
}
sb.Append(filenameChar);
}
return sb.ToString();
}
I have had success with this in the past.
Nice, short and static :-)
public static string returnSafeString(string s)
{
foreach (char character in Path.GetInvalidFileNameChars())
{
s = s.Replace(character.ToString(),string.Empty);
}
foreach (char character in Path.GetInvalidPathChars())
{
s = s.Replace(character.ToString(), string.Empty);
}
return (s);
}
Here's an efficient lazy loading extension method based on Andre's code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace LT
{
public static class Utility
{
static string invalidRegStr;
public static string MakeValidFileName(this string name)
{
if (invalidRegStr == null)
{
var invalidChars = System.Text.RegularExpressions.Regex.Escape(new string(System.IO.Path.GetInvalidFileNameChars()));
invalidRegStr = string.Format(#"([{0}]*\.+$)|([{0}]+)", invalidChars);
}
return System.Text.RegularExpressions.Regex.Replace(name, invalidRegStr, "_");
}
}
}
Your code would be cleaner if you appended the directory and filename together and sanitized that rather than sanitizing them independently. As for sanitizing away the :, just take the 2nd character in the string. If it is equal to "replacechar", replace it with a colon. Since this app is for your own use, such a solution should be perfectly sufficient.
using System;
using System.IO;
using System.Linq;
using System.Text;
public class Program
{
public static void Main()
{
try
{
var badString = "ABC\\DEF/GHI<JKL>MNO:PQR\"STU\tVWX|YZA*BCD?EFG";
Console.WriteLine(badString);
Console.WriteLine(SanitizeFileName(badString, '.'));
Console.WriteLine(SanitizeFileName(badString));
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
private static string SanitizeFileName(string fileName, char? replacement = null)
{
if (fileName == null) { return null; }
if (fileName.Length == 0) { return ""; }
var sb = new StringBuilder();
var badChars = Path.GetInvalidFileNameChars().ToList();
foreach (var #char in fileName)
{
if (badChars.Contains(#char))
{
if (replacement.HasValue)
{
sb.Append(replacement.Value);
}
continue;
}
sb.Append(#char);
}
return sb.ToString();
}
}
Based #fiat's and #Andre's approach, I'd like to share my solution too.
Main difference:
its an extension method
regex is compiled at first use to save some time with a lot executions
reserved words are preserved
public static class StringPathExtensions
{
private static Regex _invalidPathPartsRegex;
static StringPathExtensions()
{
var invalidReg = System.Text.RegularExpressions.Regex.Escape(new string(Path.GetInvalidFileNameChars()));
_invalidPathPartsRegex = new Regex($"(?<reserved>^(CON|PRN|AUX|CLOCK\\$|NUL|COM0|COM1|COM2|COM3|COM4|COM5|COM6|COM7|COM8|COM9|LPT0|LPT1|LPT2|LPT3|LPT4|LPT5|LPT6|LPT7|LPT8|LPT9))|(?<invalid>[{invalidReg}:]+|\\.$)", RegexOptions.Compiled);
}
public static string SanitizeFileName(this string path)
{
return _invalidPathPartsRegex.Replace(path, m =>
{
if (!string.IsNullOrWhiteSpace(m.Groups["reserved"].Value))
return string.Concat("_", m.Groups["reserved"].Value);
return "_";
});
}
}

Categories

Resources