Remove specific characters except last

Remove specific characters except last - c#

I have a text string and I want to replace the dots with underscores except for the last character found in the string.
Example:
input = "video.coffee.example.mp4"
result = "video_coffe_example.mp4"
I have a code but this replaces everything including the last character
first option failed
static string replaceForUnderScore(string file)
{
return file = file.Replace(".", "_");
}
I implemented a second option that works for me but I find that it is very extensive and not very optimized
static string replaceForUnderScore(string file)
{
string result = "";
var splits = file.Split(".");
var extension = splits.LastOrDefault();
splits = splits.Take(splits.Count() - 1).ToArray();
foreach (var strItem in splits)
{
result = result + "_" + strItem;
}
result = result.Substring(1, result.Length-1);
string finalResult = result + "."+extension;
return finalResult;
}
Is there a better way to do it?

Since you work with files, I suggest using Path class: all
we want is to change file name only while keeping extension intact:
static string replaceForUnderScore(string file) =>
Path.GetFileNameWithoutExtension(file).Replace('.', '_') + Path.GetExtension(file);

You can replace all the dots with an underscore except for the last dot by asserting that there is still a dot present to the right when matching one.
string result = Regex.Replace(input, #"\.(?=[^.]*\.)", "_");
The result will be
video_coffee_example.mp4

Regex will help you to do this.
Add the namespace using System.Text.RegularExpressions;
And use this code:
var regex = new Regex(Regex.Escape("."));
var newText = regex.Replace("video.coffee.example.mp4", "_", 2);
Here we specified the maximum number of times to replace the .
The output would be the following:
video_coffee_example.mp4
Additionally, you can update the code to replace any number of dots excluding the last one.
var replaceChar = '.';
var regex = new Regex(Regex.Escape(replaceChar.ToString()));
var replaceWith = "_";
// The text to process
var text = "video.coffee.example.mp4";
// Count how many chars to replace excluding extension
var replaceCount = text.Count(s => s == replaceChar) - 1;
var newText = regex.Replace(text, replaceWith, replaceCount);

Off the top of my head but this might work.
return $"{file.Replace(".mp4","").Replace(".","_")}.mp4";

The simplest (and probably fastest) way is just to iterate over the string:
static string replaceForUnderScore(string file)
{
StringBuilder sb = new StringBuilder( file.Length ) ;
int lastDot = -1 ;
for ( int i = 0 ; i < file.Length ; ++i )
{
char c = file[i] ;
// if we found a '.', replace it with '_' and save its position
if ( c == '.' )
{
c = '_' ;
lastDot = i ;
}
sb.Append( c ) ;
}
// if we changed any '.' to '_', convert the last such replacement back to '.'
if ( lastDot >= 0 )
{
sb.Replace ( '.' , '_' , lastDot, 1 );
}
return sb.ToString();
}
Another approach would be to use System.IO.Path. It's certainly the most succinct:
static string replaceForUnderScore( string file )
{
string ext = Path.GetExtension( file ) ;
string name = Path
.GetFileNameWithoutExtension( file )
.Replace( '.' , '_' )
;
return Path.ChangeExtension( name , ext ) ;
}

Related

c# get the first ';' after parentheses

i feel dumb for asking a most likely silly question.
I am helping someone getting the results he wishes for his custom compiler that reads all lines of an xml file in one string so it will look like below, and since he wants it to "Support" to call variables inside the array worst case scenario would look like below:
"Var1 = [5,4,3,2]; Var2 = [2,8,6,Var1;4];"
What i need is to find the first ";" after "[" and "]" and split it, so i stand with this:
"Var1 = [5,4,3,2];
It will also have to support multiple "[", "]" for example:
"Var2 = [5,Var1,[4],2];"
EDIT: There may also be Data in between the last "]" and ";"
For example:
"Var2 = [5,[4],2]Var1;
What can i do here? Im kind of stuck.

You can try regular expressions, e.g.
string source = "Var1 = [5,4,3,2]; Var2 = [2,8,6,Var1;4];";
// 1. final (or the only) chunk doesn't necessary contain '];':
// "abc" -> "abc"
// 2. chunk has at least one symbol except '];'
string pattern = ".+?(][a-zA-Z0-9]*;|$)";
var items = Regex
.Matches(source, pattern)
.OfType<Match>()
.Select(match => match.Value)
.ToArray();
Console.Write(string.Join(Environment.NewLine, items));
Outcome:
Var1 = [5,4,3,2]abc123;
Var2 = [2,8,6,Var1;4];

^([^;]+);
This regex should work for all.
You can use it like here:
string[] lines =
{
"Var1 = [5,4,3,2]; Var2 = [2,8,6,Var1;4];",
"Var2 = [5,[4],2]Var1; Var2 = [2,8,6,Var1;4];"
};
Regex pattern = new Regex(#"^([^;]+);");
foreach (string s in lines){
Match match = pattern.Match(s);
if (match.Success)
{
Console.WriteLine(match.Value);
}
}
The explanation is:
^ means starts with and is [^;] anything but a semicolon
+ means repeated one or more times and is ; followed by a semicolon
This will find Var1 = [5,4,3,2]; as well as Var1 = [5,4,3,2];
You can see the output HERE

public static string Extract(string str, char splitOn)
{
var split = false;
var count = 0;
var bracketCount = 0;
foreach (char c in str)
{
count++;
if (split && c == splitOn)
return str.SubString(0, count);
if (c == '[')
{
bracketCount++;
split = false;
}
else if (c == ']')
{
bracketCount--;
if (bracketCount == 0)
{
split = true;
}
else if (bracketCount < 0)
throw new FormatException(); //?
}
}
return str;
}

How do I remove multiple offending characters from my string? [duplicate]

This question already has answers here:
Remove characters from C# string
(22 answers)
Closed 9 years ago.
Here's my working code:
string Input;
string Output;
Input = data;
Output = Input.Replace(#")", "");
Here, I am simply removing the parentheses ")" from my string, if it exists. Now how do I expand the list of offending characters like ")" to include "(" and "-" as well?
I realize I can write 2 more Output-like statements, but I'm wondering if there is a better way...

If you're just doing a couple replacements (I see you're only doing three), the easiest way without worrying about Regex or StringBuilders is to chain three Replace calls into one statement:
Output = Input.Replace("(", "").Replace(")", "").Replace("-", "");
... which is marginally better than storing the result in Output every time.

Output = Regex.Replace(Input, "[()-]", "");
The [] characters in the expression create a character class. It doesn't match those character directly.

LINQ solution:
Output = new String(Input.Except("()-").ToArray());

As an alternative to Regex, it may be easier to manage as a collection of replacements and doing the replaces using a StringBuilder.
var replacements = new[] { ")", "-" };
var output = new StringBuilder(Input);
foreach (var r in replacements)
output.Replace(r, string.Empty);

You can use Regex.Replace(), documented here.

You can use a List which contains your badwords. Now just use a foreach loop to iterate over it and replace every bad string.
StringBuilder output = new StringBuilder("(Hello) W,o.r;ld");
List<string> badwords = new List<string>();
badwords.Add("(");
badwords.Add(")");
badwords.Add(",");
badwords.Add(".");
badwords.Add(";");
badwords.ForEach(bad => output = output.Replace(bad, String.Empty));
//Result "Hello World"
Kind regards.
//Edit:
Implemented changes suggested by Khan.

This will allow you to do same thing also
private static string ReplaceBadWords(string[] BadStrings, string input)
{
StringBuilder sb = new StringBuilder(input);
BadStrings.ToList().ForEach(b =>
{
if(b != "")
{
sb = sb.Replace(b, string.Empty);
}
});
return sb.ToString();
}
Sample usage would be
string[] BadStrings = new string[]
{
")",
"(",
"random",
""
};
string input = "Some random text()";
string output = ReplaceBadWords(BadStrings, input);

I'd probably use a regular expression as it's terse and to the point. If you're scared of regular expression, you can teach the computer to write them for you. Here's a simple class for cleaning strings: you just provide it with a list of invalid characters:
class StringCleaner
{
private Regex regex ;
public StringCleaner( string invalidChars ) : this ( (IEnumerable<char>) invalidChars )
{
return ;
}
public StringCleaner ( params char[] invalidChars ) : this( (IEnumerable<char>) invalidChars )
{
return ;
}
public StringCleaner( IEnumerable<char> invalidChars )
{
const string HEX = "0123456789ABCDEF" ;
SortedSet<char> charSet = new SortedSet<char>( invalidChars ) ;
StringBuilder sb = new StringBuilder( 2 + 6*charset.Count ) ;
sb.Append('[') ;
foreach ( ushort c in charSet )
{
sb.Append(#"\u" )
.Append( HEX[ ( c >> 12 ) & 0x000F ] )
.Append( HEX[ ( c >> 8 ) & 0x000F ] )
.Append( HEX[ ( c >> 4 ) & 0x000F ] )
.Append( HEX[ ( c >> 0 ) & 0x000F ] )
;
}
sb.Append(']') ;
this.regex = new Regex( sb.ToString() ) ;
}
public string Clean( string s )
{
if ( string.IsNullOrEmpty(s) ) return s ;
string value = this.regex.Replace(s,"") ;
return value ;
}
}
Once you have that, it's easy:
static void Main(string[] args)
{
StringCleaner cleaner = new StringCleaner( "aeiou" ) ;
string dirty = "The quick brown fox jumped over the lazy dog." ;
string clean = cleaner.Clean(dirty) ;
Console.WriteLine( clean ) ;
return;
}
At the end of which clean is Th qck brwn fx jmpd vr th lzy dg.
Easy!

Keep only numeric value from a string?

I have some strings like this
string phoneNumber = "(914) 395-1430";
I would like to strip out the parethenses and the dash, in other word just keep the numeric values.
So the output could look like this
9143951430
How do I get the desired output ?

You do any of the following:
Use regular expressions. You can use a regular expression with either
A negative character class that defines the characters that are what you don't want (those characters other than decimal digits):
private static readonly Regex rxNonDigits = new Regex( #"[^\d]+");
In which case, you can do take either of these approaches:
// simply replace the offending substrings with an empty string
private string CleanStringOfNonDigits_V1( string s )
{
if ( string.IsNullOrEmpty(s) ) return s ;
string cleaned = rxNonDigits.Replace(s, "") ;
return cleaned ;
}
// split the string into an array of good substrings
// using the bad substrings as the delimiter. Then use
// String.Join() to splice things back together.
private string CleanStringOfNonDigits_V2( string s )
{
if (string.IsNullOrEmpty(s)) return s;
string cleaned = String.Join( rxNonDigits.Split(s) );
return cleaned ;
}
a positive character set that defines what you do want (decimal digits):
private static Regex rxDigits = new Regex( #"[\d]+") ;
In which case you can do something like this:
private string CleanStringOfNonDigits_V3( string s )
{
if ( string.IsNullOrEmpty(s) ) return s ;
StringBuilder sb = new StringBuilder() ;
for ( Match m = rxDigits.Match(s) ; m.Success ; m = m.NextMatch() )
{
sb.Append(m.Value) ;
}
string cleaned = sb.ToString() ;
return cleaned ;
}
You're not required to use a regular expression, either.
You could use LINQ directly, since a string is an IEnumerable<char>:
private string CleanStringOfNonDigits_V4( string s )
{
if ( string.IsNullOrEmpty(s) ) return s;
string cleaned = new string( s.Where( char.IsDigit ).ToArray() ) ;
return cleaned;
}
If you're only dealing with western alphabets where the only decimal digits you'll see are ASCII, skipping char.IsDigit will likely buy you a little performance:
private string CleanStringOfNonDigits_V5( string s )
{
if (string.IsNullOrEmpty(s)) return s;
string cleaned = new string(s.Where( c => c-'0' < 10 ).ToArray() ) ;
return cleaned;
}
Finally, you can simply iterate over the string, chucking the digits you don't want, like this:
private string CleanStringOfNonDigits_V6( string s )
{
if (string.IsNullOrEmpty(s)) return s;
StringBuilder sb = new StringBuilder(s.Length) ;
for (int i = 0; i < s.Length; ++i)
{
char c = s[i];
if ( c < '0' ) continue ;
if ( c > '9' ) continue ;
sb.Append(s[i]);
}
string cleaned = sb.ToString();
return cleaned;
}
Or this:
private string CleanStringOfNonDigits_V7(string s)
{
if (string.IsNullOrEmpty(s)) return s;
StringBuilder sb = new StringBuilder(s);
int j = 0 ;
int i = 0 ;
while ( i < sb.Length )
{
bool isDigit = char.IsDigit( sb[i] ) ;
if ( isDigit )
{
sb[j++] = sb[i++];
}
else
{
++i ;
}
}
sb.Length = j;
string cleaned = sb.ToString();
return cleaned;
}
From a standpoint of clarity and cleanness of code, the version 1 is what you want. It's hard to beat a one liner.
If performance matters, my suspicion is that the version 7, the last version, is the winner. It creates one temporary — a StringBuilder() and does the transformation in-place within the StringBuilder's in-place buffer.
The other options all do more work.

use reg expression
string result = Regex.Replace(phoneNumber, #"[^\d]", "");

try something like this
return new String(input.Where(Char.IsDigit).ToArray());

string phoneNumber = "(914) 395-1430";
var numbers = String.Join("", phoneNumber.Where(char.IsDigit));

He means everything #gleng
Regex rgx = new Regex(#"\D");
str = rgx.Replace(str, "");

Instead of a regular expression, you can use a LINQ method:
phoneNumber = String.Concat(phoneNumber.Where(c => c >= '0' && c <= '9'));
or:
phoneNumber = String.Concat(phoneNumber.Where(Char.IsDigit));

How to insert spaces between the characters of a string

Is there an easy method to insert spaces between the characters of a string? I'm using the below code which takes a string (for example ( UI$.EmployeeHours * UI.DailySalary ) / ( Month ) ) . As this information is getting from an excel sheet, i need to insert [] for each columnname. The issue occurs if user avoids giving spaces after each paranthesis as well as an operator. AnyOne to help?
text = e.Expression.Split(Splitter);
string expressionString = null;
for (int temp = 0; temp < text.Length; temp++)
{
string str = null;
str = text[temp];
if (str.Length != 1 && str != "")
{
expressionString = expressionString + "[" + text[temp].TrimEnd() + "]";
}
else
expressionString = expressionString + str;
}
User might be inputing something like (UI$.SlNo-UI+UI$.Task)-(UI$.Responsible_Person*UI$.StartDate) while my desired output is ( [UI$.SlNo-UI] + [UI$.Task] ) - ([UI$.Responsible_Person] * [UI$.StartDate] )

Here is a short way to insert spaces after every single character in a string (which I know isn't exactly what you were asking for):
var withSpaces = withoutSpaces.Aggregate(string.Empty, (c, i) => c + i + ' ');
This generates a string the same as the first, except with a space after each character (including the last character).

You can do that with regular expressions:
using System.Text.RegularExpressions;
class Program {
static void Main() {
string expression = "(UI$.SlNo-UI+UI$.Task)-(UI$.Responsible_Person*UI$.StartDate) ";
string replaced = Regex.Replace(expression, #"([\w\$\.]+)", " [ $1 ] ");
}
}
If you are not familiar with regular expressions this might look rather cryptic, but they are a powerful tool, and worth learning. In case, you may check how regular expressions work, and use a tool like Expresso to test your regular expressions.
Hope this helps...

Here is an algorithm that does not use regular expressions.
//Applies dobule spacing between characters
public static string DoubleSpace(string s)
{
if (string.IsNullOrEmpty(s))
{
return string.Empty;
}
char[] a = s.ToCharArray();
char[] b = new char[ (a.Length * 2) - 1];
int bIndex = 0;
for(int i = 0; i < a.Length; i++)
{
b[bIndex++] = a[i];
//Insert a white space after the char
if(i < (a.Length - 1))
{
b[bIndex++] = ' ';
}
}
return new string(b);
}

Well, you can do this by using Regular expressions, search for specific paterns and add brackets where needed. You could also simply Replace every Paranthesis with the same Paranthesis but with spaces on each end.
I would also advice you to use StringBuilder aswell instead of appending to an existing string (this creates a new string for each manipulation, StringBuilder has a smaller memory footprint when doing this kind of manipulation)

How to make a first letter capital in C#

How can the first letter in a text be set to capital?
Example:
it is a text. = It is a text.

public static string ToUpperFirstLetter(this string source)
{
if (string.IsNullOrEmpty(source))
return string.Empty;
// convert to char array of the string
char[] letters = source.ToCharArray();
// upper case the first char
letters[0] = char.ToUpper(letters[0]);
// return the array made of the new char array
return new string(letters);
}

It'll be something like this:
// precondition: before must not be an empty string
String after = before.Substring(0, 1).ToUpper() + before.Substring(1);

polygenelubricants' answer is fine for most cases, but you potentially need to think about cultural issues. Do you want this capitalized in a culture-invariant way, in the current culture, or a specific culture? It can make a big difference in Turkey, for example. So you may want to consider:
CultureInfo culture = ...;
text = char.ToUpper(text[0], culture) + text.Substring(1);
or if you prefer methods on String:
CultureInfo culture = ...;
text = text.Substring(0, 1).ToUpper(culture) + text.Substring(1);
where culture might be CultureInfo.InvariantCulture, or the current culture etc.
For more on this problem, see the Turkey Test.

If you are using C# then try this code:
Microsoft.VisualBasic.StrConv(sourceString, Microsoft.VisualBasic.vbProperCase)

I use this variant:
private string FirstLetterCapital(string str)
{
return Char.ToUpper(str[0]) + str.Remove(0, 1);
}

If you are sure that str variable is valid (never an empty-string or null), try:
str = Char.ToUpper(str[0]) + str[1..];
Unlike the other solutions that use Substring, this one does not do additional string allocations. This example basically concatenates char with ReadOnlySpan<char>.

I realize this is an old post, but I recently had this problem and solved it with the following method.
private string capSentences(string str)
{
string s = "";
if (str[str.Length - 1] == '.')
str = str.Remove(str.Length - 1, 1);
char[] delim = { '.' };
string[] tokens = str.Split(delim);
for (int i = 0; i < tokens.Length; i++)
{
tokens[i] = tokens[i].Trim();
tokens[i] = char.ToUpper(tokens[i][0]) + tokens[i].Substring(1);
s += tokens[i] + ". ";
}
return s;
}
In the sample below clicking on the button executes this simple code outBox.Text = capSentences(inBox.Text.Trim()); which pulls the text from the upper box and puts it in the lower box after the above method runs on it.

Take the first letter out of the word and then extract it to the other string.
strFirstLetter = strWord.Substring(0, 1).ToUpper();
strFullWord = strFirstLetter + strWord.Substring(1);

text = new String(
new [] { char.ToUpper(text.First()) }
.Concat(text.Skip(1))
.ToArray()
);

this functions makes capital the first letter of all words in a string
public static string FormatSentence(string source)
{
var words = source.Split(' ').Select(t => t.ToCharArray()).ToList();
words.ForEach(t =>
{
for (int i = 0; i < t.Length; i++)
{
t[i] = i.Equals(0) ? char.ToUpper(t[i]) : char.ToLower(t[i]);
}
});
return string.Join(" ", words.Select(t => new string(t)));;
}

string str = "it is a text";
// first use the .Trim() method to get rid of all the unnecessary space at the begining and the end for exemple (" This string ".Trim() is gonna output "This string").
str = str.Trim();
char theFirstLetter = str[0]; // this line is to take the first letter of the string at index 0.
theFirstLetter.ToUpper(); // .ToTupper() methode to uppercase the firstletter.
str = theFirstLetter + str.substring(1); // we add the first letter that we uppercased and add the rest of the string by using the str.substring(1) (str.substring(1) to skip the first letter at index 0 and only print the letters from the index 1 to the last index.)
Console.WriteLine(str); // now it should output "It is a text"

static String UppercaseWords(String BadName)
{
String FullName = "";
if (BadName != null)
{
String[] FullBadName = BadName.Split(' ');
foreach (string Name in FullBadName)
{
String SmallName = "";
if (Name.Length > 1)
{
SmallName = char.ToUpper(Name[0]) + Name.Substring(1).ToLower();
}
else
{
SmallName = Name.ToUpper();
}
FullName = FullName + " " + SmallName;
}
}
FullName = FullName.Trim();
FullName = FullName.TrimEnd();
FullName = FullName.TrimStart();
return FullName;
}

string Input = " it is my text";
Input = Input.TrimStart();
//Create a char array
char[] Letters = Input.ToCharArray();
//Make first letter a capital one
string First = char.ToUpper(Letters[0]).ToString();
//Concatenate
string Output = string.Concat(First,Input.Substring(1));

Try this code snippet:
char nm[] = "this is a test";
if(char.IsLower(nm[0])) nm[0] = char.ToUpper(nm[0]);
//print result: This is a test

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Remove specific characters except last - c#

Since you work with files, I suggest using Path class: all we want is to change file name only while keeping extension intact: static string replaceForUnderScore(string file) => Path.GetFileNameWithoutExtension(file).Replace('.', '_') + Path.GetExtension(file);

You can replace all the dots with an underscore except for the last dot by asserting that there is still a dot present to the right when matching one. string result = Regex.Replace(input, #"\.(?=[^.]*\.)", "_"); The result will be video_coffee_example.mp4

Off the top of my head but this might work. return $"{file.Replace(".mp4","").Replace(".","_")}.mp4";

Related

c# get the first ';' after parentheses

How do I remove multiple offending characters from my string? [duplicate]

Keep only numeric value from a string?

How to insert spaces between the characters of a string

How to make a first letter capital in C#

Categories

Resources