Removing specific strings from the end of words in C#

Can I do something like the following to remove specific strings from the end of words?
public static HashSet<string> stringtoremove = new HashSet<string>
...............
.................
public static string stepone(this string word)
{
if (stringtoremove(word.EndsWith))
{
word = ..................................;
}
return word;
}
I tried, but it doesn't work. Did I miss something in my code? Thanks in advance.

The best option is to use Regular Expressions; have a look at the Replace method.
string input = "test testabc test123 abc abctest";
string pattern = @"(abc\b)";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
Console.WriteLine("Original String: {0}", input);
Console.WriteLine("Replacement String: {0}", result);

I assume that you actually want to look into the HashSet<String> to see if the given string parameter ends with one of these words. If so, remove it from the end of the string.
You can use FirstOrDefault to determine the first string in the set that is also the end of the given word:
var firstMatch = stringtoremove.FirstOrDefault(str => word.EndsWith(str));
if (firstMatch != null)
return word.Substring(0, word.Length - firstMatch.Length);
else
return word;
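The FirstOrDefault idea above can be wrapped into a complete extension method. This is only a sketch: the class name SuffixRemover, the method name StripSuffix, and the sample suffixes are hypothetical, since the question elides the real contents of the set. It also picks the longest matching suffix, which is a design choice not stated in the answer; a HashSet has no guaranteed enumeration order, so taking the first match could otherwise give different results for overlapping suffixes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SuffixRemover
{
    // Hypothetical sample data; the original question does not show the real set.
    private static readonly HashSet<string> Suffixes =
        new HashSet<string> { "ing", "ed", "s" };

    public static string StripSuffix(this string word)
    {
        if (string.IsNullOrEmpty(word))
            return word;

        // Pick the longest suffix that the word ends with, so the result
        // does not depend on the enumeration order of the HashSet.
        string match = Suffixes
            .Where(s => s.Length > 0 && word.EndsWith(s, StringComparison.Ordinal))
            .OrderByDescending(s => s.Length)
            .FirstOrDefault();

        // No match: return the word unchanged.
        return match == null
            ? word
            : word.Substring(0, word.Length - match.Length);
    }
}
```

With the sample set, "walking".StripSuffix() returns "walk" and "tree".StripSuffix() returns "tree".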

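String.TrimEnd is an option only when the suffix can be expressed as a set of characters. It removes every trailing character that appears in charsToTrim, in any order and any number of times, rather than one whole string:
word = word.TrimEnd(charsToTrim);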

Related

Regex Replace - Based on Char Input

Let's say we have the string:
Hello
The user enters a char input "e"
What is the correct way of returning the string as the following using a regex method:
-e---
Code tried:
public static string updatedWord(char guess, string word)
{
string result = Regex.Replace(word, guess, "-");
console.writeline(result);
return result;
}
Assuming the input were e, you could build the following regex pattern:
[^e]
Then, do a global replacement on this pattern, which matches any single character which is not e, and replace it with a single dash.
string word = "Hello";
char guess = 'e';
string regex = "[^" + guess + "]";
string result = Regex.Replace(word, regex, "-");
Console.WriteLine(result);
This prints:
-e---
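Note that if regex metacharacters are allowed as input, the guessed character itself should be escaped before it is placed in the pattern. Escaping the whole pattern with Regex.Escape(regex) would also escape the opening bracket and the ^, so the pattern would no longer be a negated character class:
string regex = "[^" + Regex.Escape(guess.ToString()) + "]";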
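Putting the negated character class and the escaping together, a minimal sketch of the masking function could look like this. The names WordMasker and Mask are made up for illustration. Regex.Escape does not escape a closing bracket, so that single character is handled separately here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordMasker
{
    // Replaces every character of 'word' that is not 'guess' with a dash.
    public static string Mask(char guess, string word)
    {
        // Escape only the guessed character, never the surrounding brackets.
        // Regex.Escape leaves ']' untouched, so escape that one by hand.
        string escaped = guess == ']' ? @"\]" : Regex.Escape(guess.ToString());
        string pattern = "[^" + escaped + "]";
        return Regex.Replace(word, pattern, "-");
    }
}
```

For example, WordMasker.Mask('e', "Hello") returns "-e---", and WordMasker.Mask('.', "a.b") returns "-.-".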
This can be done without Regex: loop over all the characters of the secret word and replace the characters not yet guessed with -. Regex will loop over the letters too, but plain C# methods are more comprehensible ;)
You need to keep a collection of the already guessed letters.
public class Guess
{
private readonly string _word;
private readonly HashSet<char> _guessed;
public Guess(string word)
{
_word = word;
_guessed = new HashSet<char>();
}
public string Try(char letter)
{
_guessed.Add(letter);
var maskedLetters = _word.Select(c => _guessed.Contains(c) ? c : '-').ToArray();
return new string(maskedLetters);
}
}
Usage
var game = new Guess("Hello");
var result = game.Try('e');
Console.WriteLine(result); // "-e---"

How can I eliminate a quote from the start of my string using regex?

I have strings that sometimes start like this:
"[1][v5r,vi][uk]
Other times like this:
[1][v5r,vi][uk]
How can I remove the " when it appears at the start of a string using Regex? I know I need to do something like this, but not sure how to set it up:
regex = new Regex(@"(\n )?\[ant=[^\]]*\]");
regex.Replace(item.JmdictMeaning, "");
If the string always starts with [1]:
int indexOfFirstElement = item.IndexOf("[1]");
if (indexOfFirstElement > 0)
item = item.Substring(indexOfFirstElement);
If you just want to start at the first [:
int indexOfFirstElement = item.IndexOf('[');
if (indexOfFirstElement > 0)
item = item.Substring(indexOfFirstElement);
Simpler than Regex, which is probably overkill for this problem.
Here you go
string input = @" ""[1][v5r,vi][uk]";
string pattern = @"^\s*""?|""?\s*$";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, "");
Console.WriteLine(result);
string.StartsWith will do the trick
string str = "\"[1][v5r,vi][uk]";
if(str.StartsWith('"'))
str = str.Substring(1);
It can be done using IndexOf and Substring:
string str = "\"a[1][v5r,vi][uk]";
Console.WriteLine(str.Substring(str.IndexOf('[')));
Use TrimStart() to remove this character if it exists:
string str = "\"a[1][v5r,vi][uk]";
str= str.TrimStart('\"');

Split string and string arrays

string s = "abc**xy**efg**xy**ijk123**xy**lmxno**xy**opq**xy**rstz";
I want the output as a string array, split at "xy". I used
string[] lines = Regex.Split(s, "xy");
but this removes the xy. I want the array elements to keep the xy, so after splitting, the array should be as below.
lines[0]= abc;
lines[1]= xyefg;
lines[2]= xyijk123;
lines[3]= xylmxno;
lines[4]= xyopq ;
lines[5]= xyrstz;
How can I do this?
Use the pattern (?=xy).
You need to split on a zero-width assertion. See the demo:
https://regex101.com/r/fM9lY3/50
string strRegex = @"(?=xy)";
Regex myRegex = new Regex(strRegex, RegexOptions.None);
string strTargetString = @"abcxyefgxyijk123xylmxnoxyopqxyrstz";
return myRegex.Split(strTargetString);
Output:
abc
xyefg
xyijk123
xylmxno
xyopq
xyrstz
It seems fairly simple to do this:
string s = "abc**xy**efg**xy**ijk123**xy**lmxno**xy**opq**xy**rstz";
string[] lines = Regex.Split(s, "xy");
lines = lines.Take(1).Concat(lines.Skip(1).Select(l => "xy" + l)).ToArray();
This keeps the xy at the start of every element after the first.
I don't know if you wanted to keep the ** (your question doesn't make it clear). Changing the regex to @"\*\*xy\*\*" will remove the **.
If you're not married to Regex, you could make your own extension method:
public static IEnumerable<string> Ssplit(this string InputString, string Delimiter)
{
int idx = InputString.IndexOf(Delimiter);
while (idx != -1)
{
yield return InputString.Substring(0, idx);
InputString = InputString.Substring(idx);
idx = InputString.IndexOf(Delimiter, Delimiter.Length);
}
yield return InputString;
}
Usage:
string s = "abc**xy**efg**xy**ijk123**xy**lmxno**xy**opq**xy**rstz";
var x = s.Ssplit("xy");
How about simply looping through the array starting at index 1 and prepending the "xy" string to each entry?
Alternatively, implement your own version of split that cuts the string how you want it.
Yet another solution would be matching "xy" followed by any text in a non-greedy way, so that your array is the list of all matches. Depending on the language, this probably won't be called split, by the way.
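The matching idea can be sketched in C# with Regex.Matches. The names DelimiterSplitter and SplitKeepingDelimiter are hypothetical, and the sketch assumes single-line input, because . does not match a newline by default. Each match starts either at the delimiter or at the beginning of the string and lazily runs up to the next delimiter or the end.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimiterSplitter
{
    // Returns the pieces of 'input', each piece after the first
    // keeping the delimiter at its start.
    public static string[] SplitKeepingDelimiter(string input, string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
            throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));

        string d = Regex.Escape(delimiter);
        // Start at a delimiter (tried first) or at the start of the input,
        // then lazily consume up to the next delimiter or the end.
        string pattern = "(?:" + d + "|^).*?(?=" + d + "|$)";

        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Calling DelimiterSplitter.SplitKeepingDelimiter("abcxyefgxyijk123", "xy") yields "abc", "xyefg" and "xyijk123".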

replacing characters in a single field of a comma-separated list

I have a string in my C# code:
a,b,c,d,"e,f",g,h
I want to replace "e,f" with "e f", i.e. a comma that is inside double quotes should be replaced by a space.
I tried using string.Split, but it is not working for me.
OK, I can't be bothered to think of a regex approach so I am going to offer an old fashioned loop approach which will work:
string DoReplace(string input)
{
bool isInner = false;//flag to detect if we are in the inner string or not
string result = "";//result to return
foreach(char c in input)//loop each character in the input string
{
if(isInner && c == ',')//if we are in an inner string and it is a comma, append space
result += " ";
else//otherwise append the character
result += c;
if(c == '"')//if we have hit an inner quote, toggle the flag
isInner = !isInner;
}
return result;
}
NOTE: This solution assumes that there can only be one level of inner quotes, for example you cannot have "a,b,c,"d,e,"f,g",h",i,j" - because that's just plain madness!
For the scenario where you only need to match one pair of letters, the following regex will work:
string source = "a,b,c,d,\"e,f\",g,h";
string pattern = "\"([\\w]),([\\w])\"";
string replace = "\"$1 $2\"";
string result = Regex.Replace(source, pattern, replace);
Console.WriteLine(result); // a,b,c,d,"e f",g,h
Breaking apart the pattern, it is matching any instance where there is a "X,X" sequence where X is any letter, and is replacing it with the very same sequence, with a space in between the letters instead of a comma.
You could easily extend this if you needed to to have it match more than one letter, etc, as needed.
For the case where you can have multiple letters separated by commas within quotes that need to be replaced, the following can do it for you. Sample text is a,b,c,d,"e,f,a",g,h:
string source = "a,b,c,d,\"e,f,a\",g,h";
string pattern = "\"([ ,\\w]+),([ ,\\w]+)\"";
string replace = "\"$1 $2\"";
string result = source;
while (Regex.IsMatch(result, pattern)) {
result = Regex.Replace(result, pattern, replace);
}
Console.WriteLine(result); // a,b,c,d,"e f a",g,h
This does something similar compared to the first one, but just removes any comma that is sandwiched by letters surrounded by quotes, and repeats it until all cases are removed.
Here's a somewhat fragile but simple solution:
string.Join("\"", line.Split('"').Select((s, i) => i % 2 == 0 ? s : s.Replace(",", " ")))
It's fragile because it doesn't handle flavors of CSV that escape double-quotes inside double-quotes.
Use the following code:
string str = "a,b,c,d,\"e,f\",g,h";
string[] str2 = str.Split('\"');
var str3 = str2.Select(p => ((p.StartsWith(",") || p.EndsWith(",")) ? p : p.Replace(',', ' '))).ToList();
str = string.Join("\"", str3); // rejoin with the quotes that Split removed
Use Split() and Join():
string input = "a,b,c,d,\"e,f\",g,h";
string[] pieces = input.Split('"');
for ( int i = 1; i < pieces.Length; i += 2 )
{
pieces[i] = string.Join(" ", pieces[i].Split(','));
}
string output = string.Join("\"", pieces);
Console.WriteLine(output);
// output: a,b,c,d,"e f",g,h

How do I capitalize first letter of first name and last name in C#?

Is there an easy way to capitalize the first letter of a string and lower the rest of it? Is there a built in method or do I need to make my own?
TextInfo.ToTitleCase() capitalizes the first character in each token of a string.
If there is no need to maintain Acronym Uppercasing, then you should include ToLower().
string s = "JOHN DOE";
s = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(s.ToLower());
// Produces "John Doe"
If CurrentCulture is unavailable, use:
string s = "JOHN DOE";
s = new System.Globalization.CultureInfo("en-US", false).TextInfo.ToTitleCase(s.ToLower());
See the MSDN Link for a detailed description.
CultureInfo.CurrentCulture.TextInfo.ToTitleCase("hello world");
String test = "HELLO HOW ARE YOU";
string s = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(test);
The code above won't work, because ToTitleCase leaves words that are entirely uppercase unchanged (it treats them as acronyms).
So convert the string to lower case first, then apply the function:
String test = "HELLO HOW ARE YOU";
string s = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(test.ToLower());
There are some cases that CultureInfo.CurrentCulture.TextInfo.ToTitleCase cannot handle, for example : the apostrophe '.
string input = CultureInfo.CurrentCulture.TextInfo.ToTitleCase("o'reilly, m'grego, d'angelo");
// input = O'reilly, M'grego, D'angelo
A regex can also be used: \b[a-zA-Z] identifies the first character of a word after a word boundary \b. We then just need to replace the match with its upper case equivalent, using the Regex.Replace(string input, string pattern, MatchEvaluator evaluator) method:
string input = "o'reilly, m'grego, d'angelo";
input = Regex.Replace(input.ToLower(), @"\b[a-zA-Z]", m => m.Value.ToUpper());
// input = O'Reilly, M'Grego, D'Angelo
The regex can be tuned if needed, for instance, if we want to handle the MacDonald and McFry cases the regex becomes : (?<=\b(?:mc|mac)?)[a-zA-Z]
string input = "o'reilly, m'grego, d'angelo, macdonald's, mcfry";
input = Regex.Replace(input.ToLower(), @"(?<=\b(?:mc|mac)?)[a-zA-Z]", m => m.Value.ToUpper());
// input = O'Reilly, M'Grego, D'Angelo, MacDonald'S, McFry
If we need to handle more prefixes, we only need to modify the group (?:mc|mac), for example to add the French prefixes du and de: (?:mc|mac|du|de).
Finally, note that this regex also uppercases the trailing 's, giving MacDonald'S, so we need to exclude it with a negative lookbehind (?<!'s\b). In the end we have:
string input = "o'reilly, m'grego, d'angelo, macdonald's, mcfry";
input = Regex.Replace(input.ToLower(), @"(?<=\b(?:mc|mac)?)[a-zA-Z](?<!'s\b)", m => m.Value.ToUpper());
// input = O'Reilly, M'Grego, D'Angelo, MacDonald's, McFry
Mc and Mac are common surname prefixes throughout the US, and there are others. TextInfo.ToTitleCase doesn't handle those cases and shouldn't be used for this purpose. Here's how I'm doing it:
public static string ToTitleCase(string str)
{
string result = str;
if (!string.IsNullOrEmpty(str))
{
var words = str.Split(' ');
for (int index = 0; index < words.Length; index++)
{
var s = words[index];
if (s.Length > 0)
{
words[index] = s[0].ToString().ToUpper() + s.Substring(1);
}
}
result = string.Join(" ", words);
}
return result;
}
ToTitleCase() should work for you.
http://support.microsoft.com/kb/312890
The most direct option is to use the ToTitleCase function that is available in .NET, which should take care of the name most of the time. As edg pointed out, there are some names that it will not work for, but these are fairly rare, so unless you are targeting a culture where such names are common it is not necessarily something that you have to worry too much about.
However, if you are not working with a .NET language, then it depends on what the input looks like. If you have two separate fields for the first name and the last name, then you can just capitalize the first letter and lower the rest of it using substrings.
firstName = firstName.Substring(0, 1).ToUpper() + firstName.Substring(1).ToLower();
lastName = lastName.Substring(0, 1).ToUpper() + lastName.Substring(1).ToLower();
However, if you are provided multiple names as part of the same string, then you need to know how you are getting the information and split it accordingly. So if you are getting a name like "John Doe", you can split the string on the space character. If it is in a format such as "Doe, John", you are going to need to split it on the comma. Once you have it split apart, you just apply the code shown previously.
CultureInfo.CurrentCulture.TextInfo.ToTitleCase ("my name");
returns ~ My Name
But the problem still exists with names like McFly as stated earlier.
I use my own method to get this fixed:
For example, the phrase "hello world. hello this is the stackoverflow world." becomes "Hello World. Hello This Is The Stackoverflow World.". The regex \b (start of a word) followed by \w (first character of the word) will do the trick.
/// <summary>
/// Makes each first letter of a word uppercase. The rest will be lowercase
/// </summary>
/// <param name="Phrase"></param>
/// <returns></returns>
public static string FormatWordsWithFirstCapital(string Phrase)
{
MatchCollection Matches = Regex.Matches(Phrase, "\\b\\w");
Phrase = Phrase.ToLower();
foreach (Match Match in Matches)
Phrase = Phrase.Remove(Match.Index, 1).Insert(Match.Index, Match.Value.ToUpper());
return Phrase;
}
The suggestions to use ToTitleCase won't work for strings that are all upper case. So you are going to have to call ToUpper on the first char and ToLower on the remaining characters.
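A small sketch of that approach, applied to each space-separated word. The names NameCasing and CapitalizeWords are illustrative, and the invariant culture is an assumption made here to keep the result independent of the machine's settings. Like the other simple approaches, it does not handle prefixes such as Mc.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class NameCasing
{
    // Upper-cases the first character of every space-separated word
    // and lower-cases the rest of the word.
    public static string CapitalizeWords(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        CultureInfo culture = CultureInfo.InvariantCulture;

        var words = input.Split(' ').Select(w =>
            w.Length == 0
                ? w // keeps repeated spaces intact
                : char.ToUpper(w[0], culture) + w.Substring(1).ToLower(culture));

        return string.Join(" ", words);
    }
}
```

For example, NameCasing.CapitalizeWords("JOHN DOE") returns "John Doe".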
This class does the trick. You can add new prefixes to the _prefixes static string array.
public static class StringExtensions
{
public static string ToProperCase( this string original )
{
if( String.IsNullOrEmpty( original ) )
return original;
string result = _properNameRx.Replace( original.ToLower( CultureInfo.CurrentCulture ), HandleWord );
return result;
}
public static string WordToProperCase( this string word )
{
if( String.IsNullOrEmpty( word ) )
return word;
if( word.Length > 1 )
return Char.ToUpper( word[0], CultureInfo.CurrentCulture ) + word.Substring( 1 );
return word.ToUpper( CultureInfo.CurrentCulture );
}
private static readonly Regex _properNameRx = new Regex( @"\b(\w+)\b" );
private static readonly string[] _prefixes = {
"mc"
};
private static string HandleWord( Match m )
{
string word = m.Groups[1].Value;
foreach( string prefix in _prefixes )
{
if( word.StartsWith( prefix, StringComparison.CurrentCultureIgnoreCase ) )
return prefix.WordToProperCase() + word.Substring( prefix.Length ).WordToProperCase();
}
return word.WordToProperCase();
}
}
If you're using VS2008, you can use an extension method to add it to the String class:
public static string FirstLetterToUpper(this String input)
{
// Nothing to capitalize in a null or empty string
if (String.IsNullOrEmpty(input))
return input;
return input.Substring(0, 1).ToUpper() + input.Substring(1);
}
To get round some of the issues that have been highlighted, I would suggest converting the string to lower case first and then calling the ToTitleCase method. You could then use IndexOf(" Mc") or IndexOf(" O\'") to find special cases that need more specific attention.
inputString = inputString.ToLower();
inputString = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(inputString);
int indexOfMc = inputString.IndexOf(" Mc");
// Only adjust when a character actually follows " Mc"
if (indexOfMc >= 0 && indexOfMc + 3 < inputString.Length)
{
inputString = inputString.Substring(0, indexOfMc + 3) + inputString[indexOfMc + 3].ToString().ToUpper() + inputString.Substring(indexOfMc + 4);
}
I like this way:
using System.Globalization;
...
TextInfo myTi = new CultureInfo("en-Us",false).TextInfo;
string raw = "THIS IS ALL CAPS";
string firstCapOnly = myTi.ToTitleCase(raw.ToLower());
Lifted from this MSDN article.
Hope this helps you.
String fName = "firstname";
String lName = "lastname";
String capitalizedFName = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(fName);
String capitalizedLName = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(lName);
public static string ConvertToCaptilize(string input)
{
if (!string.IsNullOrEmpty(input))
{
string[] arrUserInput = input.Split(' ');
// Initialize a string builder object for the output
StringBuilder sbOutPut = new StringBuilder();
// Loop through each word in the string array
foreach (string str in arrUserInput)
{
if (!string.IsNullOrEmpty(str))
{
var charArray = str.ToCharArray();
int k = 0;
foreach (var cr in charArray)
{
char c;
c = k == 0 ? char.ToUpper(cr) : char.ToLower(cr);
sbOutPut.Append(c);
k++;
}
}
sbOutPut.Append(" ");
}
return sbOutPut.ToString().TrimEnd(); // drop the space appended after the last word
}
return string.Empty;
}
Like edg indicated, you'll need a more complex algorithm to handle special names (this is probably why many places force everything to upper case).
Something like this untested C# should handle the simple case you requested:
public string SentenceCase(string input)
{
if (string.IsNullOrEmpty(input))
return input;
return input.Substring(0, 1).ToUpper() + input.Substring(1).ToLower();
}
