Using string.ToUpper on substring

Using string.ToUpper on substring - c#

Have an assignment to allow a user to input a word in C# and then display that word with the first and third characters changed to uppercase. Code follows:
namespace Capitalizer
{
class Program
{
static void Main(string[] args)
{
string text = Console.ReadLine();
char[] delimiterChars = { ' ' };
string[] words = text.Split(delimiterChars);
string Upper = text.ToUpper();
Console.WriteLine(Upper);
Console.ReadKey();
}
}
}
This of course generates the entire word in uppercase, which is not what I want. I can't seem to make text.ToUpper(0,2) work, and even then that'd capitalize the first three letters. Only solution I can think of now that would make the word appear on one line (and I don't know if it works) is to move the capitalized letters and lowercase letters into a character array and try to get that to print all values in a modified order.

The simplest way I can think of to address your exact question as described — to convert to upper case the first and third characters of the input — would be something like the following:
StringBuilder sb = new StringBuilder(text);
sb[0] = char.ToUpper(sb[0]);
sb[2] = char.ToUpper(sb[2]);
text = sb.ToString();
The StringBuilder class is essentially a mutable string object, so when doing these kinds of operations is the most fluid way to approach the problem, as it provides the most straightforward conversions to and from, as well as the full range of string operations. Changing individual characters is easy in many data structures, but insertions, deletions, appending, formatting, etc. all also come with StringBuilder, so it's a good habit to use that versus other approaches.
But frankly, it's hard to see how that's a useful operation. I can't help but wonder if you have stated the requirements incorrectly and there's something more to this question than is seen here.

You could use LINQ:
var upperCaseIndices = new[] { 0, 2 };
var message = "hello";
var newMessage = new string(message.Select((c, i) =>
upperCaseIndices.Contains(i) ? Char.ToUpper(c) : c).ToArray());
Here is how it works. message.Select (inline LINQ query) selects characters from message one by one and passes into selector function:
upperCaseIndices.Contains(i) ? Char.ToUpper(c) : c
written as C# ?: shorthand syntax for if. It reads as "If index is present in the array, then select upper case character. Otherwise select character as is."
(c, i) => condition
is a lambda expression. See also:
Understand Lambda Expressions in 3 minutes
The rest is very simple - represent result as array of characters (.ToArray()), and create a new string based off that (new string(...)).

Only solution I can think of now that would make the word appear on one line (and I don't know if it works) is to move the capitalized letters and lowercase letters into a character array and try to get that to print all values in a modified order.
That seems a lot more complicated than necessary. Once you have a character array, you can simply change the elements of that character array. In a separate function, it would look something like
string MakeFirstAndThirdCharacterUppercase(string word) {
var chars = word.ToCharArray();
chars[0] = chars[0].ToUpper();
chars[2] = chars[2].ToUpper();
return new string(chars);
}

My simple solution:
string text = Console.ReadLine();
char[] delimiterChars = { ' ' };
string[] words = text.Split(delimiterChars);
foreach (string s in words)
{
char[] chars = s.ToCharArray();
chars[0] = char.ToUpper(chars[0]);
if (chars.Length > 2)
{
chars[2] = char.ToUpper(chars[2]);
}
Console.Write(new string(chars));
Console.Write(' ');
}
Console.ReadKey();

Related

convert non alphanumeric glyphs to unicode while preserving alphanumeric

I need to convert non alpha-numeric glyphs in a string to their unicode value, while preserving the alphanumeric characters. Is there a method to do this in C#?
As an example, I need to convert this string:
"hello world!"
To this:
"hello_x0020_world_x0021_"

To get string safe for XML node name you should use XmlConver.EncodeName.
Note that if you need to encode all non-alphanumeric characters you'd need to write it yourself as "_" is not encoded by that method.

You could start with this code using LINQ Select extension method:
string str = "hello world!";
string a = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
a += a.ToLower();
char[] alphabet = a.ToCharArray();
str = string.Join("",
str.Select(ch => alphabet.Contains(ch) ?
ch.ToString() : String.Format("_x{0:x4}_", ch)).ToArray()
);
Now clearly it has some problems:
it does linear search in the list of characters
missed numeric...
if we add numeric need to decide if first character is ok to be digit (assuming yes)
code creates large number of strings that are immediately discarded (one per character)
alphanumeric is limited to ASCII (assuming ok, if not Char.IsLetterOrDigit to help)
does to much work for pure alpha-numeric strings
First two are easy - we can use HashSet (O(1) Contains) initialized by full list of characters (if any alpahnumeric characters are ok more readable to use existing method - Char.IsLetterOrDigit):
public static HashSet<char> asciiAlphaNum = new HashSet<char>
("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789");
To avoid ch.ToString() that really pointlessly produces strings for immediate GC we need to figure out how to construct string from mix of char and string. String.Join does not work because it wants strings to start with, regular new string(...) does not have option for mix of char and string. So we are left with StringBuilder that happily takes both to Append. Consider starting with initial size str.Length if most strings don't have other characters.
So for each character we just need to either builder.Append(ch) or builder.AppendFormat(("_x{0:x4}_", (int)ch). To perform iteration it is easier to just use regular foreach, but if one really wants LINQ - Enumerable.Aggregate is the way to go.
string ReplaceNonAlphaNum(string str)
{
var builder = new StringBuilder();
foreach (var ch in str)
{
if (asciiAlphaNum.Contains(ch))
builder.Append(ch);
else
builder.AppendFormat("_x{0:x4}_", (int)ch);
}
return builder.ToString();
}
string ReplaceNonAlphaNumLinq(string str)
{
return str.Aggregate(new StringBuilder(), (builder, ch) =>
asciiAlphaNum.Contains(ch) ?
builder.Append(ch) : builder.AppendFormat("_x{0:x4}_", (int)ch)
).ToString();
}
To the last point - we don't really need to do anything if there is nothing to convert - so some check like check alphanumeric characters in string in c# would help to avoid extra strings.
Thus final version (LINQ as it is a bit shorter and fancier):
private static asciiAlphaNumRx = new Regex(#"^[a-zA-Z0-9]*$");
public static HashSet<char> asciiAlphaNum = new HashSet<char>
("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789");
string ReplaceNonAlphaNumLinq(string str)
{
return asciiAlphaNumRx.IsMatch(str) ? str :
str.Aggregate(new StringBuilder(), (builder, ch) =>
asciiAlphaNum.Contains(ch) ?
builder.Append(ch) : builder.AppendFormat("_x{0:x4}_", (int)ch)
).ToString();
}
Alternatively whole thing could be done with Regex - see Regex replace: Transform pattern with a custom function for starting point.

Check array for string that starts with given one (ignoring case)

I am trying to see if my string starts with a string in an array of strings I've created. Here is my code:
string x = "Table a";
string y = "a table";
string[] arr = new string["table", "chair", "plate"]
if (arr.Contains(x.ToLower())){
// this should be true
}
if (arr.Contains(y.ToLower())){
// this should be false
}
How can I make it so my if statement comes up true? Id like to just match the beginning of string x to the contents of the array while ignoring the case and the following characters. I thought I needed regex to do this but I could be mistaken. I'm a bit of a newbie with regex.

It seems you want to check if your string contains an element from your list, so this should be what you are looking for:
if (arr.Any(c => x.ToLower().Contains(c)))
Or simpler:
if (arr.Any(x.ToLower().Contains))
Or based on your comments you may use this:
if (arr.Any(x.ToLower().Split(' ')[0].Contains))

Because you said you want regex...
you can set a regex to var regex = new Regex("(table|plate|fork)");
and check for if(regex.IsMatch(myString)) { ... }
but it for the issue at hand, you dont have to use Regex, as you are searching for an exact substring... you can use
(as #S.Akbari mentioned : if (arr.Any(c => x.ToLower().Contains(c))) { ... }

Enumerable.Contains matches exact values (and there is no build in compare that checks for "starts with"), you need Any that takes predicate that takes each array element as parameter and perform the check. So first step is you want "contains" to be other way around - given string to contain element from array like:
var myString = "some string"
if (arr.Any(arrayItem => myString.Contains(arrayItem)))...
Now you actually asking for "string starts with given word" and not just contains - so you obviously need StartsWith (which conveniently allows to specify case sensitivity unlike Contains - Case insensitive 'Contains(string)'):
if (arr.Any(arrayItem => myString.StartsWith(
arrayItem, StringComparison.CurrentCultureIgnoreCase))) ...
Note that this code will accept "tableAAA bob" - if you really need to break on word boundary regular expression may be better choice. Building regular expressions dynamically is trivial as long as you properly escape all the values.
Regex should be
beginning of string - ^
properly escaped word you are searching for - Escape Special Character in Regex
word break - \b
if (arr.Any(arrayItem => Regex.Match(myString,
String.Format(#"^{0}\b", Regex.Escape(arrayItem)),
RegexOptions.IgnoreCase)) ...

you can do something like below using TypeScript. Instead of Starts with you can also use contains or equals etc..
public namesList: Array<string> = ['name1','name2','name3','name4','name5'];
// SomeString = 'name1, Hello there';
private isNamePresent(SomeString : string):boolean{
if (this.namesList.find(name => SomeString.startsWith(name)))
return true;
return false;
}

I think I understand what you are trying to say here, although there are still some ambiguity. Are you trying to see if 1 word in your String (which is a sentence) exists in your array?
#Amy is correct, this might not have to do with Regex at all.
I think this segment of code will do what you want in Java (which can easily be translated to C#):
Java:
x = x.ToLower();
string[] words = x.Split("\\s+");
foreach(string word in words){
foreach(string element in arr){
if(element.Equals(word)){
return true;
}
}
}
return false;
You can also use a Set to store the elements in your array, which can make look up more efficient.
Java:
x = x.ToLower();
string[] words = x.Split("\\s+");
HashSet<string> set = new HashSet<string>(arr);
for(string word : words){
if(set.contains(word)){
return true;
}
}
return false;
Edit: (12/22, 11:05am)
I rewrote my solution in C#, thanks to reminders by #Amy and #JohnyL. Since the author only wants to match the first word of the string, this edited code should work :)
C#:
static bool contains(){
x = x.ToLower();
string[] words = x.Split(" ");
var set = new HashSet<string>(arr);
if(set.Contains(words[0])){
return true;
}
return false;
}

Sorry my question was so vague but here is the solution thanks to some help from a few people that answered.
var regex = new Regex("^(table|chair|plate) *.*");
if (regex.IsMatch(x.ToLower())){}

Reversing a string sentence (not necessarily space between each)

I have this string:
com.example.is-this#myname
i would like it to be
myname#this-is.example.com
using .Net, but a straight out concept or an idea would be good to.
What i'm currently doing is going over each character, find out if it's one of the "special characters" and assign all prior chars, to a variable of an array, at the end, i'm joining them all together from last to first.
is there a possible more efficient way to do this ?

This is the classic word-by-word reversal, with a small twist on delimiters. A solution to this problem is reversing each word individually, and then reversing the whole string. Do not touch delimiters when reversing words.
First step goes as follows: we find limits of each token, and reverse it in place, like this:
com.example.is-this#myname
moc.example.is-this#myname
moc.elpmaxe.is-this#myname
moc.elpmaxe.si-this#myname
moc.elpmaxe.si-siht#myname
moc.elpmaxe.si-siht#emanym
Reverse the result to get your desired output:
moc.elpmaxe.si-siht#emanym -> myname#this-is.example.com
As far as the implementation goes, you can do it by converting the string to an array of characters to make it changeable in place, and write a short helper method that lets you reverse a portion of a char array between indexes i and j. With this helper method in place, all you need to do is to find delimiters and call the helper for each delimited word, and then make one final call to reverse the entire sentence.

With little bit of Regex and Linq this is fairly simple.
Idea is that we take words and non word characters as separate token with Regex patten. Then, we just reverse it and join it.
var tokens = Regex.Matches("com.example.is-this#myname", #"\w+|\W")
.Cast<Match>()
.Select(x=>x.Value)
.Reverse();
string reversed = string.Concat(tokens);
Output: Ideone - Demo
myname#this-is.example.com

You could use the Split C# method.
The example below is from here.
using System;
class Program
{
static void Main()
{
string s = "there is a cat";
// Split string on spaces.
// ... This will separate all the words.
string[] words = s.Split(' ');
foreach (string word in words)
{
Console.WriteLine(word);
}
}
}
Is as simple as examples get.
Then you add more conditions to your Split()
string [] split = strings .Split(new Char [] {'.' , '#', '-' },
StringSplitOptions.RemoveEmptyEntries);
The RemoveEmptyEntries just removes unwanted empty entries to your array.
After that you reverse your array using the Array.Reverse method.
And then you can stitch your string back together with a Foreach loop.
As #marjan-venema mentioned in the comments you could populate a parallel array at this point with each delimiter. Reverse it, and then concatenate the string when you are using the Foreach loop at each entry.

Here's another way to do it using a List, which has a handy Insert method, as well as a Reverse method.
The Insert method lets you continue to add characters to the same index in the list (and the others after it are moved to higher indexes).
So, as you read the original string, you can keep inserting the characters at the start. Once you come to a delimeter, you add it to the end and adjust your insert position to be right after the delimeter.
When you're done, you just call Reverse and join the characters back to a string:
public static string ReverseWords(string words, char[] wordDelimeters)
{
var reversed = new List<char>();
int insertPosition = 0;
for(int i = 0; i < words.Length; i++)
{
var character = words[i];
if (wordDelimeters.Contains(character))
{
reversed.Add(character);
insertPosition = i + 1;
continue;
}
reversed.Insert(insertPosition, character);
}
reversed.Reverse();
return string.Join("", reversed);
}

Shortcut for splitting only once in C#?

Okay, lets say I have a string:
string text = "one|two|three";
If I do string[] texts = text.Split('|'); I will end up with a string array of three objects. However, this isn't what I want. What I actually want is to split the string only once... so the two arrays I could would be this:
one
two|three
Additionally, is there a way to do a single split with the last occurrence in a string? So I get:
one|two
three
As well, is there a way to split by a string, instead of a character? So I could do Split("||")

Split method takes a count as parameter, you can pass 2 in that position, which basically says that you're interested in only 2 elements maximum. You'll get the expected result.
For second question: There is no built in way AFAIK. You may need to implement it yourself by splitting all and joining first and second back.

C#'s String.Split() can take a second argument that can define the number of elements to return:
string[] texts = text.Split(new char[] { '|' }, 2);

For your first scenario, you can pass a parameter of how many strings to split into.
var text = "one|two|three";
var result = text.Split(new char[] { '|' }, 2);
Your second scenario requires a little more magic.
var text = "one|two|three";
var list = text.Split('|');
var result = new string[] { string.Join("|", list, 0, list.Length - 1), list[list.Length - 1] };
Code has not been verified to check results before using.

Well, I took it as a challenge to do your second one in one line. The result is... not pretty, mostly because it's surprisingly difficult to reverse a string and keep it as a string.
string text = "one|two|three";
var result = new String(text.Reverse().ToArray()).Split(new char[] {'|'}, 2).Reverse().Select(c => new String(c.Reverse().ToArray()));
Basically, you reverse it, then follow the same procedure as the first one, then reverse each individual one, as well as the resulting array.

You can simply do like this as well...
//To split at first occurence of '|'
if(text.Containts('|')){
beginning = text.subString(0,text.IndexOf('|'));
ending = text.subString(text.IndexOf('|');
}
//To split at last occurence of '|'
if(text.Contains('|')){
beginning = text.subString(0,text.LastIndexOf('|'));
ending = text.subString(text.LastIndexOf('|');
}

Second question was fun. I solved it this way:
string text = "one|two|three";
var result =
new []
{
string.Concat(text.ToCharArray().TakeWhile((c, i) => i <= text.LastIndexOf("|"))),
string.Concat(text.ToCharArray().SkipWhile((c, i) => i <= text.LastIndexOf("|")))
};

Does C# have a String Tokenizer like Java's?

I'm doing simple string input parsing and I am in need of a string tokenizer. I am new to C# but have programmed Java, and it seems natural that C# should have a string tokenizer. Does it? Where is it? How do I use it?

You could use String.Split method.
class ExampleClass
{
public ExampleClass()
{
string exampleString = "there is a cat";
// Split string on spaces. This will separate all the words in a string
string[] words = exampleString.Split(' ');
foreach (string word in words)
{
Console.WriteLine(word);
// there
// is
// a
// cat
}
}
}
For more information see Sam Allen's article about splitting strings in c# (Performance, Regex)

I just want to highlight the power of C#'s Split method and give a more detailed comparison, particularly from someone who comes from a Java background.
Whereas StringTokenizer in Java only allows a single delimiter, we can actually split on multiple delimiters making regular expressions less necessary (although if one needs regex, use regex by all means!) Take for example this:
str.Split(new char[] { ' ', '.', '?' })
This splits on three different delimiters returning an array of tokens. We can also remove empty arrays with what would be a second parameter for the above example:
str.Split(new char[] { ' ', '.', '?' }, StringSplitOptions.RemoveEmptyEntries)
One thing Java's String tokenizer does have that I believe C# is lacking (at least Java 7 has this feature) is the ability to keep the delimiter(s) as tokens. C#'s Split will discard the tokens. This could be important in say some NLP applications, but for more general purpose applications this might not be a problem.

The split method of a string is what you need. In fact the tokenizer class in Java is deprecated in favor of Java's string split method.

I think the nearest in the .NET Framework is
string.Split()

For complex splitting you could use a regex creating a match collection.

_words = new List<string>(YourText.ToLower().Trim('\n', '\r').Split(' ').
Select(x => new string(x.Where(Char.IsLetter).ToArray())));
Or
_words = new List<string>(YourText.Trim('\n', '\r').Split(' ').
Select(x => new string(x.Where(Char.IsLetterOrDigit).ToArray())));

The similar to Java's method is:
Regex.Split(string, pattern);
where
string - the text you need to split
pattern - string type pattern, what is splitting the text

use Regex.Split(string,"#|#");

read this, split function has an overload takes an array consist of seperators
http://msdn.microsoft.com/en-us/library/system.stringsplitoptions.aspx

If you're trying to do something like splitting command line arguments in a .NET Console app, you're going to have issues because .NET is either broken or is trying to be clever (which means it's as good as broken). I needed to be able to split arguments by the space character, preserving any literals that were quoted so they didn't get split in the middle. This is the code I wrote to do the job:
private static List<String> Tokenise(string value, char seperator)
{
List<string> result = new List<string>();
value = value.Replace(" ", " ").Replace(" ", " ").Trim();
StringBuilder sb = new StringBuilder();
bool insideQuote = false;
foreach(char c in value.ToCharArray())
{
if(c == '"')
{
insideQuote = !insideQuote;
}
if((c == seperator) && !insideQuote)
{
if (sb.ToString().Trim().Length > 0)
{
result.Add(sb.ToString().Trim());
sb.Clear();
}
}
else
{
sb.Append(c);
}
}
if (sb.ToString().Trim().Length > 0)
{
result.Add(sb.ToString().Trim());
}
return result;
}

If you are using C# 3.5 you could write an extension method to System.String that does the splitting you need. You then can then use syntax:
string.SplitByMyTokens();
More info and a useful example from MS here http://msdn.microsoft.com/en-us/library/bb383977.aspx

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Using string.ToUpper on substring - c#

Related

convert non alphanumeric glyphs to unicode while preserving alphanumeric

Check array for string that starts with given one (ignoring case)

Reversing a string sentence (not necessarily space between each)

Shortcut for splitting only once in C#?

Does C# have a String Tokenizer like Java's?

Categories

Resources