Using string.Substring() as part of a chain - c#

I'm trying to manipulate a string without making a big issue out of it and spreading it out over multiple lines, so I'm using some chaining to achieve this. My question is: how do I use string.Substring() to drop the last character off my string in this context?
In PHP I can pass a negative number as an argument (i.e. substr(-1)) to achieve this, but obviously this isn't how C# works.
mystring = mystring.Replace('_', ' ').Substring(???);
Also, what is the actual name for the technique used above? I always referred to it as a callback chain, but a callback chain I now think is something completely different.
Please note I want to avoid:
mystring = mystring.Replace('_', ' ');
mystring = mystring.Substring(0, mystring.Length - 1);
Thanks in advance for your time and kind consideration.
Iain
Thanks for your answers guys. It's funny that people can have such strong opinions about string manipulation and other "competing" languages :)

You could write an extension method RightStrip(). You can't overload Substring for negative start positions.
static string RightStrip(this string s, int n)
{
return s.Substring(0, s.Length - n);
}
string s = "Hello World!";
s = s.Replace('e', 'a').RightStrip(1);

Create an extension class like this:
public static class MyStringExtensions
{
public static string RemoveCharactersFromEnd(this string s, int n)
{
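// Null or empty input yields an empty string.
if (string.IsNullOrEmpty(s)) return string.Empty;
// Nothing to remove: return the input unchanged.
if (n <= 0) return s;
// Clamp n so that removing more characters than exist yields an empty string.
return s.Remove(s.Length - System.Math.Min(n, s.Length));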
}
}
Call it:
Console.WriteLine("test!!".RemoveCharactersFromEnd(2));

In your sample, you are chaining to a method that doesn't change the length of the original string. Hence the answers suggesting Substring with (originalLength - 1), which of course doesn't work in the general case.
The answer, as you seem to have realized, is that you can't do it in the general case, where previous methods in the chain have modified the length.
But you can write your own extension method in 3.5 to do what you want. Something like the following or a variant thereof:
public static string PhpSubstring(this string value, int length)
{
// A negative length counts back from the end of the string.
if (length < 0) length = value.Length + length;
return value.Substring(0, length);
}
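To make the general-case problem concrete, here is a minimal sketch of a helper that measures the string it is called on, not the original variable. The name DropLast is hypothetical and does not come from any of the answers; it assumes that asking for more characters than exist should give an empty string.

```csharp
using System;

public static class ChainExtensions
{
    // Hypothetical helper: removes the last n characters of whatever string
    // it is called on, so it works at any point in a method chain.
    public static string DropLast(this string s, int n)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
        // Clamp so that asking for more characters than exist yields "".
        return s.Substring(0, Math.Max(0, s.Length - n));
    }
}
```

With mystring = "a_b_c!", the chain mystring.Replace("_", "  ").DropLast(1) gives "a  b  c", whereas Substring(0, mystring.Length - 1) on the same chain would cut at the original length and give "a  b ".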

Besides everyone else mentioning the term method chaining, or what some call a fluent interface, I had a note or two I wanted to add.
What I wanted to suggest is that the cool thing about extension methods is that you can easily define your own type of transformation functions that feel the same as this, including system methods such as Replace and ToLower, etc.... something that takes some input and returns some kind of transformed string.
The particular transformation you are asking for (cut off the right-most char) might seem clunky if you have to use Substring directly, but you can hide this away neatly in something like:
public static string CutOff(this string s, int c)
{
return s.Substring(0, s.Length - c);
}
...
return myVal.CutOff(1);
(or at least, I think this should work!)
Best of luck!

Method chaining is the term you're looking for. It's true that you cannot pass a negative number like that, but you can still chain the methods:
mystring = mystring.Replace('_', ' ').Substring(0, mystring.Length - 1);
This works because the Replace call in this case does not change the length of the string.

mystring = mystring.Replace('_', ' ').Remove(mystring.Length - 1);
However, I would consider this a bad idea: mystring is not reassigned until the whole chain has run, so mystring.Length still refers to the original string, and any earlier call in the chain that changes the length will produce unexpected behavior.

To further Konamiman's comment:
Just because PHP allows bizarre (and frankly dirty and dangerous) overloads and parameters, such as negative starts and counts in substr, it doesn't mean it's the right, correct or proper way of doing it.
Substring(0, mystring.Length - 1) is the de facto way of trimming off the last character of a string in a wide variety of languages.

You could always use regex:
mystring = new Regex("^(.*).$").Match(mystring.Replace('_', ' ')).Groups[1].Value;
Also, since you're just going to remove that last character, it does not matter if it was a '_' that got replaced by a ' '. This would work just fine:
mystring = mystring.Substring(0, mystring.Length - 1).Replace('_', ' ');
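On newer compilers there is also a built-in way to index from the end inside a chain. The sketch below assumes C# 8.0 or later and a runtime that ships System.Index and System.Range; the class and method names are made up for illustration.

```csharp
using System;

public static class RangeExample
{
    // Requires C# 8.0 or later and a runtime that provides System.Index/System.Range.
    public static string ReplaceAndTrimLast(string input)
    {
        // ^1 means "one from the end"; the range is evaluated against the
        // result of Replace, not against the original variable.
        return input.Replace('_', ' ')[..^1];
    }
}
```

Like Substring, the range throws on an empty string, so a guard is still needed if the input can be empty.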

Related

Removing punctuation from an extremely long string

I'm working on a book encryption program for one of my courses and I've run into a problem. Our professor gave us the example of using, say, Pride and Prejudice as the book used to encrypt, so I chose that one to test my program. The function I'm currently using to remove the punctuation from the string takes so long that the program is forced into break mode. It works for smaller strings, even ones several pages long, but when I feed it Pride and Prejudice it takes way too long.
public void removePunctuation(ref string s) {
string result = "";
for (int i = 0; i < s.Length; i++) {
if (Char.IsWhiteSpace(s[i])) {
result += ' ';
} else if (!Char.IsLetter(s[i]) && !Char.IsNumber(s[i])) {
// do nothing
} else {
result += s[i];
}
}
s = result;
}
So I think I need a faster way to remove punctuation from this string if anyone has any suggestions? I know looping through every character is horrible, but I'm stumped and I was never taught Regex in depth.
Edit: I was asked how I was storing the string in the dictionary class! This is the constructor for another class that actually uses the formatted string.
public CodeBook(string book)
{
BookMap = new Dictionary<string, List<int>>();
Key = book.Split(null).ToList(); // split string into words
foreach(string s in Key)
{
if (!BookMap.Keys.Contains(s))
{
BookMap.Add(s, Enumerable.Range(0, Key.Count).Where(i => Key[i] == s).ToList());
// add word and add list of occurrences of word
}
}
}
This is slow because you construct the string by concatenation in a loop. Several approaches perform better:
Use StringBuilder - unlike string concatenation which constructs a new object each time you add a character, this approach expands the string under construction by larger chunks, preventing excessive garbage creation.
Use LINQ's filtering with Where - this approach constructs an array of chars in a single shot, then constructs a single string from it.
Use regular expression's Replace - this method is optimized to deal with strings of virtually unlimited sizes.
Roll your own algorithm - create an array of chars that corresponds to the length of the original string. Walk through the string, and add the characters that you wish to keep to the array. Use string's constructor that takes the array, the initial index, and the length to construct the string at once.
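Two of the approaches listed above, the LINQ filter and the hand-rolled buffer, can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the keep/drop rules are copied from the question's loop (whitespace becomes a space, letters and numbers are kept, everything else is dropped).

```csharp
using System;
using System.Linq;

public static class PunctuationFilters
{
    // LINQ approach: filter the characters, then build one string from the array.
    public static string WithLinq(string s)
    {
        char[] kept = s
            .Where(c => char.IsWhiteSpace(c) || char.IsLetter(c) || char.IsNumber(c))
            .Select(c => char.IsWhiteSpace(c) ? ' ' : c)
            .ToArray();
        return new string(kept);
    }

    // Hand-rolled approach: one buffer as long as the input, filled in a single pass.
    public static string WithBuffer(string s)
    {
        char[] buffer = new char[s.Length];
        int count = 0;
        foreach (char c in s)
        {
            if (char.IsWhiteSpace(c)) buffer[count++] = ' ';
            else if (char.IsLetter(c) || char.IsNumber(c)) buffer[count++] = c;
        }
        // Only the first 'count' slots were filled.
        return new string(buffer, 0, count);
    }
}
```

Both versions allocate the result string once instead of once per character.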
Looping through every character once is not that bad. You're doing it all in one pass, that's not trivial to avoid.
The problem lies in the fact that the framework will need to allocate a new copy of the (partial) string whenever you do something like
result += s[i];
You can avoid that by using a StringBuilder to append the non-punctuation characters as you go.
public string removePunctuation(string s)
{
var result = new StringBuilder();
for (int i = 0; i < s.Length; i++) {
if (Char.IsWhiteSpace(s[i])) {
result.Append(" ");
} else if (!Char.IsLetter(s[i]) && !Char.IsNumber(s[i])) {
// do nothing
} else {
result.Append(s[i]);
}
}
return result.ToString();
}
You could further reduce the number of necessary Append calls with a refined algorithm, for example by looking ahead to the next punctuation character and appending larger portions at once, or use an existing string manipulation facility like Regex. But the introduction of StringBuilder above should already give you a noticeable performance gain.
I was never taught Regex in depth
Use the search provider of your choice, you may end up with a tested solution which you can just study and use: https://stackoverflow.com/a/5871826/1132334
You can use Regex to remove punctuations as below.
public string removePunctuation(string s)
{
string result = Regex.Replace(s, @"[^\w\s]", "");
return result;
}
^ (inside the brackets) means: any character that is not one of the following.
\w means: word characters (letters, digits and the underscore).
\s means: whitespace characters.
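As a small illustration of what that pattern keeps and drops, here is a sketch that builds the Regex once and reuses it. The class and field names are hypothetical; note that the underscore survives because \w includes it.

```csharp
using System.Text.RegularExpressions;

public static class RegexPunctuation
{
    // Built once and reused; [^\w\s] matches any character that is
    // neither a word character nor whitespace.
    private static readonly Regex NotWordOrSpace =
        new Regex(@"[^\w\s]", RegexOptions.Compiled);

    public static string Remove(string s)
    {
        return NotWordOrSpace.Replace(s, "");
    }
}
```

For example, Remove("Hello, world! It's snake_case.") returns "Hello world Its snake_case".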

Issue in printing Char Array() which is converted from String

I got stuck with this. Can you please explain what is happening here, or give me a link?
String s1="C# Example";
Char[] s3 = s1.ToCharArray();
Console.WriteLine("S3 : {0}",s3);
I want to display the characters that were converted, but the output displayed is System.Char[]. I need to change something, but what?
I can think of two ways:
1) Convert it to a String before printing.
Or
2) Print individual Chars by index, e.g. s3[0].
Am I correct? Anything more?
The explanation of what happens:
Console.WriteLine("{0}", s3) calls s3.ToString(),
because WriteLine() calls ToString() on each argument.
ToString() isn't overridden in System.Array, so Object.ToString() is called,
because Char[] is a System.Array and all types inherit from System.Object.
That is equivalent to s3.GetType().ToString() and outputs System.Char[],
because this is the default implementation. Subtypes can override it; for instance, System.String and StringBuilder do.
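The difference between the inherited ToString() and actually building text from the array can be seen side by side in a short sketch (the class and method names are made up for illustration):

```csharp
using System;

public static class CharArrayText
{
    public static string DefaultToString(char[] chars)
    {
        // Arrays do not override ToString(), so this returns the type name.
        return chars.ToString();
    }

    public static string AsText(char[] chars)
    {
        // The string(char[]) constructor copies the characters into a new string.
        return new string(chars);
    }
}
```

For "C# Example".ToCharArray(), DefaultToString returns "System.Char[]" and AsText returns "C# Example".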
Solution A:
If you want to display the characters individually on console then you need to get each character separately and display it using a loop.
foreach(char ch in s3)
{
Console.WriteLine("S3 : {0}", ch);
}
or, using for-loop,
for (int i = 0; i < s3.Length; i++)
{
Console.WriteLine("S3 : {0}", s3[i]);
}
Solution B :
There's another way that I prefer. It might not be helpful for you, but it can be an option for those who are always looking for better solutions.
Use Extension methods,
Add this class with the extension method in your solution,
public static class DisplayExtension
{
public static string DisplayResult(this string input)
{
var resultString = "";
foreach (char ch in input.ToCharArray())
{
resultString += "S3 : " + ch.ToString() + "\n";
}
return resultString;
}
}
And call the DisplayResult() extension method from your program like this,
Console.WriteLine(s1.DisplayResult());
This gives you the same result but makes the code reusable, so you don't have to write the loop again each time you need it.
Good answers so far, and a great explanation from @abatishchev of why WriteLine() prints System.Char[].
However, I would like to add another solution, because a loop around your WriteLine() can look confusing and isn't very pleasing to the eye. For better readability you can use new string().
In this example it would look like this:
String s1="C# Example";
Char[] s3 = s1.ToCharArray();
Console.WriteLine("S3 : {0}",new string(s3));
Console.WriteLine("S3 : {0}",s3);
prints the result of s3.ToString(), which is System.Char[].
Instead, write a for loop like:
Console.Write("S3 :");
for(int i=0; i<s3.Length; i++)
{
Console.Write(s3[i]);
}
which gives the desired output.
char [] str = new char[20];
Suppose str is the character array, and we need to display it. Do the following (provided you enter something in the str using loop):
Console.WriteLine("The string is: {0}", string.Join("",str));
Here, each character in str is joined and displayed.

String insertion problem in c#

I am trying to insert a string at a position in a C# string, but it's failing.
Here is the snippet:
if(strCellContent.Contains("<"))
{
int pos = strCellContent.IndexOf("<");
strCellContent.Insert(pos,"<");
}
Please tell me the solution.
The return value contains the new string that you desire.
strCellContent = strCellContent.Insert(pos,"<");
Gunner and Rhapsody have given correct changes, but it's worth knowing why your original attempt failed. The String type is immutable - once you've got a string, you can't change its contents. All the methods which look like they're changing it actually just return a new value. So for example, if you have:
string x = "foo";
string y = x.Replace("o", "e");
the string x refers to will still contain the characters "foo"... but the string y refers to will contain the characters "fee".
This affects all uses of strings, not just the particular situation you're looking at now (which would definitely be better handled using Replace, or even better still a library call which knows how to do all the escaping you need).
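A compact way to see both points, that the result of Insert must be reassigned and that a library call can handle the escaping, is sketched below. The class and method names are hypothetical, and WebUtility.HtmlEncode from System.Net is used here as one possible encoder, chosen because it does not need ASP.NET's Server object.

```csharp
using System;
using System.Net;

public static class ImmutabilityDemo
{
    public static string InsertIgnored(string s)
    {
        s.Insert(0, ">");      // result discarded: s still refers to the original text
        return s;
    }

    public static string InsertKept(string s)
    {
        s = s.Insert(0, ">");  // reassign to keep the new string
        return s;
    }

    public static string Escape(string s)
    {
        // Encodes characters such as <, > and & as HTML entities.
        return WebUtility.HtmlEncode(s);
    }
}
```

InsertIgnored("abc") still returns "abc", InsertKept("abc") returns ">abc", and Escape("a < b") returns "a &lt; b".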
I think you might be better off with a Replace instead of an Insert:
strCellContent = strCellContent.Replace("<", "&lt;");
Maybe doing Server.HtmlEncode() is even better:
strCellContent = Server.HtmlEncode(strCellContent);
When I look at your code I think you want to do a replace, but try this:
if(strCellContent.Contains("<"))
{
int pos = strCellContent.IndexOf("<");
strCellContent = strCellContent.Insert(pos,"<");
}
.Contains is not a good idea here, because you need to know the position. This solution will be more efficient.
int pos = strCellContent.IndexOf("<");
if (pos >= 0) //that means the string Contains("<")
{
strCellContent = strCellContent.Insert(pos,"<"); //string is immutable
}
As others have explained with the code, I will add that
The value of the String object is the content of the sequential collection, and that value is immutable (that is, it is read-only).
For more information about the immutability of strings, see the Immutability and the StringBuilder Class section.
from: http://msdn.microsoft.com/en-us/library/system.string.aspx

Piglatin using Arrays

Last night I was messing around with Pig Latin using arrays and found that I could not reverse the process. How would I shift the phrase, take the characters "a" and "y" off the end of each word, and return the original word in the phrase?
For instance if I entered "piggy" it would come out as "iggypay" shifting the word piggy so "p" is at the end of the word and "ay" is appended.
Here is the example code so you can try it as well.
public string ay;
public string PigLatin(string phrase)
{
string[] pLatin;
ArrayList pLatinPhrase = new ArrayList();
int wordLength;
pLatin = phrase.Split();
foreach (string pl in pLatin)
{
wordLength = pl.Length;
pLatinPhrase.Add(pl.Substring(1, wordLength - 1) + pl.Substring(0, 1) + "ay");
}
foreach (string p in pLatinPhrase)
{
ay += p;
}
return ay;
}
You will notice that this example is not programmed to find vowels and append them to the end along with "ay". It is just a basic way of doing it.
If you were wondering how to reverse the above, try this example, uPigLatinify:
public string way;
public string uPigLatinify(string word)
{
string[] latin;
int wordLength;
// Using an ArrayList to store the split words.
ArrayList Phrase = new ArrayList();
// Split string phrase into words.
latin = word.Split(' ');
foreach (string i in latin)
{
wordLength = i.Length;
if (wordLength > 0)
{
// Grab 3rd letter from the end of word and append to front
// of word chopping off "ay" as it was not included in the indexing.
Phrase.Add(i.Substring(wordLength - 3, 1) + i.Substring(0, wordLength - 3) + " ");
}
}
foreach (string _word in Phrase)
{
// Add words to string and return.
way += _word;
}
return way;
}
Please don’t take this the wrong way, but although you can probably get people here to give you the C# code to implement the algorithm you want, I suspect this is not enough if you want to learn how it works. To learn the basics of programming, there are some good tutorials to delve into (whether websites or books). In particular, if you aspire to be a programmer, you will need to learn not just how to write code. In your example:
You should first write a specification of what your PigLatin function is supposed to do. Think about all the corner-cases: What if the first letter is a vowel? What if there are several consonants at the beginning? What if there are only consonants? What if the input starts with a number, a parenthesis, or a space? What if the input string is empty? Write down exactly what should happen in all of these cases — even if it’s “throw an exception”.
Only then can you implement the algorithm according to the specification (i.e. write the actual C# code). While doing this, you may find that the specification is incomplete, in which case you need to go back and correct it.
Once your code is finished, you need to test it. Run it on several testcases, especially the corner-cases you came up with above: For example, try PigLatin("air"), PigLatin("x"), PigLatin("1"), PigLatin(""), etc. In each case, make yourself aware first what behaviour you expect, and then see if the behaviour matches your expectation. If it doesn’t, you need to go back and fix the code.
Once you have implemented the forward PigLatin algorithm and it works (read: passes all your testcases), you will already have the skills needed to write the reverse function yourself. I guarantee that you will feel accomplished and excited then! Whereas, if you just copy the code from this website, you are setting yourself up for feeling dumb, because you will think other people can do it and you can't.
Of course, we are nonetheless happy to help you with specific technical questions, for example “What is the difference between ArrayList and List<string>?” or “What does the scope of a local variable mean?” (but search first — these may have already been asked before) — but you probably shouldn’t ask to have the code fully written and finished for you.
The work to split the phrase into words and recombine the words after transforming them is the same as in the original case. The difficulty is in un-pig-latin-ifying an individual word. With some error checking, I imagine you could do this:
string UnPigLatinify(string word)
{
if ((word == null) || !Regex.IsMatch(word, @"^\w+ay$", RegexOptions.IgnoreCase))
return word;
return word[word.Length - 3] + word.Substring(0, word.Length - 3);
}
The regular expression just checks that the word is at least 3 characters long, is composed of word characters, and ends with "ay".
The actual transform takes the third to last letter (the original first letter) and appends the rest of the word minus the "ay" and the original letter.
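Putting the per-word transform together with the split-and-rejoin step, a round trip might look like the sketch below. It assumes the question's simplified rule (move the first letter to the end and append "ay", with no vowel handling), and the class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PigLatinRoundTrip
{
    // Forward transform for one word: move the first letter to the end, add "ay".
    public static string ToPigLatin(string word)
    {
        if (string.IsNullOrEmpty(word)) return word;
        return word.Substring(1) + word[0] + "ay";
    }

    // Reverse transform for one word; words that do not fit the pattern are left untouched.
    public static string FromPigLatin(string word)
    {
        if (word == null || !Regex.IsMatch(word, @"^\w+ay$", RegexOptions.IgnoreCase))
            return word;
        return word[word.Length - 3] + word.Substring(0, word.Length - 3);
    }

    // Split on spaces, transform each word, and join with single spaces.
    public static string TransformPhrase(string phrase, Func<string, string> perWord)
    {
        return string.Join(" ", phrase.Split(' ').Select(perWord));
    }
}
```

TransformPhrase("piggy bank", PigLatinRoundTrip.ToPigLatin) gives "iggypay ankbay", and passing that result with FromPigLatin gives back "piggy bank". An ordinary word that happens to end in "ay" is indistinguishable from a Pig Latin word, so the reverse direction is only reliable on text the forward function produced.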
Is this what you meant?

.NET equivalent of the old vb left(string, length) function

As a non-.NET programmer I'm looking for the .NET equivalent of the old Visual Basic function left(string, length). It was lazy in that it worked for any length string. As expected, left("foobar", 3) = "foo" while, most helpfully, left("f", 3) = "f".
In .NET string.Substring(index, length) throws exceptions for everything out of range. In Java I always had the Apache-Commons lang.StringUtils handy. In Google I don't get very far searching for string functions.
@Noldorin - Wow, thank you for your VB.NET extensions! This was my first encounter with them, although it took me only several seconds to do the same in C#:
public static class Utils
{
public static string Left(this string str, int length)
{
return str.Substring(0, Math.Min(length, str.Length));
}
}
Note the static class and method as well as the this keyword. Yes, they are as simple to invoke as "foobar".Left(3). See also C# extensions on MSDN.
Here's an extension method that will do the job.
<System.Runtime.CompilerServices.Extension()> _
Public Function Left(ByVal str As String, ByVal length As Integer) As String
Return str.Substring(0, Math.Min(str.Length, length))
End Function
This means you can use it just like the old VB Left function (i.e. Left("foobar", 3) ) or using the newer VB.NET syntax, i.e.
Dim foo = "f".Left(3) ' foo = "f"
Dim bar = "bar123".Left(3) ' bar = "bar"
Another one line option would be something like the following:
myString.Substring(0, Math.Min(length, myString.Length))
Where myString is the string you are trying to work with.
Add a reference to the Microsoft.VisualBasic library and you can use the Strings.Left which is exactly the same method.
Don't forget the null case:
public static string Left(this string str, int count)
{
if (string.IsNullOrEmpty(str) || count < 1)
return string.Empty;
else
return str.Substring(0,Math.Min(count, str.Length));
}
Use:
using System;
public static class DataTypeExtensions
{
#region Methods
public static string Left(this string str, int length)
{
str = (str ?? string.Empty);
return str.Substring(0, Math.Min(length, str.Length));
}
public static string Right(this string str, int length)
{
str = (str ?? string.Empty);
return (str.Length >= length)
? str.Substring(str.Length - length, length)
: str;
}
#endregion
}
It shouldn't error, returns nulls as empty string, and returns trimmed or base values. Use it like "testx".Left(4) or str.Right(12);
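One case the versions above do not cover is a negative length, which still makes Substring throw. A hedged sketch of variants that clamp the count at both ends follows; the names SafeLeft and SafeRight are hypothetical, and returning an empty string for a negative length is an assumption about the desired behaviour.

```csharp
using System;

public static class SafeSliceExtensions
{
    // Hypothetical variant of Left that also tolerates negative lengths.
    public static string SafeLeft(this string str, int length)
    {
        str = str ?? string.Empty;
        // Clamp the count into the range [0, str.Length].
        int take = Math.Max(0, Math.Min(length, str.Length));
        return str.Substring(0, take);
    }

    // Hypothetical variant of Right with the same clamping.
    public static string SafeRight(this string str, int length)
    {
        str = str ?? string.Empty;
        int take = Math.Max(0, Math.Min(length, str.Length));
        return str.Substring(str.Length - take);
    }
}
```

With these, "foobar".SafeLeft(3) is "foo", "f".SafeLeft(3) is "f", and "foobar".SafeRight(-1) is an empty string.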
You could make your own:
private string left(string inString, int inInt)
{
if (inInt > inString.Length)
inInt = inString.Length;
return inString.Substring(0, inInt);
}
Mine is in C#, and you will have to change it for Visual Basic.
You can either wrap the call to substring in a new function that tests the length of it as suggested in other answers (the right way) or use the Microsoft.VisualBasic namespace and use left directly (generally considered the wrong way!)
I like doing something like this:
string s = "hello how are you";
s = s.PadRight(30).Substring(0,30).Trim(); //"hello how are you"
s = s.PadRight(3).Substring(0,3).Trim(); //"hel"
Though, if you want trailing or beginning spaces then you are out of luck.
I really like the use of Math.Min, it seems to be a better solution.
Another technique is to extend the string object by adding a Left() method.
Here is the source article on this technique:
http://msdn.microsoft.com/en-us/library/bb384936.aspx
Here is my implementation (in VB):
Module StringExtensions
<Extension()>
Public Function Left(ByVal aString As String, ByVal Length As Integer) As String
Return aString.Substring(0, Math.Min(aString.Length, Length))
End Function
End Module
Then put this at the top of any file in which you want to use the extension:
Imports MyProjectName.StringExtensions
Use it like this:
MyVar = MyString.Left(30)
If you want to avoid using an extension method and prevent an under-length error, try this
string partial_string = text.Substring(0, Math.Min(15, text.Length));
// example of 15 character max
Just in a very special case:
If you are taking the left part only to compare it with some partial string, for example:
if(Strings.Left(str, 1)=="*") ...;
Then you can also use C# instance methods, such as StartsWith and EndsWith to perform these tasks.
if(str.StartsWith("*"))...;
