C# Changing a string after it has been created - c#

Okay I know this question is painfully simple, and I'll admit that I am pretty new to C# as well. But the title doesn't describe the entire situation here so hear me out.
I need to alter a URL string which is being created in a C# code behind, removing the substring ".aspx" from the end of the string. So basically I know that my URL, coming into this class, will be something like "Blah.aspx" and I want to get rid of the ".aspx" part of that string. I assume this is quite easy to do by just finding that substring, and removing it if it exists (or some similar strategy, would appreciate if someone has an elegant solution for it if they've thought done it before). Here is the problem:
"Because strings are immutable, it is not possible (without using unsafe code) to modify the value of a string object after it has been created." This is from the MSDN official website. So I'm wondering now, if strings are truly immutable, then I simply can't (shouldn't) alter the string after it has been made. So how can I make sure that what I'm planning to do is safe?

You don't change the string, you change the variable. Instead of that variable referring to a string such as "foo.aspx", alter it to point to a new string that has the value "foo".
As an analogy, adding one to the number two doesn't change the number two. Two is still just the same as it always way, you have changed a variable from referring to one number to refer to another.
As for your specific case, EndsWith and Remove make it easy enough:
if (url.EndsWith(".aspx"))
url = url.Remove(url.Length - ".aspx".Length);
Note here that Remove is taking one string, an integer, and giving us a brand new string, which we need to assign back to our variable. It doesn't change the string itself.
Also note that there is a URI class that you can use for parsing URLs, and it will be able to handle all of the complex situations that can arise, including hashes, query parameters, etc. You should use that to parse out the aspects of a URL that you are interested in.

String immutability is not a problem for normal usage -- it just means that member functions like "Replace", instead of modifying the existing string object, return a new one. In practical terms that usually just means you have to remember to copy the change back to the original, like:
string x = "Blah.aspx";
x.Replace(".aspx", ""); // still "Blah.aspx"
x = x.Replace(".aspx", ""); // now "Blah"
The weirdness around strings comes from the fact that System.String inherits System.Object, yet, because of its immutability, behaves like a value type rather than an object. For example, if you pass a string into a function, there's no way to modify it, unless you pass it by reference:
void Test(string y)
{
y = "bar";
}
void Test(ref string z)
{
z = "baz";
}
string x = "foo";
Test(x); // x is still "foo"
Test(ref x); // x is now "baz"

A String in C# is immutable, as you say. Meaning that this would create multiple String objects in memory:
String s = "String of numbers 0";
s += "1";
s += "2";
So, while the variable s would return to you the value String of numbers 012, internally it required the creation of three strings in memory to accomplish.
In your particular case, the solution is quite simple:
String myPath = "C:\\folder1\\folder2\\myFile.aspx";
myPath = Path.Combine(Path.GetDirectoryName(myPath), Path.GetFileNameWithoutExtension(myPath));
Again, this appears as if myPath has changed, but it really has not. An internal copy and assign took place and you get to keep using the same variable.
Also, if you must preserve the original variable, you could simply make a new variable:
String myPath = "C:\\folder1\\folder2\\myFile.aspx";
String thePath = Path.Combine(Path.GetDirectoryName(myPath), Path.GetFileNameWithoutExtension(myPath));
Either way, you end up with a variable you can use.
Note that the use of the Path methods ensures you get proper path operations, and not blind String replacements that could have unintended side-effects.

String.Replace() will not modify the string. It will create a new one. So the following code:
String myUrl = #"http://mypath.aspx";
String withoutExtension = myUrl.Replace(".aspx", "");
will create a brand-new string which is assigned to withoutExtension.

Related

Applying string manipulation to a variable without setting the variable = to the result? [duplicate]

This question already has answers here:
Why .NET String is immutable? [duplicate]
(13 answers)
Closed 5 years ago.
This has probably been asked before but I can't find it. I don't think my googlefu is good enough. Anyway,
Say I have this:
string cat = "meow"
cat = cat.replace("meow","purr");
cat = Regex.Replace(cat, "purr", "Meow!");
It seems extra to be setting the cat variable to a manipulated copy of it. Is there a way to do this in a pass by ref way so that I don't have to set the variable to the modified version of itself?
To be clear it's lines 2 and 3 above that I'm asking about here.
EDIT: I don't believe the marked dupe is an EXACT dupe of my question. My question is specific to the inquiry of the existence of such a thing. The linked post is specific to an explanation of why it DOESN'T exist. This question is probably still worth keeping for those who aren't familiar with the immutable concept.
In C#, strings are immutable. This means that you can't safely change a string value that a variable refers to. i.e. there is no method that looks like:
string cat = "meow";
cat.replace("meow", "purr");
Console.WriteLine(cat); // expected "purr", but output is still "meow"
You could write wrapper methods for each of the string operations you want to mutate a variable, like:
void replace(ref string value, string regex, string replaceString) {
value = value.replace(regex, replaceString);
}
And then use it like:
replace(cat, "meow", "purr");
BUT, don't do that. It's messy and unmaintainable. Writing methods for each of the String operations you might want to perform, just so you can achieve the illusion of string mutability isn't good practice.
You can try using StringBuilder instead, which would look like:
string cat = "meow";
StringBuilder stringBuilder = new StringBuilder(cat);
stringBuilder.Replace("meow", "purr");
Console.WriteLine(stringBuilder.ToString());
From the documentation:
Because strings are immutable, it is not possible (without using
unsafe code) to modify the value of a string object after it has been
created. However, there are many ways to modify the value of a string
and store the result in a new string object.
What you CAN do however is use the StringBuilder class:
StringBuilder builder = new StringBuilder();
builder.Append("Juan");
builder.Replace("Juan", "Javier");
Console.WriteLine(builder.ToString()); //This outputs "Javier"
StringBuilder also allows you to modify the string at the character level with an index:
builder[0] = 'X';
Console.WriteLine(builder.ToString()); //This outputs "Xavier"

Passing by reference to n-th element in C#

In C, if we have an array, we can pass it by reference to a function. We can also use simple addition of (n-1) to pass the reference starting from n-th element of the array like this:
char *strArr[5];
char *str1 = "I want that!\n";
char *str2 = "I want this!\n";
char *str3 = "I want those!\n";
char *str4 = "I want these!\n";
char *str5 = "I want them!\n";
strArr[0] = str1;
strArr[1] = str2;
strArr[2] = str3;
strArr[3] = str4;
strArr[4] = str5;
printPartially(strArr + 1, 4); //we can pass like this to start printing from 2nd element
....
void printPartially(char** strArrPart, char size){
int i;
for (i = 0; i < size; ++i)
printf(strArrPart[i]);
}
Resulting in these:
I want this!
I want those!
I want these!
I want them!
Process returned 0 (0x0) execution time : 0.006 s
Press any key to continue.
In C#, we can also pass reference to an object by ref (or, out). The object includes array, which is the whole array (or at least, this is how I suppose it works). But how are we to pass by reference to the n-th element of the array such that internal to the function, there is only string[] whose elements are one less than the original string[] without the need to create new array?
Must we use unsafe? I am looking for a solution (if possible) without unsafe
Edit:
I understand that we could pass Array in C# without ref keyword. Perhaps my question sounds quite misleading by mentioning ref when we talk about Array. The point why I put ref there, I should rather put it this way: is the ref keyword can be used, say, to pass the reference to n-th element of the array as much as C does other than passing reference to any object (without mentioning the n-th element or something alike)? My apology for any misunderstanding occurs by my question's phrasing.
The "safe" approach would be to pass an ArraySegment struct instead.
You can of course pass a pointer to a character using unsafe c#, but then you need to worry about buffer overruns.
Incidentally, an Array in C# is (usually) allocated on the heap, so passing it normally (without ref) doesn't mean copying the array- it's still a reference that is passed (just a new one).
Edit:
You won't be able to do it as you do in C in safe code.
A C# array (i.e. string[]) is derived from abstract type Array.
It is not only a simple memory block as it is in C.
So you can't send one of it's element's reference and start iterate from there.
But there are some solutions which will give you the same taste of course (without unsafe):
Like:
As #Chris mentioned you can use ArraySegment<T>.
As Array is also an IEnumerable<T> you can use .Skip and send the returned value. (but this will give you an IEnumerable<T> instead of an Array). But it will allow you iterate.
etc...
If the method should only read from the array, you can use linq:
string[] strings = {"str1", "str2", "str3", ...."str10"};
print(strings.Skip(1).Take(4).ToArray());
Your confusion is a very common one. The essential point is realizing that "reference types" and "passing by reference" (ref keyboard) are totally independent. In this specific case, since string[] is a reference type (as are all arrays), it means the object is not copied when you pass it around, hence you are always referring to the same object.
Modified Version of C# Code:
string[] strArr = new string[5];
strArr[0] = "I want that!\n";
strArr[1] = "I want this!\n";
strArr[2] = "I want those!\n";
strArr[3] = "I want these!\n";
strArr[4] = "I want them!\n";
printPartially(strArr.Skip(1).Take(4).ToArray());
void printPartially(string[] strArr)
{
foreach (string str in strArr)
{
Console.WriteLine(str);
}
}
Question is old, but maybe answer will be useful for someone.
As of C# 7.2 there are much more types to use in that case, ex. Span or Memory.
They allow exactly for the thing you mentioned in your question (and much more).
Here's great article about them
Currently, if you want to use them, remeber to add <LangVersion>7.2</LangVersion> in .csproj file of your project to use C# 7.2 features

In C# can you split a string into variables?

I recently came across a php code where a CSV string was split into two variables:
list($this->field_one, $this->field_two) = explode(",", $fields);
I turned this into:
string[] tmp = s_fields.Split(',');
field_one = tmp[0];
field_two = tmp[1];
Is there a C# equivalent without creating a temporary array?
Jon Skeet said the right thing. GC will do the thing, don't you worry.
But if you like the syntax so much (which is pretty considerable), you can use this, I guess.
public static class MyPhpStyleExtension
{
public void SplitInTwo(this string str, char splitBy, out string first, out string second)
{
var tempArray = str.Split(splitBy);
if (tempArray.length != 2) {
throw new NotSoPhpResultAsIExpectedException(tempArray.length);
}
first = tempArray[0];
second = tempArray[1];
}
}
I almost feel guilty by writing this code. Hope this will do the thing.
The answer to your quite narrow question is no. C# does not provide a 'multi-assignment' capability, so you cannot extract an arbitrary set of values from anything (such as Split()) and break them out into individual named variables.
There is a workaround for a specific number of variables, by writing a parameter with out arguments. See #vlad for an answer based on that.
But why would you want to? C# provides an impressive range of features that will allow you to take apart strings and deal them with the parts in a such a wide range of different ways that the lack of 'multi-assignment' should barely be noticed.
Parsing strings usually involves other operations such as dealing with formatting errors, trimming white space, case folding. There could be less than 2 strings, or more. Requirements could change over time. When you are ready for a more capable string parser, C# will be waiting.
You may use the following approach:
-Create a class that inherits from dynamic object.
-Create a local dictionary that will store your variables values in the inherited class.
-Override the TryGetMember and TrySetMember functions to get and set values from and into the dictionary.
-Now, you can split your string and put it in the dictionary, then access your variable like:
dynamicObject.var1

(string)combination Purpose?

I'm following an exercise which tasks me to...
"Declare two variables of type string with values "Hello" and "World".
Declare a variable of type object. Assign the value obtained of
concatenation of the two string variables (add space if necessary) to
this variable. Print the variable of type object".
Now here was my original solution:
string hi = "Hello";
string wo = "World";
object hiwo = hi + " " + wo;
Console.WriteLine(hiwo);
Console.ReadLine();
I found a good website that gives sample solutions of the exercises I am going through, which I have started to go through comparing to my answers, In this one I noticed I was nearly spot on, apart from an extra line. I've modified my original code to illustrate the comparison more easily.
My modified code:
string firstWord = "Hello";
string secondWord = "World";
object combination = firstWord + " " + secondWord;
Console.WriteLine(combination);
Given Solution:
string firstWord = "Hello";
string secondWord = "World";
object combination = firstWord + " " + secondWord;
string a = (string)combination;
Console.WriteLine(a);
I believe understanding this extra line is the purpose of the exercise. So my question is why is the extra line exists and what the benefits are to having it? The section of the book is understanding types and variables.
The extra line is a type cast:
A cast is a way of explicitly informing the compiler that you intend to make the conversion and that you are aware that data loss might occur.
Usually, a cast doesn't really return a different object. It just checks if the object is, at runtime, of the type you're casting to. That is, the expression firstWord + secondWord returns an object of type string. Assigning it to a variable of type object doesn't change the fact it's really a string. Similarly, doing (string) combination doesn't return a different object – it just tells the compiler that the expression is of type string. (If combination wasn't really a string, the check would fail and throw an exception.)
In this case there is no benefit to having it there I can see. Console.WriteLine(object) converts the object to a string internally, and an object that is already a string will just "convert" to itself.
In your solution when you call
Console.WriteLine(Combination)
.ToString() method is called internally. Therefore you don't feel the difference.
From MSDN
If value is null, only the line terminator is written. Otherwise, the ToString method of value is called to produce its string representation, and the resulting string is written to the standard output stream.
Whereas in the given solution object is first converted to string and then written.
To understand the difference let's take another example
TextBox tb = new TextBox();
Console.WriteLine(tb);
output would be System.Windows.Forms.TextBox, Text: that is the type of object
In your version what is happening in the line Console.WriteLine is a call to the virtual ToString method, which because of being virtual is in fact executed in its version implemented in the string class (which just returns the string).
The given solution explicitly casts the object into string. The difference is thus in increased readability - less things are happening behind the scene - it is made explicit that you're operating on a string instance.
The extra line is basic casting the object to a string type in order for it to be printed out.
Another way would be...
string firstWord = "hello";
string "secondWord = "world";
object combination = string.Format("{0} {1}", firstWord, secondWord);
Console.WriteLine(combination.ToString());

Why doesn't interning work on copies of a string?

Given:
object literal1 = "abc";
object literal2 = "abc";
object copiedVariable = string.Copy((string)literal1);
if (literal1 == literal2)
Console.WriteLine("objects are equal because of interning");//Are equal
if(literal1 == copiedVariable)
Console.WriteLine("copy is equal");
else
Console.WriteLine("copy not eq");//NOT equal
These results imply that copiedVariable is not subject to string interning. Why?
Is there a circumstance where its useful to have equivalent strings that are not interned or is this behavior due to some language detail?
If you think about it, the interning of strings is a process that it triggered at compile time on literals. Which implies that:
it is implicit when you assign/bind a literal to a variable
it is implicit when you copy a reference (i.e. string a = some_other_string_variable;)
On the other hand, if you create an instance of a string manually - at run-time by using a StringBuilder, or by Copy-ing, than you have to specifically request to intern it by invoking the Intern method of the String class.
Even in the remarks section of the documentation it is stated that:
The common language runtime conserves string storage by maintaining a
table, called the intern pool, that contains a single reference to
each unique literal string declared or created programmatically in
your program. Consequently, an instance of a literal string with a
particular value only exists once in the system. For example, if you
assign the same literal string to several variables, the runtime
retrieves the same reference to the literal string from the intern
pool and assigns it to each variable.
And the documentation for the Copy method of the String class states that it:
Creates a new instance of String with the same value as a specified
String.
which implies that it's not going to just return a reference to the same string (from the intern pool). Again, if it did there wouldn't be much use for it then, would there?!
Some languages requires the result be a copy for certain methods/procedures.
For example in substring type methods. The semantics would then be the same, even if if you call foo.substring(0, foo.length) (and how you would probably implement stringcopy).
Note: IIRC*, this is NOT the case with .NET's implementation of string.Substring though. It is not really clear from MSDN either. (see below)
It returns:
A string that is equivalent to the substring of length length that
begins at startIndex in this instance, or Empty if startIndex is equal
to the length of this instance and length is zero.
It notes:
This method does not modify the value of the current instance.
Instead, it returns a new string with length characters starting from
the startIndex position in the current string.
UPDATE
I remember correctly, it does indeed do a check with string InternalSubString(int startIndex, int length, bool fAlwaysCopy) if fAlwaysCopy is not false. Substring passes false to this method.
UPDATE 2
It looks like string.Copy could have used InternalSubString and passing true to the aforementioned parameter, but looking at the disassembly, it seems to use a slightly more optimized version and possibly save a method call.
Sorry for the redundant information.
* The reason I remember was when implementing the substring procedure for IronScheme, which the R6RS specification requires to make a copy :)

Categories

Resources