(string)combination Purpose? - c#

I'm following an exercise which tasks me to...
"Declare two variables of type string with values "Hello" and "World".
Declare a variable of type object. Assign the value obtained of
concatenation of the two string variables (add space if necessary) to
this variable. Print the variable of type object".
Now here was my original solution:
string hi = "Hello";
string wo = "World";
object hiwo = hi + " " + wo;
Console.WriteLine(hiwo);
Console.ReadLine();
I found a good website that gives sample solutions of the exercises I am going through, which I have started to go through comparing to my answers, In this one I noticed I was nearly spot on, apart from an extra line. I've modified my original code to illustrate the comparison more easily.
My modified code:
string firstWord = "Hello";
string secondWord = "World";
object combination = firstWord + " " + secondWord;
Console.WriteLine(combination);
Given Solution:
string firstWord = "Hello";
string secondWord = "World";
object combination = firstWord + " " + secondWord;
string a = (string)combination;
Console.WriteLine(a);
I believe understanding this extra line is the purpose of the exercise. So my question is why is the extra line exists and what the benefits are to having it? The section of the book is understanding types and variables.

The extra line is a type cast:
A cast is a way of explicitly informing the compiler that you intend to make the conversion and that you are aware that data loss might occur.
Usually, a cast doesn't really return a different object. It just checks if the object is, at runtime, of the type you're casting to. That is, the expression firstWord + secondWord returns an object of type string. Assigning it to a variable of type object doesn't change the fact it's really a string. Similarly, doing (string) combination doesn't return a different object – it just tells the compiler that the expression is of type string. (If combination wasn't really a string, the check would fail and throw an exception.)
In this case there is no benefit to having it there I can see. Console.WriteLine(object) converts the object to a string internally, and an object that is already a string will just "convert" to itself.

In your solution when you call
Console.WriteLine(Combination)
.ToString() method is called internally. Therefore you don't feel the difference.
From MSDN
If value is null, only the line terminator is written. Otherwise, the ToString method of value is called to produce its string representation, and the resulting string is written to the standard output stream.
Whereas in the given solution object is first converted to string and then written.
To understand the difference let's take another example
TextBox tb = new TextBox();
Console.WriteLine(tb);
output would be System.Windows.Forms.TextBox, Text: that is the type of object

In your version what is happening in the line Console.WriteLine is a call to the virtual ToString method, which because of being virtual is in fact executed in its version implemented in the string class (which just returns the string).
The given solution explicitly casts the object into string. The difference is thus in increased readability - less things are happening behind the scene - it is made explicit that you're operating on a string instance.

The extra line is basic casting the object to a string type in order for it to be printed out.
Another way would be...
string firstWord = "hello";
string "secondWord = "world";
object combination = string.Format("{0} {1}", firstWord, secondWord);
Console.WriteLine(combination.ToString());

Related

will Substring creates another instance C#?

I am new to C# string I am confused about the
Object.referenceEquals
I was reading some article which says ReferenceEquals check if it same instance or not in the program i am checking if object.ReferenceEquals(s1, s4) even though they point to same data why it is coming as false ?
string s1 = "akhil";
string s2 = "akhil";
Console.WriteLine(object.ReferenceEquals(s1, s2)); //true
s2 = "akhil jain";
Console.WriteLine(object.ReferenceEquals(s1, s2)); //false
//Console.WriteLine(s1 == s2);
//Console.WriteLine(s1.Equals(s2));
string s3 = "akhil";
//1".Substring(0, 5);
Console.WriteLine(s3+" " +s1);
Console.WriteLine(object.ReferenceEquals(s1,s3)); //true
string s4 = "akhil1".Substring(0, 5);
Console.WriteLine(object.ReferenceEquals(s1, s4)); //confusion false why as s4 data is same as s1
The references are the same because a string literal gets interned, Substring returns a new string and a new reference, it doesn't try to second guess your parameters and check the intern pool
String.Intern(String) Method
The common language runtime conserves string storage by maintaining a
table, called the intern pool, that contains a single reference to
each unique literal string declared or created programmatically in
your program. Consequently, an instance of a literal string with a
particular value only exists once in the system.
For example, if you assign the same literal string to several variables, the runtime retrieves the same reference to the literal
string from the intern pool and assigns it to each variable.
Though, useless fact 3454345.2, Since .Net 2, you have been able to turn it off for various reasons you may have
CompilationRelaxations Enum
NoStringInterning Marks an assembly as not requiring string-literal interning. In an application domain, the common
language runtime creates one string object for each unique string
literal, rather than making multiple copies. This behavior, called
string interning, internally requires building auxiliary tables that
consume memory resources.
When instantiating two object, the reference is not equal. The Object.ReferenceEquals method therefore returns false. However, strings are a very special case. If you declare a string in code, the CLR maintains it in a table. This is called the intern pool. This causes two strings that were instantiated with the same value to reference the same object in memory. This will cause Object.ReferenceEquals to return true.
When a string was formed by some operation in your code, it is not automatically interned to the pool. And therefore, it has a different reference, although the content of the string might be the same. This is also explained in the remarks of the documentation of Object.ReferenceEquals here.
Note that the String.Equals() method would return true. In C# you can also use the '==' operator on strings. See your adjusted code below.
string s1 = "akhil";
string s2 = "akhil";
Console.WriteLine(s1.Equals(s2)); //true
s2 = "akhil jain";
Console.WriteLine(s1.Equals(s2)); //false
string s3 = "akhil";
Console.WriteLine(s3 + " " + s1);
Console.WriteLine(s1.Equals(s3)); //true
string s4 = "akhil1".Substring(0, 5);
Console.WriteLine(s1.Equals(s4)); //this now returns true as well
Console.WriteLine(s1 == s4); //so does this
The value of object.ReferenceEquals is false since it checks if both the references point to the same object. ReferenceEquals does not check for data equality, but if both objects occupy the same memory address.
As TheGeneral already mentioned, string literals are interned and stored in a table called intern pool. This is to store string objects efficiently.
When a string literal is assigned to multiple variables, they are pointing to the same address in the intern pool. Hence, you get true for object.ReferenceEquals. But when you compare this with a substring, a new object has been created in the memory. This result in a false when reference is compared since they are two different objects occupying different memory locations.
All the dynamically created strings, or read from an external source are not interned automatically.
If you try the following, you will get true for object.ReferenceEquals:
Console.WriteLine(object.ReferenceEquals(s1, string.Intern(s4)));
You can check with Primitive data types that the ReferenceEquals returns false even when one variable is assigned to another.
int a = 10;
int b = a;
Console.WriteLine(ReferenceEquals(a, b)); //false
This is because each primitive type is stored separately.

In C#, can assignment and parameter passing happen in a single statement?

I am trying to do
string answer = (5 + 6).ToString();
Console.WriteLine(answer);
I would like to find out if assignment and parameter passing can happen simultaneously in a single statement for C#. The compiler doesn't like the following code.
Console.WriteLine(string answer = (5 + 6).ToString());
I would like to find out if assignment and parameter passing can happen simultaneously in a single statement for C#.
They can, but variable declaration can't. This code is not legal:
Console.WriteLine(string answer = (5 + 6).ToString());
But this code is legal:
string answer;
Console.WriteLine(answer = (5 + 6).ToString());
And thanks to assignment chainging, it will output the value 11. But it's probably not the best coding style you could use. For preference, I'd write it like this:
string answer = (5 + 6).ToString();
Console.WriteLine(answer);
It's also not best to think of it as "simultaneous". They are in the same C# statement, but won't be in the same IL statement when compiled, and are not atomic.
Edit: Never mind, I am rusty apparently. Check out the other answer: assignment expressions do exist in C#, but they cannot contain variable declarations.
In C#, assignment is a statement and not an expression. Statements that are not expressions do not have values (unlike C++, where an assignment has a value, namely, the newly assigned value). This is mainly to prefer errors such as if (x=5) ..., since x=5 cannot have a value, this code will fail to build.
If you want to declare, assign, and then use, you're pretty much out of luck, as declaration cannot be used as an expression.
If you only want to assign and use, then using a function with a ref parameter could take care of it.
Example:
T AssignAndGet<T>(ref T variable, T result) {
variable = result;
return result;
}
which could be used as:
string x;
Console.WriteLine(AssignAndGet(x, (5+3).ToString()));

C# Changing a string after it has been created

Okay I know this question is painfully simple, and I'll admit that I am pretty new to C# as well. But the title doesn't describe the entire situation here so hear me out.
I need to alter a URL string which is being created in a C# code behind, removing the substring ".aspx" from the end of the string. So basically I know that my URL, coming into this class, will be something like "Blah.aspx" and I want to get rid of the ".aspx" part of that string. I assume this is quite easy to do by just finding that substring, and removing it if it exists (or some similar strategy, would appreciate if someone has an elegant solution for it if they've thought done it before). Here is the problem:
"Because strings are immutable, it is not possible (without using unsafe code) to modify the value of a string object after it has been created." This is from the MSDN official website. So I'm wondering now, if strings are truly immutable, then I simply can't (shouldn't) alter the string after it has been made. So how can I make sure that what I'm planning to do is safe?
You don't change the string, you change the variable. Instead of that variable referring to a string such as "foo.aspx", alter it to point to a new string that has the value "foo".
As an analogy, adding one to the number two doesn't change the number two. Two is still just the same as it always way, you have changed a variable from referring to one number to refer to another.
As for your specific case, EndsWith and Remove make it easy enough:
if (url.EndsWith(".aspx"))
url = url.Remove(url.Length - ".aspx".Length);
Note here that Remove is taking one string, an integer, and giving us a brand new string, which we need to assign back to our variable. It doesn't change the string itself.
Also note that there is a URI class that you can use for parsing URLs, and it will be able to handle all of the complex situations that can arise, including hashes, query parameters, etc. You should use that to parse out the aspects of a URL that you are interested in.
String immutability is not a problem for normal usage -- it just means that member functions like "Replace", instead of modifying the existing string object, return a new one. In practical terms that usually just means you have to remember to copy the change back to the original, like:
string x = "Blah.aspx";
x.Replace(".aspx", ""); // still "Blah.aspx"
x = x.Replace(".aspx", ""); // now "Blah"
The weirdness around strings comes from the fact that System.String inherits System.Object, yet, because of its immutability, behaves like a value type rather than an object. For example, if you pass a string into a function, there's no way to modify it, unless you pass it by reference:
void Test(string y)
{
y = "bar";
}
void Test(ref string z)
{
z = "baz";
}
string x = "foo";
Test(x); // x is still "foo"
Test(ref x); // x is now "baz"
A String in C# is immutable, as you say. Meaning that this would create multiple String objects in memory:
String s = "String of numbers 0";
s += "1";
s += "2";
So, while the variable s would return to you the value String of numbers 012, internally it required the creation of three strings in memory to accomplish.
In your particular case, the solution is quite simple:
String myPath = "C:\\folder1\\folder2\\myFile.aspx";
myPath = Path.Combine(Path.GetDirectoryName(myPath), Path.GetFileNameWithoutExtension(myPath));
Again, this appears as if myPath has changed, but it really has not. An internal copy and assign took place and you get to keep using the same variable.
Also, if you must preserve the original variable, you could simply make a new variable:
String myPath = "C:\\folder1\\folder2\\myFile.aspx";
String thePath = Path.Combine(Path.GetDirectoryName(myPath), Path.GetFileNameWithoutExtension(myPath));
Either way, you end up with a variable you can use.
Note that the use of the Path methods ensures you get proper path operations, and not blind String replacements that could have unintended side-effects.
String.Replace() will not modify the string. It will create a new one. So the following code:
String myUrl = #"http://mypath.aspx";
String withoutExtension = myUrl.Replace(".aspx", "");
will create a brand-new string which is assigned to withoutExtension.

Why doesn't interning work on copies of a string?

Given:
object literal1 = "abc";
object literal2 = "abc";
object copiedVariable = string.Copy((string)literal1);
if (literal1 == literal2)
Console.WriteLine("objects are equal because of interning");//Are equal
if(literal1 == copiedVariable)
Console.WriteLine("copy is equal");
else
Console.WriteLine("copy not eq");//NOT equal
These results imply that copiedVariable is not subject to string interning. Why?
Is there a circumstance where its useful to have equivalent strings that are not interned or is this behavior due to some language detail?
If you think about it, the interning of strings is a process that it triggered at compile time on literals. Which implies that:
it is implicit when you assign/bind a literal to a variable
it is implicit when you copy a reference (i.e. string a = some_other_string_variable;)
On the other hand, if you create an instance of a string manually - at run-time by using a StringBuilder, or by Copy-ing, than you have to specifically request to intern it by invoking the Intern method of the String class.
Even in the remarks section of the documentation it is stated that:
The common language runtime conserves string storage by maintaining a
table, called the intern pool, that contains a single reference to
each unique literal string declared or created programmatically in
your program. Consequently, an instance of a literal string with a
particular value only exists once in the system. For example, if you
assign the same literal string to several variables, the runtime
retrieves the same reference to the literal string from the intern
pool and assigns it to each variable.
And the documentation for the Copy method of the String class states that it:
Creates a new instance of String with the same value as a specified
String.
which implies that it's not going to just return a reference to the same string (from the intern pool). Again, if it did there wouldn't be much use for it then, would there?!
Some languages requires the result be a copy for certain methods/procedures.
For example in substring type methods. The semantics would then be the same, even if if you call foo.substring(0, foo.length) (and how you would probably implement stringcopy).
Note: IIRC*, this is NOT the case with .NET's implementation of string.Substring though. It is not really clear from MSDN either. (see below)
It returns:
A string that is equivalent to the substring of length length that
begins at startIndex in this instance, or Empty if startIndex is equal
to the length of this instance and length is zero.
It notes:
This method does not modify the value of the current instance.
Instead, it returns a new string with length characters starting from
the startIndex position in the current string.
UPDATE
I remember correctly, it does indeed do a check with string InternalSubString(int startIndex, int length, bool fAlwaysCopy) if fAlwaysCopy is not false. Substring passes false to this method.
UPDATE 2
It looks like string.Copy could have used InternalSubString and passing true to the aforementioned parameter, but looking at the disassembly, it seems to use a slightly more optimized version and possibly save a method call.
Sorry for the redundant information.
* The reason I remember was when implementing the substring procedure for IronScheme, which the R6RS specification requires to make a copy :)

Why should I never use an unsafe block to modify a string?

I have a String which I would like to modify in some way. For example: reverse it or upcase it.
I have discovered that the fastest way to do this is by using a unsafe block and pointers.
For example:
unsafe
{
fixed (char* str = text)
{
*str = 'X';
}
}
Are there any reasons why I should never ever do this?
The .Net framework requires strings to be immutable. Due to this requirement it is able to optimise all sorts of operations.
String interning is one great example of this requirement is leveraged heavily. To speed up some string comparisons (and reduce memory consumption) the .Net framework maintains a Dictionary of pointers, all pre-defined strings will live in this dictionary or any strings where you call the String.intern method on. When the IL instruction ldstr is called it will check the interned dictionary and avoid memory allocation if we already have the string allocated, note: String.Concat will not check for interned strings.
This property of the .net framework means that if you start mucking around directly with strings you can corrupt your intern table and in turn corrupt other references to the same string.
For example:
// these strings get interned
string hello = "hello";
string hello2 = "hello";
string helloworld, helloworld2;
helloworld = hello;
helloworld += " world";
helloworld2 = hello;
helloworld2 += " world";
unsafe
{
// very bad, this changes an interned string which affects
// all app domains.
fixed (char* str = hello2)
{
*str = 'X';
}
fixed (char* str = helloworld2)
{
*str = 'X';
}
}
Console.WriteLine("hello = {0} , hello2 = {1}", hello, hello2);
// output: hello = Xello , hello2 = Xello
Console.WriteLine("helloworld = {0} , helloworld2 = {1}", helloworld, helloworld2);
// output : helloworld = hello world , helloworld2 = Xello world
Are there any reasons why I should never ever do this?
Yes, very simple: Because .NET relies on the fact that strings are immutable. Some operations (e.g. s.SubString(0, s.Length)) actually return a reference to the original string. If this now gets modified, all other references will as well.
Better use a StringBuilder to modify a string since this is the default way.
Put it this way: how would you feel if another programmer decided to replace 0 with 1 everywhere in your code, at execution time? It would play hell with all your assumptions. The same is true with strings. Everyone expects them to be immutable, and codes with that assumption in mind. If you violate that, you are likely to introduce bugs - and they'll be really hard to trace.
Oh dear lord yes.
1) Because that class is not designed to be tampered with.
2) Because strings are designed and expected throughout the framework to be immutable. That means that code that everyone else writes (including MSFT) is expecting a string's underlying value never to change.
3) Because this is premature optimization and that is E V I L.
Agreed about StringBuilder, or just convert your string to an array of chars/bytes and work there. Also, you gave the example of "upcasing" -- the String class has a ToUpper method, and if that's not at least as fast as your unsafe "upcasing", I'll eat my hat.

Categories

Resources