Advanced Java topics for a C# programmer

I'm a C# programmer writing Java (for Android). There are a few technicalities of Java I'm still not sure about, and I'm worried I'm approaching them from a C# angle:
Are parameters in methods passed in the same manner as C#? (Copied for reference types)
Why has the @Override attribute suddenly appeared (I think it's Java 1.5+?)
How is it possible to compile an application if you are missing a dependency for one of the libraries you're using?
Do I need to worry about using += for large string concatenation (e.g. use a StringBuilder instead)?
Does Java have operator overloading: should I use equals() or == by default? Basically, is object.equals roughly the same as in C# (reflection for value types, reference comparison for reference types)?

Answering in order:
Yes, arguments are passed by value in Java, always. That includes references, which are copied (rather than objects being copied). Note that Java doesn't have any equivalent of ref or out.
@Override allows you to catch typos at compile-time, just like the "override" modifier in C#.
You compile an application only against the libraries you're using. There's no transitive dependency checking, just like there isn't in .NET.
Yes, you should avoid concatenating strings in a loop, just like in .NET. Single-statement concatenation is fine though, just as it is in .NET - and again, the compiler will perform concatenation of constant expressions at compile time, and string literals are interned.
No, Java doesn't have user-defined operator overloading. There are no custom value types in Java; == always compares the primitive value or the reference. (If you compare an Object reference and a primitive value, autoboxing/unboxing gets involved, and I can never remember which way round this works. It's a bad idea though, IMO.)

Are parameters in methods passed in the same manner as C#? (Copied for reference types)
All primitive types are copied. Object variables are really references (pointers) to objects; the reference is copied, but the object itself isn't.
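A minimal sketch of what that means in practice (the class and method names here are made up for illustration):

class PassByValueDemo {
    static void mutate(StringBuilder sb) {
        sb.append(" world");                  // visible to the caller: the copied
                                              // reference points at the same object
        sb = new StringBuilder("reassigned"); // invisible to the caller: only the
                                              // local copy of the reference changes
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("hello");
        mutate(sb);
        System.out.println(sb); // prints "hello world", not "reassigned"
    }
}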
Why has the @Override attribute suddenly appeared (I think it's Java 1.5+?)
It hasn't suddenly appeared; it came in with Java 1.5, and since Java 1.6 you can also use it to show a method is implementing an interface method. @Override allows you to indicate to the compiler that you think you are overriding a method, and the compiler will warn you when you aren't (this is very useful actually, especially if the super class changes).
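A quick sketch of the typo-catching in action (Base and Derived are hypothetical names):

class Base {
    void close() { }
}

class Derived extends Base {
    @Override
    void close() { }   // compiles: genuinely overrides Base.close

    // @Override
    // void cloze() { } // if uncommented, this is a compile-time error:
    //                  // no method in a supertype matches "cloze"
}

Without @Override, the misspelled cloze() would silently become a brand-new method instead of an override.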
How is it possible to compile an application if you are missing a dependency for one of the libraries you're using?
I don't think it is.
Do I need to worry about using += for large string concatenation (e.g. use a StringBuilder instead)?
In some cases. The Java compiler + VM are very good at automatically using StringBuilder for you. However, they will not always do this. I wouldn't optimize for this (or anything) beforehand.
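The case the first answer warns about is concatenation in a loop. A sketch of the difference (parts is assumed to be a String[]):

String slow = "";
for (String part : parts) {
    slow += part;             // each += copies everything built so far: O(n^2) overall
}

StringBuilder sb = new StringBuilder();
for (String part : parts) {
    sb.append(part);          // amortized constant-time append: O(n) overall
}
String fast = sb.toString();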
Does Java have operator overloading: should I use equals() or == by default? Basically, is object.equals roughly the same as in C# (reflection for value types, reference comparison for reference types)?
No, it doesn't have operator overloading.
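So by default, use equals() when you care about contents and == when you care about identity (or are comparing primitives). A small sketch, including the autoboxing trap mentioned earlier:

String a = new String("hi");
String b = new String("hi");
System.out.println(a == b);        // false: two distinct objects
System.out.println(a.equals(b));   // true: same contents

Integer x = 1000, y = 1000;
System.out.println(x == y);        // may be false: compares the boxed references
                                   // (only -128..127 are guaranteed to be cached)
System.out.println(x.equals(y));   // true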

About String concatenation: I think it depends on whether you add constant strings together or concatenate variables. With variables it's better to use StringBuilder. :)
So:
String s = "A";
s += "B";
s += "C";
is fine.
But here it's better to use a StringBuilder:
String s = "A";
s += variable_b;
s += variable_c;
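For reference, the StringBuilder version of that second snippet would be (variable_b and variable_c are assumed to be Strings, as in the answer):

StringBuilder sb = new StringBuilder("A");
sb.append(variable_b);
sb.append(variable_c);
String s = sb.toString();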

Related

C# 7.2 use of "in parameter" for operators

In C# 7.2, we saw the introduction of the in modifier for method parameters to pass read-only references to objects. I'm working on a new .NET Standard project using 7.2, and out of curiosity I tried compiling with the in keyword on the parameters for the equality operators for a struct.
i.e. - public static bool operator == (in Point l, in Point r)
not - public static bool operator == (Point l, Point r)
I was initially a bit surprised that this worked, but after thinking about it a bit I realized that there is probably no functional difference between the two versions of the operator. I wanted to confirm these suspicions, but after a somewhat thorough search around, I can't find anything that explicitly talks about using the in keyword in operator overloads.
So my question is whether or not this actually has a functional difference, and if it does, is there any particular reason to encourage or discourage the use of in with operator arguments. My initial thoughts are that there is no difference, particularly if the operator is inlined. However, if it does make a difference, it seems like in parameters should be used everywhere (everywhere that readonly references make sense, that is), as they provide a speed bonus and, unlike ref and out, don't require the user to prepend those keywords when passing objects. This would allow more efficient value-type passing without a single change on the part of users of the methods and operators.
Overall, this may go beyond the sort of small-scale optimizations that most C# developers worry about, but I am curious as to whether or not it has an effect.
whether or not this actually has a functional difference... My initial thoughts are that there is no difference, particularly if the operator is inlined
Since an operator == overload is invoked like a regular static method in MSIL, there is a functional difference: just as with a regular method, in can help to avoid unnecessary copying.
is there any particular reason to encourage or discourage the use of in with operator arguments.
According to this article, it is recommended to apply the in modifier when value types are larger than System.IntPtr.Size. But it is important that the value type be a readonly struct. Otherwise the in modifier can harm performance, because the compiler will create a defensive copy when calling the struct's methods and properties, since they can change the state of the argument.
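Putting the answer's two recommendations together, a sketch (Point3D is a made-up struct, larger than IntPtr.Size and declared readonly so the compiler never needs defensive copies):

public readonly struct Point3D
{
    public readonly double X, Y, Z;   // 24 bytes: larger than System.IntPtr.Size
    public Point3D(double x, double y, double z) { X = x; Y = y; Z = z; }

    // Both operands arrive by read-only reference: no 24-byte copy per call.
    public static bool operator ==(in Point3D l, in Point3D r)
        => l.X == r.X && l.Y == r.Y && l.Z == r.Z;

    public static bool operator !=(in Point3D l, in Point3D r) => !(l == r);

    public override bool Equals(object obj) => obj is Point3D p && this == p;
    public override int GetHashCode() => (X, Y, Z).GetHashCode();
}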

Is there any reason to use string over String in c# (IDE0001) [duplicate]

What are the differences between these two and which one should I use?
string s = "Hello world!";
String s = "Hello world!";
string is an alias in C# for System.String.
So technically, there is no difference. It's like int vs. System.Int32.
As far as guidelines, it's generally recommended to use string any time you're referring to an object.
e.g.
string place = "world";
Likewise, I think it's generally recommended to use String if you need to refer specifically to the class.
e.g.
string greet = String.Format("Hello {0}!", place);
This is the style that Microsoft tends to use in their examples.
It appears that the guidance in this area may have changed, as StyleCop now enforces the use of the C# specific aliases.
Just for the sake of completeness, here's a brain dump of related information...
As others have noted, string is an alias for System.String. Assuming your code using String compiles to System.String (i.e. you haven't got a using directive for some other namespace with a different String type), they compile to the same code, so at execution time there is no difference whatsoever. This is just one of the aliases in C#. The complete list is:
object: System.Object
string: System.String
bool: System.Boolean
byte: System.Byte
sbyte: System.SByte
short: System.Int16
ushort: System.UInt16
int: System.Int32
uint: System.UInt32
long: System.Int64
ulong: System.UInt64
float: System.Single
double: System.Double
decimal: System.Decimal
char: System.Char
Apart from string and object, the aliases are all to value types. decimal is a value type, but not a primitive type in the CLR. The only primitive type which doesn't have an alias is System.IntPtr.
In the spec, the value type aliases are known as "simple types". Literals can be used for constant values of every simple type; no other value types have literal forms available. (Compare this with VB, which allows DateTime literals, and has an alias for it too.)
There is one circumstance in which you have to use the aliases: when explicitly specifying an enum's underlying type. For instance:
public enum Foo : UInt32 {} // Invalid
public enum Bar : uint {} // Valid
That's just a matter of the way the spec defines enum declarations - the part after the colon has to be the integral-type production, which is one token of sbyte, byte, short, ushort, int, uint, long, ulong, char... as opposed to a type production as used by variable declarations for example. It doesn't indicate any other difference.
Finally, when it comes to which to use: personally I use the aliases everywhere for the implementation, but the CLR type for any APIs. It really doesn't matter too much which you use in terms of implementation - consistency among your team is nice, but no-one else is going to care. On the other hand, it's genuinely important that if you refer to a type in an API, you do so in a language-neutral way. A method called ReadInt32 is unambiguous, whereas a method called ReadInt requires interpretation. The caller could be using a language that defines an int alias for Int16, for example. The .NET framework designers have followed this pattern, good examples being in the BitConverter, BinaryReader and Convert classes.
String stands for System.String and it is a .NET Framework type. string is an alias in the C# language for System.String. Both of them are compiled to System.String in IL (Intermediate Language), so there is no difference. Choose what you like and use that. If you code in C#, I'd prefer string as it's a C# type alias and well-known by C# programmers.
I can say the same about (int, System.Int32), etc.
The best answer I have ever heard about using the provided type aliases in C# comes from Jeffrey Richter in his book CLR Via C#. Here are his 3 reasons:
I've seen a number of developers confused, not knowing whether to use string or String in their code. Because in C# the string (a keyword) maps exactly to System.String (an FCL type), there is no difference and either can be used.
In C#, long maps to System.Int64, but in a different programming language, long could map to an Int16 or Int32. In fact, C++/CLI does in fact treat long as an Int32. Someone reading source code in one language could easily misinterpret the code's intention if he or she were used to programming in a different programming language. In fact, most languages won't even treat long as a keyword and won't compile code that uses it.
The FCL has many methods that have type names as part of their method names. For example, the BinaryReader type offers methods such as ReadBoolean, ReadInt32, ReadSingle, and so on, and the System.Convert type offers methods such as ToBoolean, ToInt32, ToSingle, and so on. Although it's legal to write the following code, the line with float feels very unnatural to me, and it's not obvious that the line is correct:
BinaryReader br = new BinaryReader(...);
float val = br.ReadSingle(); // OK, but feels unnatural
Single val = br.ReadSingle(); // OK and feels good
So there you have it. I think these are all really good points. However, I don't find myself using Jeffrey's advice in my own code. Maybe I'm too stuck in my C# world, but I end up trying to make my code look like the framework code.
string is a reserved word, but String is just a class name.
This means that string cannot be used as a variable name by itself.
If for some reason you wanted a variable called string, you'd see only the first of these compiles:
StringBuilder String = new StringBuilder(); // compiles
StringBuilder string = new StringBuilder(); // doesn't compile
If you really want a variable name called string you can use @ as a prefix:
StringBuilder @string = new StringBuilder();
Another critical difference: Stack Overflow highlights them differently.
There is one difference - you can't use String without using System; beforehand.
It's been covered above; however, you can't use string in reflection; you must use String.
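Presumably that refers to name-based reflection lookups, where only the CLR type name works; a quick sketch (assumes using System;):

Type t1 = Type.GetType("System.String"); // works: the CLR type name
Type t2 = Type.GetType("string");        // null: "string" is only a C# alias
Console.WriteLine(typeof(string) == t1); // True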
System.String is the .NET string class - in C# string is an alias for System.String - so in use they are the same.
As for guidelines I wouldn't get too bogged down and just use whichever you feel like - there are more important things in life and the code is going to be the same anyway.
If you find yourself building systems where it is necessary to specify the size of the integers you are using, and so tend to use Int16, Int32, UInt16, UInt32, etc., then it might look more natural to use String - and when moving around between different .NET languages it might make things more understandable - otherwise I would use string and int.
I prefer the capitalized .NET types (rather than the aliases) for formatting reasons. The .NET types are colored the same as other object types (the value types are proper objects, after all).
Conditional and control keywords (like if, switch, and return) are lowercase and colored dark blue (by default). And I would rather not have the disagreement in use and format.
Consider:
String someString;
string anotherString;
string and String are identical in all ways (except the uppercase "S"). There are no performance implications either way.
Lowercase string is preferred in most projects due to the syntax highlighting
This YouTube video demonstrates practically how they differ.
But now for a long textual answer.
When we talk about .NET there are two different things: there is the .NET Framework, and there are languages (C#, VB.NET, etc.) which use that framework.
"System.String" a.k.a "String" (capital "S") is a .NET framework data type while "string" is a C# data type.
In short "String" is an alias (the same thing called with different names) of "string". So technically both the below code statements will give the same output.
String s = "I am String";
or
string s = "I am String";
In the same way, there are aliases for other C# data types as shown below:
object: System.Object, string: System.String, bool: System.Boolean, byte: System.Byte, sbyte: System.SByte, short: System.Int16 and so on.
Now the million-dollar question from the programmer's point of view: so when to use "String" and "string"?
The first thing, to avoid confusion: use one of them consistently. But from a best-practices perspective, when you do a variable declaration it's good to use "string" (small "s"), and when you are using it as a class name then "String" (capital "S") is preferred.
In the below code, the left-hand side is a variable declaration and it is declared using "string". On the right-hand side we are calling a static method, so "String" is more sensible.
string s = String.Concat("Hello", " world");
C# is a language which is used together with the CLR.
string is a type in C#.
System.String is a type in the CLR.
When you use C# together with the CLR string will be mapped to System.String.
Theoretically, you could implement a C#-compiler that generated Java bytecode. A sensible implementation of this compiler would probably map string to java.lang.String in order to interoperate with the Java runtime library.
Lower case string is an alias for System.String.
They are the same in C#.
There's a debate over whether you should use the System types (System.Int32, System.String, etc.) types or the C# aliases (int, string, etc). I personally believe you should use the C# aliases, but that's just my personal preference.
string is just an alias for System.String. The compiler will treat them identically.
The only practical difference is the syntax highlighting as you mention, and that you have to write using System if you use String.
Both are the same, but from a coding-guidelines perspective it's better to use string instead of String. This is what developers generally use. E.g., instead of using Int32 we use int, as int is an alias for Int32.
FYI
“The keyword string is simply an alias for the predefined class System.String.” - C# Language Specification 4.2.3
http://msdn2.microsoft.com/En-US/library/aa691153.aspx
As the others are saying, they're the same. StyleCop rules, by default, will enforce the use of string as a C# code-style best practice, except when referencing System.String static functions, such as String.Format, String.Join, String.Concat, etc.
New answer after 6 years and 5 months (procrastination).
While string is a reserved C# keyword that always has a fixed meaning, String is just an ordinary identifier which could refer to anything. Depending on members of the current type, the current namespace and the applied using directives and their placement, String could be a value or a type distinct from global::System.String.
I shall provide two examples where using directives will not help.
First, when String is a value of the current type (or a local variable):
class MySequence<TElement>
{
    public IEnumerable<TElement> String { get; set; }

    void Example()
    {
        var test = String.Format("Hello {0}.", DateTime.Today.DayOfWeek);
    }
}
The above will not compile because IEnumerable<> does not have a non-static member called Format, and no extension methods apply. In the above case, it may still be possible to use String in other contexts where a type is the only possibility syntactically. For example String local = "Hi mum!"; could be OK (depending on namespace and using directives).
Worse: Saying String.Concat(someSequence) will likely (depending on usings) go to the Linq extension method Enumerable.Concat. It will not go to the static method string.Concat.
Secondly, when String is another type, nested inside the current type:
class MyPiano
{
    protected class String
    {
    }

    void Example()
    {
        var test1 = String.Format("Hello {0}.", DateTime.Today.DayOfWeek);
        String test2 = "Goodbye";
    }
}
Neither statement in the Example method compiles. Here String is always a piano string, MyPiano.String. No member (static or not) Format exists on it (or is inherited from its base class). And the value "Goodbye" cannot be converted into it.
Using System types makes it easier to port between C# and VB.Net, if you are into that sort of thing.
Against what seems to be common practice among other programmers, I prefer String over string, just to highlight the fact that String is a reference type, as Jon Skeet mentioned.
string is an alias (or shorthand) for System.String. That means that by typing string we mean System.String. You can read more at this link: 'string' is an alias/shorthand of System.String.
I'd just like to add this to lfoust's answer, from Richter's book:
The C# language specification states, “As a matter of style, use of the keyword is favored over use of the complete system type name.” I disagree with the language specification; I prefer to use the FCL type names and completely avoid the primitive type names. In fact, I wish that compilers didn’t even offer the primitive type names and forced developers to use the FCL type names instead. Here are my reasons:
I’ve seen a number of developers confused, not knowing whether to use string or String in their code. Because in C# string (a keyword) maps exactly to System.String (an FCL type), there is no difference and either can be used. Similarly, I’ve heard some developers say that int represents a 32-bit integer when the application is running on a 32-bit OS and that it represents a 64-bit integer when the application is running on a 64-bit OS. This statement is absolutely false: in C#, an int always maps to System.Int32, and therefore it represents a 32-bit integer regardless of the OS the code is running on. If programmers would use Int32 in their code, then this potential confusion is also eliminated.
In C#, long maps to System.Int64, but in a different programming language, long could map to an Int16 or Int32. In fact, C++/CLI does treat long as an Int32. Someone reading source code in one language could easily misinterpret the code’s intention if he or she were used to programming in a different programming language. In fact, most languages won’t even treat long as a keyword and won’t compile code that uses it.
The FCL has many methods that have type names as part of their method names. For example, the BinaryReader type offers methods such as ReadBoolean, ReadInt32, ReadSingle, and so on, and the System.Convert type offers methods such as ToBoolean, ToInt32, ToSingle, and so on. Although it’s legal to write the following code, the line with float feels very unnatural to me, and it’s not obvious that the line is correct:
BinaryReader br = new BinaryReader(...);
float val = br.ReadSingle(); // OK, but feels unnatural
Single val = br.ReadSingle(); // OK and feels good
Many programmers that use C# exclusively tend to forget that other programming languages can be used against the CLR, and because of this, C#-isms creep into the class library code. For example, Microsoft’s FCL is almost exclusively written in C# and developers on the FCL team have now introduced methods into the library such as Array’s GetLongLength, which returns an Int64 value that is a long in C# but not in other languages (like C++/CLI). Another example is System.Linq.Enumerable’s LongCount method.
I didn't get his opinion before I read the complete paragraph.
String (System.String) is a class in the base class library. string (lower case) is a reserved word in C# that is an alias for System.String. Int32 vs. int is a similar situation, as is Boolean vs. bool. These C#-specific keywords enable you to declare primitives in a style similar to C.
It's a matter of convention, really. string just looks more like C/C++ style. The general convention is to use whatever shortcuts your chosen language has provided (int/Int for Int32). This goes for "object" and decimal as well.
Theoretically this could help to port code into some future 64-bit standard in which "int" might mean Int64, but that's not the point, and I would expect any upgrade wizard to change any int references to Int32 anyway just to be safe.
@JaredPar (a developer on the C# compiler and prolific SO user!) wrote a great blog post on this issue. I think it is worth sharing here. It is a nice perspective on our subject.
string vs. String is not a style debate
[...]
The keyword string has concrete meaning in C#. It is the type System.String which exists in the core runtime assembly. The runtime intrinsically understands this type and provides the capabilities developers expect for strings in .NET. Its presence is so critical to C# that if that type doesn’t exist the compiler will exit before attempting to even parse a line of code. Hence string has a precise, unambiguous meaning in C# code.
The identifier String though has no concrete meaning in C#. It is an identifier that goes through all the name lookup rules as Widget, Student, etc … It could bind to string or it could bind to a type in another assembly entirely whose purposes may be entirely different than string. Worse it could be defined in a way such that code like String s = "hello"; continued to compile.
class TricksterString {
    void Example() {
        String s = "Hello World"; // Okay but probably not what you expect.
    }
}

class String {
    public static implicit operator String(string s) => null;
}
The actual meaning of String will always depend on name resolution. That means it depends on all the source files in the project and all the types defined in all the referenced assemblies. In short, it requires quite a bit of context to know what it means.
True that in the vast majority of cases String and string will bind to the same type. But using String still means developers are leaving their program up to interpretation in places where there is only one correct answer. When String does bind to the wrong type it can leave developers debugging for hours, filing bugs on the compiler team, and generally wasting time that could’ve been saved by using string.
Another way to visualize the difference is with this sample:
string s1 = 42; // Errors 100% of the time
String s2 = 42; // Might error, might not, depends on the code
Many will argue that while this information is technically accurate, using String is still fine because it's exceedingly rare that a codebase would define a type of this name. Or that when String is defined, it's a sign of a bad codebase.
[...]
You’ll see that String is defined for a number of completely valid purposes: reflection helpers, serialization libraries, lexers, protocols, etc … For any of these libraries String vs. string has real consequences depending on where the code is used.
So remember when you see the String vs. string debate this is about semantics, not style. Choosing string gives crisp meaning to your codebase. Choosing String isn’t wrong but it’s leaving the door open for surprises in the future.
Note: I copied/pasted most of the blog post for archival reasons. I omitted some parts, so I recommend reading the full blog post if you can.
String is not a keyword, so it can be used as an identifier, whereas string is a keyword and cannot be used as an identifier. Functionally both are the same.
Coming late to the party: I use the CLR types 100% of the time (well, except if forced to use the C# type, but I don't remember when the last time that was).
I originally started doing this years ago, as per the CLR books by Richter. It made sense to me that all CLR languages ultimately have to be able to support the set of CLR types, so using the CLR types yourself provided clearer, and possibly more "reusable", code.
Now that I've been doing it for years, it's a habit and I like the coloration that VS shows for the CLR types.
The only real downer is that auto-complete uses the C# type, so I end up re-typing automatically generated types to specify the CLR type instead.
Also, now, when I see "int" or "string", it just looks really wrong to me, like I'm looking at 1970's C code.
There is no difference.
The C# keyword string maps to the .NET type System.String - it is an alias that keeps to the naming conventions of the language.
Similarly, int maps to System.Int32.
There's a quote on this issue from Daniel Solis' book.
All the predefined types are mapped directly to underlying .NET types. The C# type names (string) are simply aliases for the .NET types (String or System.String), so using the .NET names works fine syntactically, although this is discouraged. Within a C# program, you should use the C# names rather than the .NET names.
Yes, there's no difference between them, just like with bool and Boolean.
string is a keyword, and you can't use string as an identifier.
String is not a keyword, and you can use it as an identifier:
Example
string String = "I am a string";
The keyword string is an alias for System.String. Aside from the keyword issue, the two are exactly equivalent.
typeof(string) == typeof(String) == typeof(System.String)

Is using var actually slow? If so, why?

I am learning C# and .NET, and I frequently use the keyword var in my code. I got the idea from Eric Lippert and I like how it increases my code's maintainability.
I am wondering, though... much has been written in blogs about slow heap-located refs, yet I am not observing this myself. Is this actually slow? I am referring to slow compile times due to type inference.
You state:
I am referring to slow time for compile due to type 'inferencing'
This does not slow down the compiler. The compiler already has to know the result type of the expression, in order to check compatibility (direct or indirect) of the assignment. In some ways using this already-known type removes a few things (the potential to have to check for inheritance, interfaces and conversion operators, for example).
It also doesn't slow down the runtime: var variables are statically compiled, exactly like regular C# variables (which they are).
In short... it doesn't.
'var' in C# is not a VARIANT like you're used to in VB. var is simply syntactic sugar the compiler lets you use to shorthand the type. The compiler figures out the type of the right-hand side of the expression and sets your variable to that type. It has no performance impact at all - just the same as if you'd typed the full type expression:
var x = new X();
exactly the same as
X x = new X();
This seems like a trivial example, and it is. This really shines when the expression is much more complex or even 'unexpressable' (like anonymous types) and enumerables.
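For instance, an anonymous type has no name you could write on the left-hand side, so var is the only option there. A minimal sketch (VarDemo is a made-up name):

using System;

class VarDemo
{
    static void Main()
    {
        // Required: the anonymous type has no name you can write out.
        var person = new { Name = "Ada", Born = 1815 };
        Console.WriteLine(person.Name + ", " + person.Born);

        // Merely convenient: compiles to exactly int[] numbers = ...
        var numbers = new int[] { 1, 2, 3 };
        foreach (var n in numbers) Console.WriteLine(n * n);
    }
}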
Var is replaced at compile time with your actual variable type. Are you thinking of dynamic?
A "variant" is typeless, so access to state (or internal state conversion) always must go through two steps: (1) Determine the "real" internal type, and (2) Extract the relevant state from that "real" internal type.
You do not have that two-step process when you start with a typed object.
True, a "variant" thus has this additional overhead. The appropriate use is in those cases where you want the convenience of any-type for code simplicity, like is done with most scripting languages, or very high-level APIs. In those cases, the "variant" overhead is often not significant (since you are working at a high-level API anyway).
If you're talking about "var", though, then that is merely a convenience way for you to say, "Compiler, put the proper type here" because you don't want to do that work, and the compiler should be able to figure it out. In that case, "var" doesn't represent a (runtime) "variant", but rather a mere source-code specification syntax.
The compiler infers the type from the expression on the right-hand side of the assignment.
var myString = "123"; is no different from string myString = "123";
Also, generally speaking, reference types live on the heap and value types live on the stack, regardless of whether they're declared using var.

Why is C# statically typed?

I am a PHP web programmer who is trying to learn C#.
I would like to know why C# requires me to specify the data type when creating a variable.
Class classInstance = new Class();
Why do we need to know the data type before a class instance?
As others have said, C# is static/strongly-typed. But I take your question more to be "Why would you want C# to be static/strongly-typed like this? What advantages does this have over dynamic languages?"
With that in mind, there are lots of good reasons:
Stability: Certain kinds of errors are now caught automatically by the compiler, before the code ever makes it anywhere close to production.
Readability/Maintainability: You are now providing more information about how the code is supposed to work to future developers who read it. You add information that a specific variable is intended to hold a certain kind of value, and that helps programmers reason about what the purpose of that variable is.
This is probably why, for example, Microsoft's style guidelines recommended that VB6 programmers put a type prefix with variable names, but that VB.Net programmers do not.
Performance: This is the weakest reason, but late binding/duck typing can be slower. In the end, a variable refers to memory that is structured in some specific way. Without strong types, the program will have to do extra type verification or conversion behind the scenes at runtime as you use memory that is structured one way physically as if it were structured in another way logically.
I hesitate to include this point, because ultimately you often have to do those conversions in a strongly typed language as well. It's just that the strongly typed language leaves the exact timing and extent of the conversion to the programmer, and does no extra work unless it needs to be done. It also allows the programmer to force a more advantageous data type. But these really are attributes of the programmer, rather than the platform.
That would itself be a weak reason to omit the point, except that a good dynamic language will often make better choices than the programmer. This means a dynamic language can help many programmers write faster programs. Still, for good programmers, strongly-typed languages have the potential to be faster.
Better Dev Tools: If your IDE knows what type a variable is expected to be, it can give you additional help about what kinds of things that variable can do. This is much harder for the IDE to do if it has to infer the type for you. And if you get more help with the minutiae of an API from the IDE, then you as a developer will be able to get your head around a larger, richer API, and get there faster.
Or perhaps you were just wondering why you have to specify the class name twice for the same variable on the same line? The answer is two-fold:
Often you don't. In C# 3.0 and later you can use the var keyword instead of the type name in many cases. Variables created this way are still statically typed, but the type is now inferred for you by the compiler.
Thanks to inheritance and interfaces, sometimes the type on the left-hand side doesn't match the type on the right-hand side.
It's simply how the language was designed. C# is a C-style language and follows in the pattern of having types on the left.
In C# 3.0 and up you can kind of get around this in many cases with local type inference.
var variable = new SomeClass();
But at the same time you could also argue that you are still declaring a type on the LHS. Just that you want the compiler to pick it for you.
EDIT
Please read this in the context of the user's original question:
why do we need [class name] before a variable name?
I wanted to comment on several other answers in this thread. A lot of people are giving "C# is statically typed" as an answer. While the statement is true (C# is statically typed), it is almost completely unrelated to the question. Static typing does not necessitate a type name being to the left of the variable name. Sure, it can help, but that is a language-designer choice, not a necessary feature of statically typed languages.
This is easily provable by considering other statically typed languages, such as F#. Types in F# appear on the right of a variable name and can very often be omitted altogether. There are also counterexamples: PowerShell, for instance, is extremely dynamic and puts all of its types, if included, on the left.
One of the main reasons is that you can specify different types as long as the type on the left-hand side of the assignment is a parent type of the type on the right-hand side (or an interface implemented by that type).
For example given the following types:
class Foo { }
class Bar : Foo { }
interface IBaz { }
class Baz : IBaz { }
C# allows you to do this:
Foo f = new Bar();
IBaz b = new Baz();
Yes, in most cases the compiler could infer the type of the variable from the assignment (like with the var keyword) but it doesn't for the reason I have shown above.
Edit: As a point of order - while C# is strongly-typed the important distinction (as far as this discussion is concerned) is that it is in fact also a statically-typed language. In other words the C# compiler does static type checking at compilation time.
C# is a statically-typed, strongly-typed language like C or C++. In these languages all variables must be declared to be of a specific type.
Ultimately because Anders Hejlsberg said so...
You need [class name] in front because there are many situations in which the first [class name] is different from the second, like:
IMyCoolInterface obj = new MyInterfaceImplementer();
MyBaseType obj2 = new MySubTypeOfBaseType();
etc. You can also use the word 'var' if you don't want to specify the type explicitly.
Why do we need to know the data type before a class instance?
You don't! Read it from right to left: you create the object, and then you store it in a type-safe variable, so that you know what type that object is for later use.
Consider the following snippet; it would be a nightmare to debug if you didn't receive the errors until runtime.
void FunctionCalledVeryUnfrequently()
{
    ClassA a = new ClassA();
    ClassB b = new ClassB();
    ClassA a2 = new ClassB(); // COMPILER ERROR (thank god)

    // 100 lines of code

    DoStuffWithA(a);
    DoStuffWithA(b);  // COMPILER ERROR (thank god)
    DoStuffWithA(a2);
}
If you imagine replacing new Class() with a number or a string, the syntax will make much more sense. The following example might be a bit verbose but might help you understand why it's designed the way it is.
string s = "abc";
string s2 = new string(new char[]{'a', 'b', 'c'});
//Does exactly the same thing
DoStuffWithAString("abc");
DoStuffWithAString(new string(new char[]{'a', 'b', 'c'}));
//Does exactly the same thing
C#, as others have pointed out, is a strongly, statically-typed language.
By stating up front what the type you're intending to create is, you'll receive compile-time warnings when you try to assign an illegal value. By stating up front what type of parameters you accept in methods, you receive those same compile-time warnings when you accidentally pass nonsense into a method that isn't expecting it. It removes the overhead of some paranoia on your behalf.
Finally, and rather nicely, C# (and many other languages) doesn't have the same ridiculous, "convert anything to anything, even when it doesn't make sense" mentality that PHP does, which quite frankly can trip you up more times than it helps.
C# is a strongly-typed language, like C++ or Java. Therefore it needs to know the type of the variable. You can fudge it a bit in C# 3.0 via the var keyword. That lets the compiler infer the type.
That's the difference between a strongly typed and weakly typed language. C# (and C, C++, Java, most more powerful languages) are strongly typed so you must declare the variable type.
When we define variables to hold data, we have to specify the type of data that those variables will hold. The compiler then checks that what we are doing with the data makes sense to it, i.e. follows the rules. We can't, for example, store text in a number - the compiler will not allow it.
int a = "fred"; // Not allowed. Cannot implicitly convert 'string' to 'int'
The variable a is of type int, and assigning it the value "fred", which is a text string, breaks the rules - the compiler is unable to do any kind of conversion of this string.
In C# 3.0, you can use the 'var' keyword - this uses static type inference to work out what the type of the variable is at compile time
var foo = new ClassName();
variable 'foo' will be of type 'ClassName' from then on.
One thing that hasn't been mentioned is that C# is a CLS (Common Language Specification) compliant language. This is a set of rules that a .NET language has to adhere to in order to be interoperable with other .NET languages.
So really C# is just keeping to these rules. To quote this MSDN article:
The CLS helps enhance and ensure language interoperability by defining a set of features that developers can rely on to be available in a wide variety of languages. The CLS also establishes requirements for CLS compliance; these help you determine whether your managed code conforms to the CLS and to what extent a given tool supports the development of managed code that uses CLS features.
If your component uses only CLS features in the API that it exposes to other code (including derived classes), the component is guaranteed to be accessible from any programming language that supports the CLS. Components that adhere to the CLS rules and use only the features included in the CLS are said to be CLS-compliant components.
Part of the CLS is the CTS, the Common Type System.
If that's not enough acronyms for you, there are plenty more in .NET, such as CLI, ILasm/MSIL, CLR, BCL, FCL, ...
Because C# is a strongly typed language
Static typing also allows the compiler to make better optimizations, and skip certain steps. Take overloading for example, where you have multiple methods or operators with the same name differing only by their arguments. With a dynamic language, the runtime would need to grade each version in order to determine which is the best match. With a static language like this, the final code simply points directly to the appropriate overload.
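A sketch of that overloading point (OverloadDemo and Describe are made-up names): both call sites below are bound by the compiler, so the compiled code points directly at the chosen method.

using System;

class OverloadDemo
{
    static void Describe(int x)    { Console.WriteLine("int overload"); }
    static void Describe(string x) { Console.WriteLine("string overload"); }

    static void Main()
    {
        Describe(42);   // bound to Describe(int) at compile time
        Describe("42"); // bound to Describe(string) at compile time
    }
}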
Static typing also aids in code maintenance and refactoring. My favorite example being the Rename feature of many higher-end IDEs. Thanks to static typing, the IDE can find with certainty every occurrence of the identifier in your code, and leave unrelated identifiers with the same name intact.
I didn't notice whether it was mentioned yet or not, but C# 4.0 introduces dynamic checking via the dynamic keyword. Though I'm sure you'd want to avoid it when it's not necessary.
Why C# requires me to specify the data type when creating a variable.
Why do we need to know the data type before a class instance?
I think one thing that most answers haven't referenced is the fact that C# was originally meant and designed as a "managed", "safe" language, among other things, and a lot of those goals are arrived at via static / compile-time verifiability. Knowing the variable's datatype explicitly makes this problem MUCH easier to solve, meaning that one can make several automated assessments (by the C# compiler, not the JIT) about possible errors / undesirable behavior without ever allowing execution.
That verifiability as a side effect also gives you better readability, dev tools, stability etc. because if an automated algorithm can understand better what the code will do when it actually runs, so can you :)
Statically typed means that the compiler can perform checks at compile time rather than at run time. Every variable has a particular, strong type. C# is definitely statically and strongly typed.

