Can C# generics be used to elide virtual function calls? - c#

I use both C++ and C# and something that's been on my mind is whether it's possible to use generics in C# to elide virtual function calls on interfaces. Consider the following:
int Foo1(IList<int> list)
{
int sum = 0;
for(int i = 0; i < list.Count; ++i)
sum += list[i];
return sum;
}
int Foo2<T>(T list) where T : IList<int>
{
int sum = 0;
for(int i = 0; i < list.Count; ++i)
sum += list[i];
return sum;
}
/*...*/
var l = new List<int>();
Foo1(l);
Foo2(l);
Inside Foo1, every access to list.Count and list[i] causes a virtual function call. If this were C++ using templates, then in the call to Foo2 the compiler would be able to see that the virtual function call can be elided and inlined because the concrete type is known at template instantiation time.
But does the same apply to C# and generics? When you call Foo2(l), it's known at compile-time that T is a List and therefore that list.Count and list[i] don't need to involve virtual function calls. First of all, would that be a valid optimization that doesn't horribly break something? And if so, is the compiler/JIT smart enough to make this optimization?

This is an interesting question, but unfortunately, your approach to "cheat" the system won't improve the efficiency of your program. If it could, the compiler could do it for us with relative ease!
You are correct that when calling IList<T> methods through an interface reference, the methods are dispatched at runtime and therefore cannot be inlined. Therefore the calls to IList<T> members such as Count and the indexer will go through the interface.
On the other hand, it is not true that you can achieve any performance advantage (at least not with the current C# compiler and .NET4 CLR), by rewriting it as a generic method.
Why not? First some background. The way C# generics work is that the compiler compiles your generic method with replaceable type parameters, and the runtime then replaces them with the actual type arguments. This you already knew.
But the parameterized version of the method knows no more about the variable types than you and I do at compile time. In this case, all the compiler knows about Foo2 is that list is an IList<int>. We have the same information in the generic Foo2 that we do in the non-generic Foo1.
As a matter of fact, in order to avoid code-bloat, the JIT compiler only produces a single instantiation of the generic method for all reference types. Here is the Microsoft documentation that describes this substitution and instantiation:
If the client specifies a reference type, then the JIT compiler replaces the generic parameters in the server IL with Object, and compiles it into native code. That code will be used in any further request for a reference type instead of a generic type parameter. Note that this way the JIT compiler only reuses actual code. Instances are still allocated according to their size off the managed heap, and there is no casting.
This means that the JIT compiler's version of the method (for reference types) is not type safe but it doesn't matter because the compiler has ensured all type-safety at compile time. But more importantly for your question, there is no avenue to perform inlining and get a performance boost.
Edit: Finally, empirically, I've just done a benchmark of both Foo1 and Foo2 and they yield identical performance results. In other words, Foo2 is not any faster than Foo1.
Let's add an "inlinable" version Foo0 for comparison:
int Foo0(List<int> list)
{
int sum = 0;
for (int i = 0; i < list.Count; ++i)
sum += list[i];
return sum;
}
Here is the performance comparison:
Foo0 = 1719
Foo1 = 7299
Foo2 = 7472
Foo0 = 1671
Foo1 = 7470
Foo2 = 7756
So you can see that Foo0, which can be inlined, is dramatically faster than the other two. You can also see that Foo2 is slightly slower instead of being anywhere near as fast as Foo0.

This actually does work, and does (if the function is not virtual) result in a non-virtual call. The reason is that unlike in C++, CLR generics define, at JIT time, a specific, concrete class for each unique set of generic parameters (indicated in reflection by a trailing `1, `2, etc.). If the method is virtual, it will result in a virtual call like any concrete, non-virtual, non-generic method.
The thing to remember about .NET generics is that given Foo<T>, then Foo<Int32> is a valid Type at runtime, separate and distinct from Foo<String>, and all virtual and non-virtual methods are treated accordingly. This is the reason why you can create a List<Vehicle> and add a Car to it, but you can't create a variable of type List<Vehicle> and set its value to an instance of List<Car>. They are of different types, but the former has an Add(...) method that takes an argument of Vehicle, a supertype of Car.

Related

Which is more efficient: myType.GetType() or typeof(MyType)?

Supposing I have a class MyType:
sealed class MyType
{
static Type typeReference = typeof(MyType);
//...
}
Given the following code:
var instance = new MyType();
var type1 = instance.GetType();
var type2 = typeof(MyType);
var type3 = typeReference;
Which of these variable assignments would be the most efficient?
Is performance of GetType() or typeof() concerning enough that it would be beneficial to save off the type in a static field?
typeof(SomeType) is a simple metadata token lookup
GetType() is a virtual call; on the plus side you'll get the derived type if it is a subclass, but on the minus side you'll get the derived class if it is a subclass. If you see what I mean. Additionally, GetType() requires boxing for structs, and doesn't work well for nullable structs.
If you know the type at compiletime, use typeof().
I would go with type2. It doesn't require instantiating an instance to get the type. And it's the most human readable.
The only way to find out is to measure.
The "type1" variant isn't reliable or recommended in any way, since not all types can be constructed. Even worse, it allocates memory that will need to be garbage collected and invokes the object constructors.
For the remaining two options, on my machine "type3" is about twice as fast as "type2" in both debug and release modes. Remember that this is only true for my test - the results may not be true for other processor types, machine types, compilers, or .NET versions.
var sw = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < 10000000; i++)
{
var y = typeof(Program).ToString();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
sw.Restart();
for (int i = 0; i < 10000000; i++)
{
var y = typeReference.ToString();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
That said, it's a bit alarming this question is being asked without a clear requirement. If you noticed a performance problem, you'd likely have already profiled it and know which option was better. That tells me that this is likely premature optimization - you know the saying, "premature optimization is the root of all evil".
Programming code is not measured only by performance. It's also measured by correctness, developer productivity, and maintainability. Increasing the complexity of your code without a strong reason just transfers that cost to somewhere else. What might have been a non-issue has now turned into a serious loss of productivity, both now and for future maintainers of the application.
My recommendation would be to always use the "type2" variant. The measurement code I listed isn't a real-world scenario. Caching typeof to a reference variable likely has a ton of side effects, particularly around the way .NET loads assemblies. Rather than having them load only when needed, it might end up loading them all on every use of the application, turning a theoretical performance optimization into a very real performance problem.
They're rather different.
typeof(MyType) gets a Type object describing the MyType type, resolved at compile time using the ldtoken instruction.
myInstance.GetType() gets the Type object describing the runtime type of the myInstance variable.
Both are intended for different scenarios.
You cannot use typeof(MyType) unless you know the type at the compile-time and have access to it.
You cannot use myInstance.GetType() unless you have an instance of the type.
typeof(MyType) is always more efficient, but you cannot use it if you don't see the type at compile time. You cannot use typeof(MyType) to learn the real runtime type of some variable, because you don't know the type.
Both are basically the same, although typeof can be used on a class without an instance, like
typeof(MyClass);
But
MyClass.GetType();
won't even build since you need to have an instance of the class.
Long story short, they both do the same job in different context.

How are C# Generics implemented?

I had thought that Generics in C# were implemented such that a new class/method/what-have-you was generated, either at run-time or compile-time, when a new generic type was used, similar to C++ templates (which I've never actually looked into and I very well could be wrong, about which I'd gladly accept correction).
But in my coding I came up with an exact counterexample:
static class Program {
static void Main()
{
Test testVar = new Test();
GenericTest<Test> genericTest = new GenericTest<Test>();
int gen = genericTest.Get(testVar);
RegularTest regTest = new RegularTest();
int reg = regTest.Get(testVar);
if (gen == ((object)testVar).GetHashCode())
{
Console.WriteLine("Got Object's hashcode from GenericTest!");
}
if (reg == testVar.GetHashCode())
{
Console.WriteLine("Got Test's hashcode from RegularTest!");
}
}
class Test
{
public new int GetHashCode()
{
return 0;
}
}
class GenericTest<T>
{
public int Get(T obj)
{
return obj.GetHashCode();
}
}
class RegularTest
{
public int Get(Test obj)
{
return obj.GetHashCode();
}
}
}
Both of those console lines print.
I know that the actual reason this happens is that the virtual call to Object.GetHashCode() doesn't resolve to Test.GetHashCode() because the method in Test is marked as new rather than override. Therefore, I know if I used "override" rather than "new" on Test.GetHashCode() then the return of 0 would polymorphically override the method GetHashCode in object and this wouldn't be true, but according to my (previous) understanding of C# generics it wouldn't have mattered because every instance of T would have been replaced with Test, and thus the method call would have statically (or at generic resolution time) been resolved to the "new" method.
So my question is this: How are generics implemented in C#? I don't know CIL bytecode, but I do know Java bytecode so I understand how Object-oriented CLI languages work at a low level. Feel free to explain at that level.
As an aside, I thought C# generics were implemented that way because everyone always calls the generic system in C# "True Generics," compared to the type-erasure system of Java.
In GenericTest<T>.Get(T), the C# compiler has already picked that object.GetHashCode should be called (virtually). There's no way this will resolve to the "new" GetHashCode method at runtime (which will have its own slot in the method-table, rather than overriding the slot for object.GetHashCode).
From Eric Lippert's What's the difference, part one: Generics are not templates, the issue is explained (the setup used is slightly different, but the lessons translate well to your scenario):
This illustrates that generics in C# are not like templates in C++.
You can think of templates as a fancy-pants search-and-replace
mechanism.[...] That’s not how generic types work; generic types are,
well, generic. We do the overload resolution once and bake in the
result. [...] The IL we’ve generated for the generic type already has
the method its going to call picked out. The jitter does not say
“well, I happen to know that if we asked the C# compiler to execute
right now with this additional information then it would have picked a
different overload. Let me rewrite the generated code to ignore the
code that the C# compiler originally generated...” The jitter knows
nothing about the rules of C#.
And a workaround for your desired semantics:
Now, if you do want overload resolution to be re-executed at runtime based on the runtime types of
the arguments, we can do that for you; that’s what the new “dynamic”
feature does in C# 4.0. Just replace “object” with “dynamic” and when
you make a call involving that object, we’ll run the overload
resolution algorithm at runtime and dynamically spit code that calls
the method that the compiler would have picked, had it known all the
runtime types at compile time.

Why we require Generics? [duplicate]

I thought I'd offer this softball to whomever would like to hit it out of the park. What are generics, what are the advantages of generics, why, where, how should I use them? Please keep it fairly basic. Thanks.
Allows you to write code/use library methods which are type-safe, i.e. a List<string> is guaranteed to be a list of strings.
As a result of generics being used the compiler can perform compile-time checks on code for type safety, i.e. are you trying to put an int into that list of strings? Using an ArrayList would cause that to be a less transparent runtime error.
Faster than using objects as it either avoids boxing/unboxing (where .net has to convert value types to reference types or vice-versa) or casting from objects to the required reference type.
Allows you to write code which is applicable to many types with the same underlying behaviour, i.e. a Dictionary<string, int> uses the same underlying code as a Dictionary<DateTime, double>; using generics, the framework team only had to write one piece of code to achieve both results with the aforementioned advantages too.
I really hate to repeat myself. I hate typing the same thing more often than I have to. I don't like restating things multiple times with slight differences.
Instead of creating:
class MyObjectList {
MyObject get(int index) {...}
}
class MyOtherObjectList {
MyOtherObject get(int index) {...}
}
class AnotherObjectList {
AnotherObject get(int index) {...}
}
I can build one reusable class... (in the case where you don't want to use the raw collection for some reason)
class MyList<T> {
T get(int index) { ... }
}
I'm now 3x more efficient and I only have to maintain one copy. Why WOULDN'T you want to maintain less code?
This is also true for non-collection classes such as a Callable<T> or a Reference<T> that has to interact with other classes. Do you really want to extend Callable<T> and Future<T> and every other associated class to create type-safe versions?
I don't.
Not needing to typecast is one of the biggest advantages of Java generics, as it will perform type checking at compile-time. This will reduce the possibility of ClassCastExceptions which can be thrown at runtime, and can lead to more robust code.
But I suspect that you're fully aware of that.
Every time I look at Generics it gives
me a headache. I find the best part of
Java to be it's simplicity and minimal
syntax and generics are not simple and
add a significant amount of new
syntax.
At first, I didn't see the benefit of generics either. I started learning Java from the 1.4 syntax (even though Java 5 was out at the time) and when I encountered generics, I felt that it was more code to write, and I really didn't understand the benefits.
Modern IDEs make writing code with generics easier.
Most modern, decent IDEs are smart enough to assist with writing code with generics, especially with code completion.
Here's an example of making a Map<String, Integer> with a HashMap. The code I would have to type in is:
Map<String, Integer> m = new HashMap<String, Integer>();
And indeed, that's a lot to type just to make a new HashMap. However, in reality, I only had to type this much before Eclipse knew what I needed:
Map<String, Integer> m = new Ha Ctrl+Space
True, I did need to select HashMap from a list of candidates, but basically the IDE knew what to add, including the generic types. With the right tools, using generics isn't too bad.
In addition, since the types are known, when retrieving elements from the generic collection, the IDE will act as if that object is already an object of its declared type -- there is no need for a cast for the IDE to know what the object's type is.
A key advantage of generics comes from the way it plays well with new Java 5 features. Here's an example of tossing integers in to a Set and calculating its total:
Set<Integer> set = new HashSet<Integer>();
set.add(10);
set.add(42);
int total = 0;
for (int i : set) {
total += i;
}
In that piece of code, there are three new Java 5 features present:
Generics
Autoboxing and unboxing
For-each loop
First, generics and autoboxing of primitives allow the following lines:
set.add(10);
set.add(42);
The integer 10 is autoboxed into an Integer with the value of 10. (And same for 42). Then that Integer is tossed into the Set which is known to hold Integers. Trying to throw in a String would cause a compile error.
Next, the for-each loop makes use of all three of those features:
for (int i : set) {
total += i;
}
First, the Set containing Integers is used in a for-each loop. Each element is declared to be an int, and that is allowed as the Integer is unboxed back to the primitive int. And the fact that this unboxing occurs is known because generics were used to specify that there were Integers held in the Set.
Generics can be the glue that brings together the new features introduced in Java 5, and it just makes coding simpler and safer. And most of the time IDEs are smart enough to help you with good suggestions, so generally, it won't be a whole lot more typing.
And frankly, as can be seen from the Set example, I feel that utilizing Java 5 features can make the code more concise and robust.
Edit - An example without generics
The following is an illustration of the above Set example without the use of generics. It is possible, but isn't exactly pleasant:
Set set = new HashSet();
set.add(10);
set.add(42);
int total = 0;
for (Object o : set) {
total += (Integer)o;
}
(Note: The above code will generate an unchecked conversion warning at compile-time.)
When using non-generic collections, the elements entered into the collection are of type Object. Therefore, in this example, an Object is what is being added into the set.
set.add(10);
set.add(42);
In the above lines, autoboxing is in play -- the primitive int values 10 and 42 are being autoboxed into Integer objects, which are being added to the Set. However, keep in mind, the Integer objects are being handled as Objects, as there is no type information to help the compiler know what type the Set should expect.
for (Object o : set) {
This is the part that is crucial. The reason the for-each loop works is because the Set implements the Iterable interface, which returns an Iterator with type information, if present. (Iterator<T>, that is.)
However, since there is no type information, the Set will return an Iterator which will return the values in the Set as Objects, and that is why the element being retrieved in the for-each loop must be of type Object.
Now that the Object is retrieved from the Set, it needs to be cast to an Integer manually to perform the addition:
total += (Integer)o;
Here, a typecast is performed from an Object to an Integer. In this case, we know this will always work, but manual typecasting always makes me feel it is fragile code that could be damaged if a minor change is made elsewhere. (I feel that every typecast is a ClassCastException waiting to happen, but I digress...)
The Integer is now unboxed into an int and allowed to perform the addition into the int variable total.
I hope I could illustrate that the new features of Java 5 can be used with non-generic code, but it just isn't as clean and straightforward as writing code with generics. And, in my opinion, to take full advantage of the new features in Java 5, one should be looking into generics, if only because they allow compile-time checks that prevent invalid typecasts from throwing exceptions at runtime.
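To make the "ClassCastException waiting to happen" point concrete, here is a minimal sketch that puts a wrong element into both versions of the Set. The class name RawSetDemo and its helper methods are made up for illustration; only the java.util collections are real API.

```java
import java.util.HashSet;
import java.util.Set;

class RawSetDemo {
    // Sums a raw Set by casting each element, as in the non-generic example above.
    @SuppressWarnings("rawtypes")
    static int sumRaw(Set set) {
        int total = 0;
        for (Object o : set) {
            total += (Integer) o; // throws ClassCastException if o is not an Integer
        }
        return total;
    }

    // Sums a typed Set; no cast is needed and non-Integers cannot be added.
    static int sumTyped(Set<Integer> set) {
        int total = 0;
        for (int i : set) {
            total += i;
        }
        return total;
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        Set raw = new HashSet();
        raw.add(10);
        raw.add("forty-two"); // compiles, because a raw Set accepts any Object
        try {
            sumRaw(raw);
        } catch (ClassCastException e) {
            System.out.println("raw Set failed at runtime with ClassCastException");
        }

        Set<Integer> typed = new HashSet<Integer>();
        typed.add(10);
        typed.add(42);
        // typed.add("forty-two"); // rejected by the compiler
        System.out.println(sumTyped(typed));
    }
}
```

With the raw Set the mistake only surfaces when the loop reaches the String; with Set<Integer> the same mistake never gets past the compiler.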
If you were to search the Java bug database just before 1.5 was released, you'd find seven times more bugs with NullPointerException than ClassCastException. So it doesn't seem that it is a great feature to find bugs, or at least bugs that persist after a little smoke testing.
For me the huge advantage of generics is that they document in code important type information. If I didn't want that type information documented in code, then I'd use a dynamically typed language, or at least a language with more implicit type inference.
Keeping an object's collections to itself isn't a bad style (but then the common style is to effectively ignore encapsulation). It rather depends upon what you are doing. Passing collections to "algorithms" is slightly easier to check (at or before compile-time) with generics.
Generics in Java facilitate parametric polymorphism. By means of type parameters, you can pass arguments to types. Just as a method like String foo(String s) models some behaviour, not just for a particular string, but for any string s, so a type like List<T> models some behaviour, not just for a specific type, but for any type. List<T> says that for any type T, there's a type of List whose elements are Ts. So List is actually a type constructor. It takes a type as an argument and constructs another type as a result.
Here are a couple of examples of generic types I use every day. First, a very useful generic interface:
public interface F<A, B> {
public B f(A a);
}
This interface says that for some two types, A and B, there's a function (called f) that takes an A and returns a B. When you implement this interface, A and B can be any types you want, as long as you provide a function f that takes the former and returns the latter. Here's an example implementation of the interface:
F<Integer, String> intToString = new F<Integer, String>() {
public String f(Integer i) {
return String.valueOf(i);
}
};
Before generics, polymorphism was achieved by subclassing using the extends keyword. With generics, we can actually do away with subclassing and use parametric polymorphism instead. For example, consider a parameterised (generic) class used to calculate hash codes for any type. Instead of overriding Object.hashCode(), we would use a generic class like this:
public final class Hash<A> {
private final F<A, Integer> hashFunction;
public Hash(final F<A, Integer> f) {
this.hashFunction = f;
}
public int hash(A a) {
return hashFunction.f(a);
}
}
This is much more flexible than using inheritance, because we can stay with the theme of using composition and parametric polymorphism without locking down brittle hierarchies.
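As a rough sketch of how this composes in practice, the following defines two different hashing strategies for String without subclassing anything. F and Hash are repeated from above so the snippet compiles on its own; HashDemo and the two strategy names are hypothetical.

```java
import java.util.Locale;

// F and Hash are repeated from the text so the sketch compiles on its own.
interface F<A, B> {
    B f(A a);
}

final class Hash<A> {
    private final F<A, Integer> hashFunction;

    Hash(final F<A, Integer> f) {
        this.hashFunction = f;
    }

    int hash(A a) {
        return hashFunction.f(a);
    }
}

class HashDemo {
    // Hypothetical strategy: strings that differ only in case hash alike.
    static final Hash<String> CASE_INSENSITIVE = new Hash<String>(new F<String, Integer>() {
        public Integer f(String s) {
            return s.toLowerCase(Locale.ROOT).hashCode();
        }
    });

    // Hypothetical strategy: hash a string by its length only.
    static final Hash<String> BY_LENGTH = new Hash<String>(new F<String, Integer>() {
        public Integer f(String s) {
            return s.length();
        }
    });

    public static void main(String[] args) {
        System.out.println(CASE_INSENSITIVE.hash("Widget") == CASE_INSENSITIVE.hash("WIDGET"));
        System.out.println(BY_LENGTH.hash("Widget"));
    }
}
```

Both strategies coexist for the same type, which overriding Object.hashCode() in a subclass could not give you.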
Java's generics are not perfect though. You can abstract over types, but you can't abstract over type constructors, for example. That is, you can say "for any type T", but you can't say "for any type T that takes a type parameter A".
I wrote an article about these limits of Java generics, here.
One huge win with generics is that they let you avoid subclassing. Subclassing tends to result in brittle class hierarchies that are awkward to extend, and classes that are difficult to understand individually without looking at the entire hierarchy.
Whereas before generics you might have classes like Widget extended by FooWidget, BarWidget, and BazWidget, with generics you can have a single generic class Widget<A> that takes a Foo, Bar or Baz in its constructor to give you Widget<Foo>, Widget<Bar>, and Widget<Baz>.
Generics avoid the performance hit of boxing and unboxing. Basically, look at ArrayList vs List<T>. Both do the same core things, but List<T> will be a lot faster because you don't have to box to/from object.
The best benefit of generics is code reuse. Let's say that you have a lot of business objects, and you are going to write VERY similar code for each entity to perform the same actions (e.g. LINQ to SQL operations).
With generics, you can create a class that will be able to operate given any of the types that inherit from a given base class or implement a given interface like so:
public interface IEntity
{
}
public class Employee : IEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
public int EmployeeID { get; set; }
}
public class Company : IEntity
{
public string Name { get; set; }
public string TaxID { get; set; }
}
public class DataService<ENTITY, DATACONTEXT>
where ENTITY : class, IEntity, new()
where DATACONTEXT : DataContext, new()
{
public void Create(List<ENTITY> entities)
{
using (DATACONTEXT db = new DATACONTEXT())
{
Table<ENTITY> table = db.GetTable<ENTITY>();
foreach (ENTITY entity in entities)
table.InsertOnSubmit (entity);
db.SubmitChanges();
}
}
}
public class MyTest
{
public void DoSomething()
{
var dataService = new DataService<Employee, MyDataContext>();
// Create takes a List<ENTITY>, so wrap the single entity in a list.
dataService.Create(new List<Employee> { new Employee { FirstName = "Bob", LastName = "Smith", EmployeeID = 5 } });
var otherDataService = new DataService<Company, MyDataContext>();
otherDataService.Create(new List<Company> { new Company { Name = "ACME", TaxID = "123-111-2233" } });
}
}
Notice the reuse of the same service given the different Types in the DoSomething method above. Truly elegant!
There are many other great reasons to use generics in your work; this is my favorite.
I just like them because they give you a quick way to define a custom type (as I use them anyway).
So for example instead of defining a structure consisting of a string and an integer, and then having to implement a whole set of objects and methods on how to access an array of those structures and so forth, you can just make a Dictionary
Dictionary<int, string> dictionary = new Dictionary<int, string>();
And the compiler/IDE does the rest of the heavy lifting. A Dictionary in particular lets you use the first type as a key (no repeated values).
Typed collections - even if you don't want to use them, you're likely to have to deal with them from other libraries and other sources.
Generic typing in class creation:
public class Foo < T> {
public T get()...
Avoidance of casting - I've always disliked things like
new Comparator {
public int compareTo(Object o){
if (o instanceof classIcareAbout)...
Where you're essentially checking for a condition that should only exist because the interface is expressed in terms of objects.
My initial reaction to generics was similar to yours - "too messy, too complicated". My experience is that after using them for a bit you get used to them, and code without them feels less clearly specified, and just less comfortable. Aside from that, the rest of the java world uses them so you're going to have to get with the program eventually, right?
To give a good example. Imagine you have a class called Foo
public class Foo
{
public string Bar() { return "Bar"; }
}
Example 1
Now you want to have a collection of Foo objects. You have two options, List<Foo> or ArrayList, both of which work in a similar manner.
ArrayList al = new ArrayList();
List<Foo> fl = new List<Foo>();
//code to add Foos
al.Add(new Foo());
fl.Add(new Foo());
In the above code, if I try to add an object of class FireTruck instead of Foo, the ArrayList will accept it, but the generic List of Foo will reject it with a compile-time error.
Example two.
Now you have your two lists and you want to call the Bar() function on each element. Since the ArrayList is filled with Objects, you have to cast them before you can call Bar(). But since the generic List of Foo can only contain Foos, you can call Bar() directly on those.
foreach(object o in al)
{
Foo f = (Foo)o;
f.Bar();
}
foreach(Foo f in fl)
{
f.Bar();
}
Haven't you ever written a method (or a class) where the key concept of the method/class wasn't tightly bound to a specific data type of the parameters/instance variables (think linked list, max/min functions, binary search, etc.)?
Haven't you ever wished you could reuse the algorithm/code without resorting to cut-n-paste reuse or compromising strong typing (e.g. I want a List of Strings, not a List of things I hope are strings!)?
That's why you should want to use generics (or something better).
The primary advantage, as Mitchel points out, is strong-typing without needing to define multiple classes.
This way you can do stuff like:
List<SomeCustomClass> blah = new List<SomeCustomClass>();
blah[0].SomeCustomFunction();
Without generics, you would have to cast blah[0] to the correct type to access its functions.
Don't forget that generics aren't just used by classes, they can also be used by methods. For example, take the following snippet:
private <T extends Throwable> T logAndReturn(T t) {
logThrowable(t); // some logging method that takes a Throwable
return t;
}
It is simple, but can be used very elegantly. The nice thing is that the method returns whatever it was that it was given. This helps out when you are handling exceptions that need to be re-thrown back to the caller:
...
} catch (MyException e) {
throw logAndReturn(e);
}
The point is that you don't lose the type by passing it through a method. You can throw the correct type of exception instead of just a Throwable, which would be all you could do without generics.
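Here is a self-contained sketch of that pattern. MyException, the in-memory LOG list (standing in for the logThrowable call) and the risky() method are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class LogAndReturnDemo {
    // Stand-in for a real logger, so the sketch has no external dependencies.
    static final List<String> LOG = new ArrayList<String>();

    // Hypothetical checked exception for the example.
    static class MyException extends Exception {
        MyException(String message) {
            super(message);
        }
    }

    // Returns exactly the type it was given, so callers keep the precise exception type.
    static <T extends Throwable> T logAndReturn(T t) {
        LOG.add(t.getClass().getSimpleName() + ": " + t.getMessage());
        return t;
    }

    // Declares only MyException; this compiles because logAndReturn(e) has static type MyException.
    static void risky() throws MyException {
        try {
            throw new MyException("boom");
        } catch (MyException e) {
            throw logAndReturn(e);
        }
    }

    public static void main(String[] args) {
        try {
            risky();
        } catch (MyException e) {
            System.out.println("caught " + e.getMessage() + ", log = " + LOG);
        }
    }
}
```

If logAndReturn were declared as Throwable logAndReturn(Throwable t), risky() would have to declare throws Throwable or cast the result back.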
This is just a simple example of one use for generic methods. There are quite a few other neat things you can do with generic methods. The coolest, in my opinion, is type inferring with generics. Take the following example (taken from Josh Bloch's Effective Java 2nd Edition):
...
Map<String, Integer> myMap = createHashMap();
...
public <K, V> Map<K, V> createHashMap() {
return new HashMap<K, V>();
}
This doesn't do a lot, but it does cut down on some clutter when the generic types are long (or nested; i.e. Map<String, List<String>>).
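A complete version of the same idea, with a nested type argument to show where the saving comes from. InferenceDemo is a made-up wrapper class, and the factory is made static here so it can be called from main.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InferenceDemo {
    // K and V are inferred from the type of the variable the result is assigned to.
    static <K, V> Map<K, V> createHashMap() {
        return new HashMap<K, V>();
    }

    public static void main(String[] args) {
        // Without the factory: new HashMap<String, List<String>>() repeats the whole type.
        Map<String, List<String>> index = createHashMap();
        index.put("a", new ArrayList<String>());
        index.get("a").add("apple");
        System.out.println(index);
    }
}
```

From Java 7 onwards the diamond operator (new HashMap<>()) covers the constructor case, but generic factory methods like this are still inferred the same way.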
Generics allow you to create objects that are strongly typed, yet you don't have to define the specific type. I think the most useful example is the List and similar classes.
Using the generic list you can have a List of whatever type you want and you can always rely on the strong typing; you don't have to convert or cast anything like you would with an Array or a non-generic list.
the jvm casts anyway... it implicitly creates code which treats the generic type as "Object" and creates casts to the desired instantiation. Java generics are just syntactic sugar.
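A small sketch of what that erasure looks like from inside a running program. ErasureDemo and its two methods are hypothetical names; the behaviour shown is standard Java.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one runtime class; the type argument is gone after compilation.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    // Sneaks an Integer into a List<String> through a raw reference, then reads it back.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean insertedCastFails() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;
        raw.add(Integer.valueOf(1)); // succeeds, because the list really stores Objects
        try {
            String s = strings.get(0); // the compiler-inserted cast to String fails here
            return s.length() < 0;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(insertedCastFails());
    }
}
```

So the casts are still there at runtime; what generics buy you is that the compiler writes them and checks that they cannot fail in code that compiles without unchecked warnings.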
I know this is a C# question, but generics are used in other languages too, and their use/goals are quite similar.
Java collections use generics since Java 1.5. So, a good place to use them is when you are creating your own collection-like object.
An example I see almost everywhere is a Pair class, which holds two objects, but needs to deal with those objects in a generic way.
class Pair<F, S> {
public final F first;
public final S second;
public Pair(F f, S s)
{
first = f;
second = s;
}
}
Whenever you use this Pair class you can specify which kind of objects you want it to deal with and any type cast problems will show up at compile time, rather than runtime.
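For instance, a minimal usage sketch (Pair is repeated so the snippet compiles on its own; PairDemo and nameAndAge are made-up names):

```java
// Pair is repeated from the text so the sketch compiles on its own.
class Pair<F, S> {
    public final F first;
    public final S second;

    public Pair(F f, S s) {
        first = f;
        second = s;
    }
}

class PairDemo {
    // Hypothetical helper returning two values of different types at once.
    static Pair<String, Integer> nameAndAge() {
        return new Pair<String, Integer>("Ada", 36);
    }

    public static void main(String[] args) {
        Pair<String, Integer> p = nameAndAge();
        String name = p.first; // no cast needed
        int age = p.second;    // unboxed from Integer
        // Integer wrong = p.first; // would not compile: String is not an Integer
        System.out.println(name + " is " + age);
    }
}
```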
Generics can also have their bounds defined with the keywords 'super' and 'extends'. For example, if you want to deal with a generic type but you want to make sure it extends a class called Foo (which has a setTitle method):
public class FooManager <F extends Foo>{
public void setTitle(F foo, String title) {
foo.setTitle(title);
}
}
While not very interesting on its own, it's useful to know that whenever you deal with a FooManager, you know that it will handle MyClass types, and that MyClass extends Foo.
From the Sun Java documentation, in response to "why should i use generics?":
"Generics provides a way for you to communicate the type of a collection to the compiler, so that it can be checked. Once the compiler knows the element type of the collection, the compiler can check that you have used the collection consistently and can insert the correct casts on values being taken out of the collection... The code using generics is clearer and safer.... the compiler can verify at compile time that the type constraints are not violated at run time [emphasis mine]. Because the program compiles without warnings, we can state with certainty that it will not throw a ClassCastException at run time. The net effect of using generics, especially in large programs, is improved readability and robustness. [emphasis mine]"
Generics let you use strong typing for objects and data structures that should be able to hold any object. It also eliminates tedious and expensive typecasts when retrieving objects from generic structures (boxing/unboxing).
One example that uses both is a linked list. What good would a linked list class be if it could only use object Foo? To implement a linked list that can handle any kind of object, the linked list and the nodes in a hypothetical node inner class must be generic if you want the list to contain only one type of object.
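For illustration, a minimal singly linked list along those lines might look like the sketch below. SimpleLinkedList and its Node class are hypothetical names, not a standard API, and only add/get/size are shown.

```java
class SimpleLinkedList<T> {
    // The node is generic too, so each link carries a T rather than an Object.
    private static final class Node<T> {
        final T value;
        Node<T> next;

        Node(T value) { this.value = value; }
    }

    private Node<T> head;
    private Node<T> tail;
    private int size;

    // Appends a value at the end of the list.
    public void add(T value) {
        Node<T> node = new Node<>(value);
        if (head == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    // Walks the links from the head; returns a T, so callers need no cast.
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node<T> current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.value;
    }

    public int size() { return size; }
}
```

A SimpleLinkedList<String> then accepts only strings, and get returns String directly.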
If your collection contains value types, they don't need to be boxed/unboxed to objects when inserted into the collection, so your performance increases dramatically. Cool add-ons like ReSharper can generate more code for you, like foreach loops.
Another advantage of using Generics (especially with Collections/Lists) is you get Compile Time Type Checking. This is really useful when using a Generic List instead of a List of Objects.
The single most important reason is that they provide type safety:
List<Customer> custCollection = new List<Customer>();
as opposed to,
object[] custCollection = new object[] { cust1, cust2 };
as a simple example.
In summary, generics allow you to specify more precisely what you intend to do (stronger typing).
This has several benefits for you:
Because the compiler knows more about what you want to do, it allows you to omit a lot of type-casting because it already knows that the type will be compatible.
This also gets you earlier feedback about the correctness of your program. Things that previously would have failed at runtime (e.g. because an object couldn't be cast to the desired type) now fail at compile time, and you can fix the mistake before your testing department files a cryptic bug report.
The compiler can do more optimizations, like avoiding boxing, etc.
A couple of things to add/expand on (speaking from the .NET point of view):
Generic types allow you to create role-based classes and interfaces. This has been said already in more basic terms, but I find you start to design your code with classes which are implemented in a type-agnostic way - which results in highly reusable code.
Generic arguments on methods can do the same thing, but they also help apply the "Tell Don't Ask" principle to casting, i.e. "give me what I want, and if you can't, you tell me why".
I use them, for example, in a GenericDao implemented with Spring ORM and Hibernate, which looks like this:
public abstract class GenericDaoHibernateImpl<T>
        extends HibernateDaoSupport {
    private Class<T> type;

    public GenericDaoHibernateImpl(Class<T> clazz) {
        type = clazz;
    }

    public void update(T object) {
        getHibernateTemplate().update(object);
    }

    @SuppressWarnings("unchecked")
    public Integer count() {
        return ((Integer) getHibernateTemplate().execute(
            new HibernateCallback() {
                public Object doInHibernate(Session session) {
                    // Code in Hibernate for getting the count
                }
            }));
    }
.
.
.
}
By using generics, my implementations of these DAOs force the developer to pass them only the entities they are designed for, simply by subclassing the GenericDao:
public class UserDaoHibernateImpl extends GenericDaoHibernateImpl<User> {
    public UserDaoHibernateImpl() {
        super(User.class); // This is for giving Hibernate a .class
                           // work with, as generics disappear at runtime
    }
    // Entity specific methods here
}
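The Class<T> constructor argument is the usual "type token" workaround for erasure. Stripped of Spring and Hibernate, the idea can be sketched roughly like this; TypedRegistry, StringRegistry, entityName and adopt are invented names for illustration only.

```java
class TypedRegistry<T> {
    // T itself is erased at runtime, so we keep a Class<T> around instead.
    private final Class<T> type;

    TypedRegistry(Class<T> type) {
        this.type = type;
    }

    // Uses the token where code would otherwise need "T.class".
    String entityName() {
        return type.getSimpleName();
    }

    // Checked cast through the token; throws ClassCastException on mismatch.
    T adopt(Object candidate) {
        return type.cast(candidate);
    }
}

// The subclass fixes T once and hands the matching token to the base class.
class StringRegistry extends TypedRegistry<String> {
    StringRegistry() {
        super(String.class);
    }
}
```

Passing anything other than String.class to super(...) in StringRegistry would not compile, which is the same guarantee the DAO subclass relies on.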
My little framework is more robust (it has things like filtering, lazy-loading, and searching). I just simplified it here to give you an example.
I, like Steve and you, said at the beginning "Too messy and complicated", but now I see its advantages.
Obvious benefits like "type safety" and "no casting" have already been mentioned, so maybe I can talk about some other "benefits", which I hope will help.
First of all, generics is a language-independent concept and, IMO, it might make more sense if you think about regular (runtime) polymorphism at the same time.
For example, polymorphism as we know it from object-oriented design is a runtime notion: the target of a call is determined as the program executes, and the relevant method is called depending on the runtime type of the object. With generics the idea is somewhat similar, but everything happens at compile time. What does that mean, and how do you make use of it?
(Let's stick with generic methods to keep it compact.) It means that you can still have the same method on separate classes (as you did previously with polymorphic classes), but this time they're auto-generated by the compiler depending on the types set at compile time. You parametrise your methods on the type you give at compile time. So, instead of writing the methods from scratch for every single type, as you do in runtime polymorphism (method overriding), you let the compiler do the work during compilation. This has an obvious advantage: you don't need to anticipate all the types that might be used in your system, which makes it far more scalable without a code change.
Classes work pretty much the same way. You parametrise the type and the code is generated by the compiler.
Once you get the idea of "compile time", you can make use of "bounded" types and restrict what can be passed as a parametrised type through classes/methods. So you can control what gets passed through, which is a powerful thing, especially when you have a framework being consumed by other people.
public interface Foo<T extends MyObject> extends Hoo<T>{
...
}
No one can pass anything other than MyObject (or a subtype of it) now.
Also, you can "enforce" type constraints on your method arguments, which means you can make sure both of your method arguments depend on the same type parameter.
public <T extends MyObject> void foo(T t1, T t2){
...
}
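Here is a hedged sketch of the shared-type-parameter idea, using Comparable instead of MyObject so that it compiles without extra classes (SameTypeArgs and larger are invented names). One caveat worth knowing: with a plain bound such as T extends MyObject, the compiler is free to infer a common supertype for two different subclasses, so the "same type" guarantee is strongest with a self-referential bound like the one below.

```java
class SameTypeArgs {
    // Both arguments and the result share one T, and T must be
    // comparable to itself.
    static <T extends Comparable<T>> T larger(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(larger(3, 7));            // T is Integer
        System.out.println(larger("pear", "apple")); // T is String

        // Mixing types does not compile, because no single T fits both:
        // larger(3, "apple");
    }
}
```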
Hope all of this makes sense.
I once gave a talk on this topic. You can find my slides, code, and audio recording at http://www.adventuresinsoftware.com/generics/.
Using generics for collections is just simple and clean. Even if you punt on it everywhere else, the gain from the collections is a win to me.
List<Stuff> stuffList = getStuff();
for (Stuff stuff : stuffList) {
    stuff.doSomething(); // "do" is a reserved word in Java, so the method needs another name
}
vs
List stuffList = getStuff();
Iterator i = stuffList.iterator();
while (i.hasNext()) {
    Stuff stuff = (Stuff) i.next();
    stuff.doSomething();
}
or
List stuffList = getStuff();
for (int i = 0; i < stuffList.size(); i++) {
    Stuff stuff = (Stuff) stuffList.get(i);
    stuff.doSomething();
}
That alone is worth the marginal "cost" of generics, and you don't have to be a generic Guru to use this and get value.
Generics also give you the ability to create more reusable objects/methods while still providing type-specific support. You also gain a lot of performance in some cases. I don't know the full spec of Java generics, but in .NET I can specify constraints on the type parameter, such as implementing an interface, having a constructor, or deriving from a base class.
Enabling programmers to implement generic algorithms - By using generics, programmers can implement generic algorithms that work on collections of different types, can be customized, and are type-safe and easier to read.
Stronger type checks at compile time - A Java compiler applies strong type checking to generic code and issues errors if the code violates type safety. Fixing compile-time errors is easier than fixing runtime errors, which can be difficult to find.
Elimination of casts.

C# Dynamic Keyword — Run-time penalty?

Does defining an instance as dynamic in C# mean:
1. The compiler does not perform compile-time type checking, but run-time checking takes place like it always does for all instances.
2. The compiler does not perform compile-time type checking, but run-time checking takes place, unlike with any other non-dynamic instances.
3. Same as 2, and this comes with performance penalty (trivial? potentially significant?).
The question is very confusing.
Does defining an instance as dynamic in C# mean:
By "defining an instance" do you mean "declaring a variable"?
The compiler does not perform compile-time type checking, but run-time checking takes place like it always does for all instances.
What do you mean by "run-time checking like it always does"? What run-time checking did you have in mind? Are you thinking of the checking performed by the IL verifier, or are you thinking of runtime type checks caused by casts, or what?
Perhaps it would be best to simply explain what "dynamic" does.
First off, dynamic is from the perspective of the compiler a type. From the perspective of the CLR, there is no such thing as dynamic; by the time the code actually runs, all instances of "dynamic" have been replaced with "object" in the generated code.
The compiler treats expressions of type dynamic exactly as expressions of type object, except that all operations on the value of that expression are analyzed, compiled and executed at runtime based on the runtime type of the instance. The goal is that the code executed has the same semantics as if the compiler had known the runtime types at compile time.
Your question seems to be about performance.
The best way to answer performance questions is to try it and find out - what you should do if you need hard numbers is to write the code both ways, using dynamic and using known types, and then get out a stopwatch and compare the timings. That's the only way to know.
However, let's consider the performance implications of some operations at an abstract level. Suppose you have:
int x = 123;
int y = 456;
int z = x + y;
Adding two integers takes about a billionth of a second on most hardware these days.
What happens if we make it dynamic?
dynamic x = 123;
dynamic y = 456;
dynamic z = x + y;
Now what does this do at runtime? This boxes 123 and 456 into objects, which allocates memory on the heap and does some copies.
Then it starts up the DLR and asks the DLR "has this code site been compiled once already with the types for x and y being int and int?"
The answer in this case is no. The DLR then starts up a special version of the C# compiler which analyzes the addition expression, performs overload resolution, and spits out an expression tree describing the lambda which adds together two ints. The DLR then compiles that lambda into dynamically generated IL, which the jit compiler then jits. The DLR then caches that compiled state so that the second time you ask, the compiler doesn't have to do all that work over again.
That takes longer than a nanosecond. It takes potentially many thousands of nanoseconds.
Does that answer your questions? I don't really understand what you're asking here but I'm making a best guess.
As far as I know, the answer is 3.
You can do this:
dynamic x = GetMysteriousObject();
x.DoLaundry();
Since the compiler does no type checking on x, it will compile this code, the assumption being that you know what you're doing.
But this means extra run-time checking has to occur: namely, examining x's type, seeing if it has a DoLaundry method accepting no arguments, and executing it.
In other words the above code is sort of like doing this (I'm not saying it's the same, just drawing a comparison):
object x = GetMysteriousObject();
MethodInfo doLaundry = x.GetType().GetMethod(
"DoLaundry",
BindingFlags.Instance | BindingFlags.Public
);
doLaundry.Invoke(x, null);
This is definitely not trivial, though that isn't to say you're going to be able to see a performance issue with your naked eye.
I believe the implementation of dynamic involves some pretty sweet behind-the-scenes caching that gets done for you, so that if you run this code again and x is the same type, it'll run a lot faster.
Don't hold me to that, though. I don't have all that much experience with dynamic; this is merely how I understand it to work.
Declaring a variable as dynamic is similar to declaring it as object. Dynamic simply gets another flag indicating that member resolution gets deferred to run-time.
In terms of the performance penalty - it depends on what the underlying object is. That's the whole point of dynamic objects right? The underlying object can be a Ruby or Python object or it can be a C# object. The DLR will figure out at run-time how to resolve member calls on this object and this resolution method will determine the performance penalty.
Having said that - there definitely is a performance penalty.
That's why we're not simply going to start using dynamic objects all over the place.
I made a simple test: 100000000 assignments to a variable as a dynamic vs. the same number of direct double assignments, something like
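int numberOfIterations = 100000000;
Stopwatch sw = new Stopwatch();
// Loop 1: assignment through dynamic.
sw.Start();
for (int i = 0; i < numberOfIterations; i++)
{
    var x = (dynamic)2.87;
}
sw.Stop();
long dynamicMs = sw.ElapsedMilliseconds; // read the timing before the stopwatch is reset
// Loop 2: plain double assignment.
sw.Restart();
for (int i = 0; i < numberOfIterations; i++)
{
    double y = 2.87;
}
sw.Stop();
long staticMs = sw.ElapsedMilliseconds;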
The first loop (with dynamic) took some 500 ms; the second about 200 ms. Of course, the performance loss depends on what you do in your loops; these represent the simplest possible action.
Well, the variable is statically typed to be of the type dynamic but beyond that the compiler doesn't do any checking as far as I know.
Type binding is done at runtime and yes, there's a penalty, but if dynamic is the only option then so what. If you can solve the problem using static typing do so. That being said, the DLR does call site caching which means some of the overhead is reduced as the plumbing can be reused in some cases.
As far as I understand dynamic, it only bypasses the compile-time check. Resolution of the type happens at runtime, as it does for all types, so I don't think there is any performance penalty associated with it.

C# compiler doesn’t optimize unnecessary casts

A few days back, while writing an answer for this question here on Stack Overflow, I was a bit surprised by the C# compiler, which wasn't doing what I expected it to do. Look at the following two code snippets:
First:
object[] array = new object[1];
for (int i = 0; i < 100000; i++)
{
    ICollection<object> col = (ICollection<object>)array;
    col.Contains(null);
}
Second:
object[] array = new object[1];
for (int i = 0; i < 100000; i++)
{
    ICollection<object> col = array;
    col.Contains(null);
}
The only difference in code between the two snippets is the cast to ICollection<object>. Because object[] implements the ICollection<object> interface explicitly, I expected the two snippets to compile down to the same IL and be, therefore, identical. However, when running performance tests on them, I noticed the latter to be about 6 times as fast as the former.
After comparing the IL from both snippets, I noticed that both methods were identical, except for a castclass IL instruction in the first snippet.
Surprised by this, I now wonder why the C# compiler isn't 'smart' here. Things are never as simple as they seem, so why is the C# compiler a bit naïve here?
My guess is that you have discovered a minor bug in the optimizer. There is all kinds of special-case code in there for arrays. Thanks for bringing it to my attention.
This is a rough guess, but I think it's about the Array's relationship to its generic IEnumerable.
"In the .NET Framework version 2.0, the Array class implements the System.Collections.Generic.IList<T>, System.Collections.Generic.ICollection<T>, and System.Collections.Generic.IEnumerable<T> generic interfaces. The implementations are provided to arrays at run time, and therefore are not visible to the documentation build tools. As a result, the generic interfaces do not appear in the declaration syntax for the Array class, and there are no reference topics for interface members that are accessible only by casting an array to the generic interface type (explicit interface implementations). The key thing to be aware of when you cast an array to one of these interfaces is that members which add, insert, or remove elements throw NotSupportedException."
See MSDN Article.
It's not clear whether this relates to .NET 2.0+, but in this special case it would make perfect sense why the compiler cannot optimize your expression if it only becomes valid at run time.
This looks like nothing more than a missed opportunity in the compiler to suppress the cast. It will work if you write it like this:
ICollection<object> col = array as ICollection<object>;
which suggests that it gets too conservative because casts can throw exceptions. However, it does work when you cast to the non-generic ICollection. I'd conclude that they simply overlooked it.
There's a bigger optimization issue at work here: the JIT compiler doesn't apply the loop-invariant hoisting optimization. It should have rewritten the code like this:
object[] array = new object[1];
ICollection<object> col = (ICollection<object>)array;
for (int i = 0; i < 100000; i++)
{
    col.Contains(null);
}
This is a standard optimization in the C/C++ code generator, for example. Still, the JIT optimizer can't burn a lot of cycles on the kind of analysis required to discover such possible optimizations. The happy angle on this is that optimized managed code is still quite debuggable, and that there is still a role for the C# programmer in writing performant code.
