I would like to know whether the usage of Attributes in .Net, specifically C#, is expensive, and why or why not?
I am asking about C# specifically, unless there is no difference between the different .Net languages (because the base class libraries are the same?).
All the newer .Net technologies make extensive use of attributes, such as Linq to SQL, ASP.Net MVC, WCF, Enterprise Library, etc, and I was wondering what effect this would have on performance. A lot of the classes get automatically decorated with certain Attributes, or these attributes are required for certain functionality/features.
Does the question of expense depend on implementation specific details? How are Attributes compiled to IL? Are they cached automatically, or is this up to the implementor?
"The usage of attributes" is too vague. Fetching the attributes is a reflection operation effectively - you wouldn't want to regularly do it in a loop - but they're not expensive to include in the metadata, and the typical usage pattern (IMO) is to build some other representation (e.g. an in-memory schema) after reading the attributes once.
There may well be some caching involved, but I'd probably cache the other representation anyway. For example, if I were decorating enum values with descriptions, I'd generally fetch the attributes once to build a string to enum dictionary (or vice versa).
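A minimal sketch of that pattern (the Status enum and its descriptions are hypothetical, and every member is assumed to carry a DescriptionAttribute); the reflection happens exactly once, when the dictionary is built:

    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Linq;
    using System.Reflection;

    enum Status
    {
        [Description("Awaiting approval")] Pending,
        [Description("Approved and active")] Active
    }

    static class StatusDescriptions
    {
        // Reflect over the enum fields once and cache the result;
        // later lookups never touch reflection again.
        private static readonly Dictionary<Status, string> Cache =
            typeof(Status).GetFields(BindingFlags.Public | BindingFlags.Static)
                .ToDictionary(
                    f => (Status)f.GetValue(null),
                    f => f.GetCustomAttribute<DescriptionAttribute>().Description);

        public static string Describe(Status value)
        {
            return Cache[value];
        }
    }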
It depends on how you use them... Some attributes are just for informational purposes (ObsoleteAttribute, for instance), so they don't have any impact on runtime performance. Other attributes are used by the compiler (like DllImportAttribute) or by post-compilers like PostSharp, so the cost is paid at compile time, not run time. However, if you use reflection to inspect attributes at runtime, it can be expensive.
As shown here, attribute constructors are not called until you reflect to get the attribute values. However, as you may also know, you can only pass compile-time constant values to attribute constructors. Why is this? I think many people would much prefer to do something like this:
[MyAttribute(new MyClass(foo, bar, baz, jQuery))]
than passing a string (which also makes the code stringly typed) with those values turned into strings, then relying on a Regex to pull the values back out instead of just using the actual values, and relying on runtime exceptions thrown somewhere that has nothing to do with the class (except that a method it called uses some attributes that were typed wrong) instead of compile-time warnings/errors.
What limitation caused this?
Attributes are part of metadata. You need to be able to reflect on metadata in an assembly without running code in that assembly.
Imagine for example that you are writing a compiler that needs to read attributes from an assembly in order to compile some source code. Do you really want the code in the referenced assembly to be loaded and executed? Do you want to put a requirement on compiler writers that they write compilers that can run arbitrary code in referenced assemblies during the compilation? Code that might crash, or go into infinite loops, or contact databases that the developer doesn't have permission to talk to? The number of awful scenarios is huge and we eliminate all of them by requiring that attributes be dead simple.
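A small illustration of that metadata-only distinction (not from the answer itself; the NoisyAttribute and Target types are hypothetical): GetCustomAttributesData reads the constructor arguments recorded in the metadata without ever running the attribute's constructor, whereas GetCustomAttributes instantiates the attribute and does run it.

    using System;
    using System.Reflection;

    // Hypothetical attribute whose constructor announces itself.
    [AttributeUsage(AttributeTargets.Class)]
    sealed class NoisyAttribute : Attribute
    {
        public NoisyAttribute(string note)
        {
            Console.WriteLine("NoisyAttribute constructor ran");
        }
    }

    [Noisy("stored in metadata")]
    class Target { }

    class Program
    {
        static void Main()
        {
            // Reads the recorded constructor argument straight from metadata;
            // nothing extra is printed because no attribute instance is created.
            foreach (CustomAttributeData data in typeof(Target).GetCustomAttributesData())
                Console.WriteLine(data.ConstructorArguments[0].Value);

            // This, by contrast, instantiates the attribute and runs its constructor.
            typeof(Target).GetCustomAttributes(false);
        }
    }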
The issue is with the constructor arguments. They need to come from somewhere, they are not supplied by code that consumes the attribute. They must be supplied by the Reflection plumbing when it creates the attribute object by calling its constructor. For which it needs the constructor argument values.
This starts at compile time, with the compiler parsing the attribute and recording the constructor arguments. It stores those argument values in the assembly metadata in a binary format. At issue then is that the runtime needs a highly standardized way to deserialize those values, one that preferably doesn't depend on any of the .NET classes that you'd normally use to de/serialize data. There's no guarantee that such classes are actually available at runtime; they won't be in a very trimmed-down version of .NET like the Micro Framework.
Even something as common as binary serialization with the BinaryFormatter class is troublesome; note how it requires the [Serializable] attribute on the class to allow it to do its job. Versioning would also be an enormous problem: clearly such a serializer class could never change, for the risk of breaking attributes in old assemblies.
This is a rock-and-a-hard-place problem, solved by the CLS designers by heavily restricting the allowed types for an attribute constructor. They didn't leave much: just the simple value types, string, a simple one-dimensional array of them, and Type. Never a problem deserializing them, since their binary representation is simple and unambiguous. Quite a restriction, but attributes can still be pretty expressive. The ultimate fallback is to use a string and decode that string in the constructor at runtime. Creating an object of MyClass isn't an issue; you can do so in the attribute constructor. You'll have to encode the arguments that this constructor needs as properties of the attribute, however.
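A sketch of that string fallback: a hypothetical RangeDefaultsAttribute packs two values into one compile-time constant string and decodes it in its constructor.

    using System;

    [AttributeUsage(AttributeTargets.Class)]
    sealed class RangeDefaultsAttribute : Attribute
    {
        public int Min { get; private set; }
        public int Max { get; private set; }

        // Only compile-time constants can appear at the usage site,
        // so the two bounds are packed into one string and decoded here.
        public RangeDefaultsAttribute(string encoded)
        {
            var parts = encoded.Split(':');
            Min = int.Parse(parts[0]);
            Max = int.Parse(parts[1]);
        }
    }

    [RangeDefaults("1:100")]
    class Widget { }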
Probably the most correct answer as to why you can only use constants for attributes is that the C#/BCL design team did not judge supporting anything else important enough to be added (i.e. not worth the effort).
When you build, the C# compiler will instantiate the attributes you have placed in your code and serialize them, so that they can be stored in the generated assembly. It was probably more important to ensure that attributes can be retrieved quickly and reliably than it was to support more complex scenarios.
Also, code that fails because some attribute property value is wrong is much easier to debug than some framework-internal deserialization error. Consider what would happen if the class definition for MyClass was defined in an external assembly - you compile and embed one version, then update the class definition for MyClass and run your application: boom!
On the other hand, it's seriously frustrating that DateTime instances are not constants.
What limitation caused this?
The reason it isn't possible to do what you describe is probably not any technical limitation; it's purely a language design decision. Basically, when designing the language they said "this should be possible, but not this". If they had really wanted this to be possible, the "limitations" would have been dealt with and it would be possible. I don't know the specific reasoning behind this decision, though.
/.../ passing a string (causing stringly typed code too!) with those values, turned into strings, and then relying on Regex to try and get the value instead of just using the actual value /.../
I have been in similar situations. I sometimes wanted to use attributes with lambda expressions to implement something in a functional way. But after all, C# is not a functional language, and when I wrote the code in a non-functional way I didn't need such attributes.
In short, I think of it like this: if I want to develop this in a functional way, I should use a functional language like F#. Since I'm using C# and doing it in a non-functional way, I don't need such attributes.
Perhaps you should simply reconsider your design and not use the attributes like you currently do.
UPDATE 1:
I claimed C# is not a functional language, but that is a subjective view and there is no rigorous definition of "functional language". I agree with Adam Wright, "/.../ As such, I wouldn't class C# as functional in general discussion - it is, at best, multi-paradigm, with some functional flavour." at Why is C# a functional programming language?
UPDATE 2:
I found this post by Jon Skeet: https://stackoverflow.com/a/294259/1105687 It regards not allowing generic attribute types, but the reasoning could be similar in this case:
Answer from Eric Lippert (paraphrased): no particular reason, except to avoid complexity in both the language and compiler for a use case which doesn't add much value.
I am looking to build a small application to talk with a ruby msgpack server in C#. My only holdup so far is that the API behind the server is expecting to pull out a ruby hash. Can I use a simple dictionary/key-value pair type in C#? If not, what would you suggest?
I will be using the library mentioned on the msgpack website (http://wiki.msgpack.org/display/MSGPACK/QuickStart+for+C+Sharp). However, it only seems to support primitive types? I have tried to go the IronRuby way, however there is a very crippling bug in mono that prevents you from using it. https://bugzilla.xamarin.com/show_bug.cgi?id=2770
It is normal that different parts of a system are built using different technology stacks. Because these parts should be able to talk to each other (one way or another), it is important to specify contracts between subsystems.
It is really important to think about these contracts first, as these parts of your system (subsystems) can be (and will be, no doubt) subject to change, due to evolving business logic, bug fixes, etc.
By having these contracts you allow subsystems to be changed independently without impacting all their "clients" (other subsystems). Otherwise you will end up with "I need to fix this, but it may affect tonnes of places I don't even know about" syndrome.
And as long as you honour the contract, you can do whatever you want within the given subsystem, which is just heaven! :)
This means that instead of "pulling out the ruby hash" you normally want to define a platform-agnostic contract that is expressed in terms of the business logic of your application. This contract can then be consumed by any other subsystem written in any technology.
It also means that instead of just passing some data between subsystems you want to pass objects. These objects not only contain the data you want to pass, but also describe that data and give it some meaning: the object type, property names, etc. Objects are self-descriptive, you know.
You might declare the contract for your ruby subsystem as saying "I accept these queries and I return these results". Both the query (method) and the result (object) should be formulated in terms of the business logic of the specified subsystem. For example, a GetProducts contract should probably return a list of Product objects, not some meaningless "ruby hashes". So all the consumers will know what the contract is and what to expect.
You can make it a standard then, saying "between subsystems all the objects passed are serialized to JSON (or XML)", which is more than trivial in Ruby, C# or any other language, as well as truly platform-agnostic.
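For instance, a minimal sketch of what the C# side of such a GetProducts contract might look like, assuming Json.NET is available and using a hypothetical Product contract type:

    using System.Collections.Generic;
    using Newtonsoft.Json;   // Json.NET, assumed available via NuGet

    // The contract both subsystems agree on, expressed in business terms.
    public class Product
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public decimal Price { get; set; }
    }

    public class ProductService
    {
        // Serialize the contract objects to JSON before handing them to the
        // Ruby subsystem; it deserializes the same shape on its own side.
        public string GetProductsAsJson(IEnumerable<Product> products)
        {
            return JsonConvert.SerializeObject(products);
        }
    }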
Therefore, back to your question: you normally just don't have problems in your life such as translating ruby types into .NET types using some buggy libraries, or doing similar crazy things :)
Simply defining contracts and standardizing the transport (JSON?) helps you in many ways, starting from getting rid of this problem and all the way through to having a clean and easily maintainable system.
Could anyone explain the benefits of (or reasons for) using custom attributes in your code? Of course I use (and understand the purpose of) predefined attributes in certain scenarios (WCF, serialization, etc.), but I cannot imagine any algorithms where I would need to create and use my own custom attributes. Could someone provide a real-world case where custom-defined attributes bring something to a project?
The same reason as for WCF etc, but something that's specific to your project - you want to add some metadata to some members (types, fields, methods, whatever) to specify something about the mechanism involved, and it's not something which is covered by existing attributes.
For example, NUnit wanted to add their own indication that a particular type contained unit tests - there was no such existing attribute, so they created TestFixtureAttribute.
It's a relatively rare event, sure - but it can happen.
If you want to write your own system like WCF, Serialization, etc...
If you write code that iterates over types or members and does things with them, you will frequently want to use your own custom attributes to mark some members as being different or special.
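A minimal sketch of that idea, using a hypothetical ExportedAttribute marker and a reflection scan over the current assembly:

    using System;
    using System.Linq;
    using System.Reflection;

    // A project-specific marker: "this type should be picked up by the plugin loader".
    [AttributeUsage(AttributeTargets.Class)]
    sealed class ExportedAttribute : Attribute { }

    [Exported]
    class CsvImporter { }

    class Scanner
    {
        static void Main()
        {
            // Find every type in this assembly carrying the marker.
            var exported = Assembly.GetExecutingAssembly()
                .GetTypes()
                .Where(t => t.IsDefined(typeof(ExportedAttribute), inherit: false));

            foreach (var type in exported)
                Console.WriteLine("Found exported type: " + type.Name);
        }
    }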
I regularly use custom .NET attributes to support tooling in my infrastructure. One example was from very early in the .NET days (C# 1.0 to be exact). I was working on a research project which had a native C++ front end and a brand new C# back end written by yours truly.
The front and back end shared a very similar object model which was evolving very rapidly. Not wanting to hand-code a C++ front end model, a C++ serialization mechanism and a C# serialization mechanism, I chose instead to decorate my C# types with custom attributes. They told me which parts of the model were shared between the front and back end.
Once those attributes were in place I wrote a quick and dirty tool which
Parsed out the attributes to construct the core shared model
Generated the C# serialization code
Generated the C++ code
Generated the C++ serialization code
This made it dirt simple to keep my model up to date between my 2 projects. Just change the C# code, compile and re-run my tool.
I have used annotations in a custom AOP (Aspect-Oriented Programming) system I developed a while back. Attributes are also very useful for controlling orthogonal concerns like code generation.
Custom validation is a very good use case and can be seen from these links:
http://odetocode.com/blogs/scott/archive/2011/02/21/custom-data-annotation-validator-part-i-server-code.aspx
How to create Custom Data Annotation Validators
They can be used for marking tests, as in MBUnit for example. They can also be useful for code that inspects and loads classes (like a Plugin system) to provide meta-information.
They are really useful in building object mappers / ORM tools as well. If you ever decide to roll your own mapping system they are almost "required" to get all the functionality one would need. It's used more for making methods / classes more generic and using reflection to determine how to handle objects / select objects /etc...
To give you a specific case where I've used them: I once had to interact with a mainframe screen scraper. I created a custom attribute to annotate which fields I wanted to send from my classes to the mainframe, names that fell outside of conventions, and special rules to deal with formatting and collections. I then had a class which was able to reflect over instances and work out which subset of fields was needed to interact with the mainframe screen scraper appropriately.
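A rough sketch of that kind of mapping (the attribute, types and field names here are hypothetical, not the original code):

    using System;
    using System.Reflection;

    // Hypothetical attribute mapping a property to a mainframe screen field.
    [AttributeUsage(AttributeTargets.Property)]
    sealed class ScreenFieldAttribute : Attribute
    {
        public string FieldName { get; private set; }
        public string Format { get; private set; }

        public ScreenFieldAttribute(string fieldName, string format = null)
        {
            FieldName = fieldName;
            Format = format;
        }
    }

    class Customer
    {
        [ScreenField("CUST-NAME")]
        public string Name { get; set; }

        [ScreenField("CUST-BAL", "0000000.00")]
        public decimal Balance { get; set; }

        public string InternalNotes { get; set; }   // not sent to the mainframe
    }

    static class ScreenMapper
    {
        // Only properties carrying the attribute are sent to the screen scraper.
        public static void Send(object instance)
        {
            foreach (var prop in instance.GetType().GetProperties())
            {
                var field = prop.GetCustomAttribute<ScreenFieldAttribute>();
                if (field == null) continue;
                object value = prop.GetValue(instance, null);
                Console.WriteLine(field.FieldName + " = " + value);
            }
        }
    }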
I am planning to use the dynamic keyword in my new project. But before stepping in, I would like to know about the pros and cons of using the dynamic keyword over Reflection.
Following are the pros I could find with respect to the dynamic keyword:
Readable\Maintainable code.
Fewer lines of code.
The negatives I have heard associated with using the dynamic keyword are:
Affects application performance.
Dynamic keyword is internally a wrapper of Reflection.
Dynamic typing might turn into breeding ground for hard to find bugs.
Affects interoperability with previous .NET versions.
Please help me on whether the pros and cons I came across are sensible or not?
Please help me on whether the pros and cons I came across are sensible or not?
The concern I have with your pros and cons is that some of them do not address differences between using reflection and using dynamic. That dynamic typing makes for bugs that are not caught until runtime is true of any dynamic typing system. Reflection code is just as likely to have a bug as code that uses the dynamic type.
Rather than thinking of it in terms of pros and cons, think about it in more neutral terms. The question I'd ask is "What are the differences between using Reflection and using the dynamic type?"
First: with Reflection you get exactly what you asked for. With dynamic, you get what the C# compiler would have done had it been given the type information at compile time. Those are potentially two completely different things. If you have a MethodInfo to a particular method, and you invoke that method with a particular argument, then that is the method that gets invoked, period. If you use "dynamic", then you are asking the DLR to work out at runtime what the C# compiler's opinion is about which is the right method to call. The C# compiler might pick a method different than the one you actually wanted.
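A small sketch of that difference, using a hypothetical Printer class with two overloads: the reflection call invokes exactly the MethodInfo you retrieved, while the dynamically bound call re-runs C# overload resolution at runtime with the argument's runtime type.

    using System;
    using System.Reflection;

    class Printer
    {
        public void Print(object value) { Console.WriteLine("object overload"); }
        public void Print(string value) { Console.WriteLine("string overload"); }
    }

    class Demo
    {
        static void Main()
        {
            var printer = new Printer();

            // Reflection: you asked for the Print(object) overload, so that is what runs.
            MethodInfo objectOverload = typeof(Printer).GetMethod("Print", new[] { typeof(object) });
            objectOverload.Invoke(printer, new object[] { "hello" });   // "object overload"

            // dynamic: the runtime binder repeats the compiler's overload resolution
            // using the runtime type of the argument, and may pick a different method.
            dynamic arg = "hello";
            printer.Print(arg);                                          // "string overload"
        }
    }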
Second: with Reflection you can (if your code is granted suitably high levels of trust) do private reflection. You can invoke private methods, read private fields, and so on. Whether doing so is a good idea, I don't know. It certainly seems dangerous and foolish to me, but I don't know what your application is. With dynamic, you get the behaviour that you'd get from the C# compiler; private methods and fields are not visible.
Third: with Reflection, the code you write looks like a mechanism. It looks like you are loading a metadata source, extracting some types, extracting some method infos, and invoking methods on receiver objects through the method info. Every step of the way looks like the operation of a mechanism. With dynamic, every step of the way looks like business logic. You invoke a method on a receiver the same way as you'd do it in any other code. What is important? In some code, the mechanism is actually the most important thing. In some code, the business logic that the mechanism implements is the most important thing. Choose the technique that emphasises the right level of abstraction.
Fourth: the performance costs are different. With Reflection you do not get any cached behaviour, which means that operations are generally slower, but there is no memory cost for maintaining the cache and every operation is roughly the same cost. With the DLR, the first operation is very slow indeed as it does a huge amount of analysis, but the analysis is cached and reused. That consumes memory, in exchange for increased speed in subsequent calls in some scenarios. What the right balance of speed and memory usage is for your application, I don't know.
Readable\Maintainable code
Certainly true in my experience.
Fewer lines of code.
Not significantly, but it will help.
Affects application performance.
Very slightly. But not even close to the way reflection does.
Dynamic keyword is internally a wrapper of Reflection.
Completely untrue. The dynamic keyword leverages the Dynamic Language Runtime (DLR).
[Edit: correction as per comment below]
It would seem that the Dynamic Language Runtime does use Reflection, and the performance improvements are only due to caching techniques.
Dynamic typing might turn into breeding ground for hard to find bugs.
This may be true; it depends how you write your code. You are effectively removing compiler checking from your code. If your test coverage is good, this probably won't matter; if not then I suspect you will run into problems.
Affects interoperability with previous .NET versions
Not true. I mean you won't be able to compile your code against older versions, but if you want to do that then you should use the old versions as a base and up-compile it rather than the other way around. But if you want to use a .NET 2 library then you shouldn't run into too many problems, as long as you include the declaration in app.config / web.config.
One significant pro that you're missing is the improved interoperability with COM/ATL components.
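For example, a late-bound COM automation sketch (Windows only, Excel assumed to be installed): with dynamic there is no need for InvokeMember-style reflection calls or interop assemblies here.

    using System;

    class ComDemo
    {
        static void Main()
        {
            // Resolve the COM class by ProgID and drive it late-bound via dynamic.
            Type excelType = Type.GetTypeFromProgID("Excel.Application");
            dynamic excel = Activator.CreateInstance(excelType);

            excel.Visible = true;
            excel.Workbooks.Add();
            excel.Quit();
        }
    }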
There are four big differences between dynamic and Reflection. Below is a detailed explanation of each. Reference: http://www.codeproject.com/Articles/593881/What-is-the-difference-between-Reflection-and-Dyna
Point 1. Inspect VS Invoke
Reflection can do two things: it can inspect metadata, and it can also invoke methods at runtime. With dynamic we can only invoke methods. So if I am creating software like the Visual Studio IDE, then Reflection is the way to go. If I just want dynamic invocation from my C# code, dynamic is the best option.
Point 2. Private Vs Public Invoke
You cannot invoke private methods using dynamic. With Reflection it is possible to invoke private methods.
Point 3. Caching
Dynamic uses Reflection internally and also adds caching benefits. So if you just want to invoke an object dynamically, dynamic is the best choice, as you get performance benefits.
Point 4. Static classes
Dynamic is instance specific: You don't have access to static members; you have to use Reflection in those scenarios.
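A short sketch of Point 2: private member access, which Reflection allows (given sufficient trust) but dynamic does not. The Vault class is hypothetical.

    using System;
    using System.Reflection;

    class Vault
    {
        private string Combination() { return "12-34-56"; }
    }

    class Demo
    {
        static void Main()
        {
            var vault = new Vault();

            // dynamic respects accessibility: a dynamic reference to vault cannot
            // call Combination(). Reflection, with sufficient trust, can:
            MethodInfo secret = typeof(Vault).GetMethod(
                "Combination", BindingFlags.NonPublic | BindingFlags.Instance);
            Console.WriteLine(secret.Invoke(vault, null));   // "12-34-56"
        }
    }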
In most cases, using the dynamic keyword will not result in meaningfully shorter code. In some cases it will; that depends on the provider and as such it's an important distinction. You should probably never use the dynamic keyword to access plain CLR objects; the benefit there is too small.
The dynamic keyword undermines automatic refactoring tools and makes high-coverage unit tests more important; after all, the compiler isn't checking much of anything when you use it. That's not as much of an issue when you're interoperating with a very stable or inherently dynamically typed API, but it's particularly nasty if you use keyword dynamic to access a library whose API might change in the future (such as any code you yourself write).
Use the keyword sparingly, where it makes sense, and make sure such code has ample unit tests. Don't use it where it's not needed or where type inference (e.g. var) can do the same.
Edit: You mention below that you're doing this for plug-ins. The Managed Extensibility Framework was designed with this in mind - it may be a better option than the dynamic keyword and reflection.
If you are using dynamic specifically to do reflection your only concern is compatibility with previous versions. Otherwise it wins over reflection because it is more readable and shorter. You will lose strong typing and (some) performance from the very use of reflection anyway.
The way I see it, all your cons for using dynamic, except interoperability with older .NET versions, are also present when using Reflection:
Affects application performance
While it does affect performance, so does using Reflection. From what I remember, the DLR more or less uses Reflection the first time you access a method/property of your dynamic object for a given type, and caches the type/access-target pair so that later access is just a lookup in the cache, making it faster than Reflection.
Dynamic keyword is internally a wrapper of Reflection
Even if that were true (see above), how would it be a negative point? Whether or not it wraps Reflection shouldn't influence your application in any significant manner.
Dynamic typing might turn into breeding ground for hard to find bugs
While this is true, as long as you use it sparingly it shouldn't be that much of a problem. Furthermore, if you basically use it as a replacement for reflection (that is, you use dynamic only for the briefest possible durations when you want to access something via reflection), the risk of such bugs shouldn't be significantly higher than if you use reflection to access your methods/properties (of course, if you make everything dynamic it can be more of a problem).
Affects interoperability with previous .NET versions
For that you have to decide yourself how much of a concern it is for you.
We have been using BinarySerialization with our C# app, but the size and complexity of the classes which need to be serialized results in sloooooow (de)serialization, and large files.
We suspect that we should just write our own custom serializers; but protobuf-net claims significant speed and size advantages over standard .Net binary serialization, and may be easier to add to our app than a large number of bespoke serializers.
Before spending significant time and effort trying to get it to work for us, I would love to know whether there are any deal-breakers. We are using properties defined with interfaces, generic lists of abstract sub-classes, custom bit flag enums, etc etc etc. What would stop protobuf-net working for us?
protobuf-net does what it can to adhere to the core protobuf spec, and then some (for example, it includes inheritance), however:
v1 is not very good at interface-based properties (i.e. ICustomer etc); I'm working on getting this improved in v2
v1 likes there to be a parameterless constructor (this requirement is lifted in v2)
you need to tell it how to map the model to fields; in v1 this needs to be decorated on the type (or there is an option to infer some things from the names etc); in v2 this can be done externally
in v1, flags enums are a pain; in v2 there is an option to pass-thru enums as raw integers, making it much more suitable for flags
abstracts and inheritance are fine, but you must be able to determine all the concrete types ahead of time (to map them to integer keys)
generics should be fine
jagged arrays / nested lists without intermediate types aren't OK - you can shim this by introducing an intermediate type in the middle
not all core types have inbuilt support (the new date/time offset types, for example); in "v2" you can introduce your own shims for this if necessary
it is a tree serializer, not a graph serializer; I have some thoughts there, but nothing implemented yet
If there is some limited example of what you want to serialize, I'll happily take a look to see if it is likely to work (I'm the author).
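For reference, roughly what a decorated model looks like with protobuf-net's attribute-based mapping (the types here are hypothetical, and the exact attribute surface differs between v1 and v2, so treat this as a sketch):

    using System.IO;
    using ProtoBuf;   // protobuf-net

    [ProtoContract]
    [ProtoInclude(10, typeof(Customer))]   // concrete subtypes must be known up front
    public abstract class Party
    {
        [ProtoMember(1)]
        public string Name { get; set; }
    }

    [ProtoContract]
    public class Customer : Party
    {
        [ProtoMember(1)]
        public int AccountNumber { get; set; }
    }

    class Demo
    {
        static void Main()
        {
            Party p = new Customer { Name = "Fred", AccountNumber = 42 };
            using (var stream = new MemoryStream())
            {
                // Round-trip through the protobuf wire format.
                Serializer.Serialize(stream, p);
                stream.Position = 0;
                var copy = Serializer.Deserialize<Party>(stream);
            }
        }
    }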
It's not appropriate when you have to interact with existing software / an existing standard. For example, you can't use it to communicate with an SMTP server.
Please read this blog post about protobuf-net; to quote:
What’s the catch?
In the most part, that’s it. WCF will use protobuf-net for any suitable objects (data-contracts etc). Note that this is a coarser brush than the per-operation control, though (you could always split the interface into different endpoints, of course).
Also, protobuf-net does have some subtle differences (especially regarding empty objects), so run your unit tests etc.
Note that it only works on the full-fat WCF; it won’t help Silverlight etc, since it lacks the extension features – but that isn’t new here.
Finally, the resolver in WCF is a pain, and AFAIK wants the full assembly details including version number; so one more thing to maintain when you get new versions.
Does anyone know how to get around this?