Compile Time Reflection in C#

I frequently write C# code that has to use magic strings to express property names. Everyone knows the problems with magic strings. They are very difficult to refactor, they have no compile time checking, and often they lead to hard-to-diagnose issues. Yet C#/.NET uses them all over the place to represent property/class/method names.
This issue has persisted for years and years, and the only viable solution currently is to use an expression tree which is then parsed at run-time for the property name. This gets you satisfactory compile-time checking, but it complicates the code (requiring parameters of type Expression), and it incurs a run-time cost.
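For readers who have not seen the expression-tree workaround mentioned above, a minimal sketch might look like the following. The names PropertyName, Of, and Person are hypothetical and chosen only for illustration; the approach assumes the lambda is a simple member access.

```csharp
using System;
using System.Linq.Expressions;

public static class PropertyName
{
    // Hypothetical helper: extracts the member name from a lambda such as p => p.LastName.
    public static string Of<T, TValue>(Expression<Func<T, TValue>> selector)
    {
        Expression body = selector.Body;

        // When a value-type member is converted to object, the member access
        // is wrapped in a Convert node, so unwrap it first.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
            body = unary.Operand;

        var member = body as MemberExpression;
        if (member == null)
            throw new ArgumentException("Expected a simple member access.");

        return member.Member.Name;
    }
}

// Hypothetical model class used only to demonstrate the helper.
public class Person
{
    public string LastName { get; set; }
    public int Age { get; set; }
}
```

A call such as PropertyName.Of((Person p) => p.LastName) is checked by the compiler and survives a rename refactoring, but the tree is still built and inspected at run time, which is the cost described above.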
Does anyone know if there has ever been a feature consideration for C#/.NET to add compile-time reflection to overcome this pervasive problem?
It seems like it would be an easy addition to make, it would be a non-breaking change, and it would greatly benefit many developers. The typeof() operator already performs a form of compile-time reflection, so it seems like an operator nameof() (or something similar) would be very complementary.
In addition, does anyone know of any potential issues with such a feature?
Thanks for the help.

Straight from the source: this is a blog post by a C# language designer in which the "User" asks the same questions as you, and they are answered. The author says you would need to specify a syntax for every metadata item you might want to ask for, and that is not trivial. For example, if you want an "info-of" for a method and the method is overloaded, which overload do you want? What if generics and explicit interface implementations are involved? And so on. It turns out that, while it wasn't deemed worthy of implementation in 2009 for those reasons, we will get it in C# 6 in 2015; see the C# Language Design Notes for Jul 9, 2014.

In C# 6.0, a new operator, nameof, is being added that will allow you to get the names of properties, classes, fields, events, and variables at compile time.
Link to the design notes
No more reflection for information the compiler already knows at design time!
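As a rough illustration of how nameof reads in practice, here is a small sketch. The Customer and NameofDemo types are hypothetical and exist only for this example.

```csharp
using System;

// Hypothetical class used only to show nameof in context.
public class Customer
{
    public string LastName { get; set; }

    public void Rename(string newName)
    {
        // nameof(newName) is replaced by the string "newName" at compile time,
        // so renaming the parameter updates the exception text automatically.
        if (newName == null)
            throw new ArgumentNullException(nameof(newName));
        LastName = newName;
    }
}

public static class NameofDemo
{
    // Only the last identifier is used, so this yields "LastName".
    public static string PropertyName() { return nameof(Customer.LastName); }

    // Works for types as well; this yields "Customer".
    public static string TypeName() { return nameof(Customer); }
}
```

Because the result is a compile-time constant, it can also be used in places that require constants, such as attribute arguments.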

I was having a similar problem. I only recently discovered that .NET Framework 4.5 has a feature called Caller Info attributes. By using these, you can obtain information about the caller of a method at compile time: the file path of the source code, the line number in the source code, and the member name of the caller.
public void DoProcessing()
{
    TraceMessage("Something happened.");
}

public void TraceMessage(string message,
    [CallerMemberName] string memberName = "",
    [CallerFilePath] string sourceFilePath = "",
    [CallerLineNumber] int sourceLineNumber = 0)
{
    Trace.WriteLine("message: " + message);
    Trace.WriteLine("member name: " + memberName);
    Trace.WriteLine("source file path: " + sourceFilePath);
    Trace.WriteLine("source line number: " + sourceLineNumber);
}

Yet C#/.NET uses them all over the place to represent property/class/method names.
First off: I disagree. There are certain frameworks (WebForms, e.g.) that use magic strings all over the place, but the base libraries for C# and .NET tend to avoid such things remarkably well.
Secondly: In many instances where magic strings are used, ReSharper is able to recognize errors. This can help quite a bit.
Finally: What you're asking for may be possible via the Roslyn compiler, which promises to provide a "compiler as a service."

Related

How to weave C# code to intercept call to constructors ? Maybe a custom preprocessor or Roslyn

Is there any solution similar to [PostSharp] or [Infuse - A Precompiler for C#] that lets me modify code at compile time?
The code below is pseudocode.
[InterceptCallToConstructors]
void Method1(){
    Person Eric = new Person("Eric Bush");
}
InterceptCallToConstructors(ConstructorMethodArgs args){
    if(args.Type == typeof(Person))
        if(PersonInstances++ > 10 ) args.ReturnValue = null;
}
In this example, Eric should not receive a new Person instance if more than 10 Person instances have been created.
After some research I found two solutions: PostSharp and Infuse.
With Infuse it is very complicated and hard to detect how many instances of Person have been made, whereas with PostSharp it takes one line of code.
I have tried to go AOP with PostSharp, but PostSharp currently doesn't support an aspect that intercepts calls to constructors.
As far as I have read, Roslyn doesn't support modifying code at compile time.
This would be a "custom preprocessor" answer, that modifies the source code to achieve OP's effect.
Our DMS Software Reengineering Toolkit with its C# Front End can do this.
DMS provides for source to source transformations, with the transformations coded as
if you see *this*, replace it by *that*
This is written in the form:
rule xxx pattern_parameters
this_pattern
-> that_pattern ;
The "->" is pronounced "replace by" :-}
DMS operates on ASTs, so includes a parsing step (text to ASTs), a tree transformation step, and a prettyprinting step that produces the final answer (ASTs to text).
OP seems to want to modify the constructor call site (he can't modify the constructor; there's no way to get it to return "null"). To accomplish OP's task, he would provide DMS the following source-to-source transformation specification:
default domain CSharp~v5; -- says we are working with C# syntax (and need the C# front end)
rule intercept_constructor(c: IDENTIFIER, a:arguments): expression
" new \c (\a) "
-> " \c.PersonInstances==10?null:(PersonInstances++,new \c (\a)) "
if c == "Person"; -- one might want to force c to be on some qualified path
What the rule does is find matching constructor call syntax of arbitrary form, and replace it by a conditional expression that checks OP's precondition, returning null if there are too many Person instances (we fix a bug in OP's spec here; he appears to increment the count whether a new Person instance is created or not, surely not his intention). We have to qualify the location of PersonInstances; it can't just be floating around in the ether. In this example I'm proposing it is a static member of the class.
The details: each rule has a name ("intercept_constructor", stolen from OP). It refers to a syntactic category ("expression") with syntactic shape "new \c (\a)", forcing it to match only constructor calls that are expressions. The quotes in the rule are meta-quotes; they distinguish the syntax of the rule language from the syntax of the targeted language (C# in this case). The backslashes are meta-escapes; \c in meta-quotes is the same thing in the rule as c outside the meta-quotes, similarly for \a.
In a really big system there may be several Person classes. We want to make sure we get the right one; one might need to qualify the referenced class as being a specific one by providing a path. OP hints at this with the annotation. If one wanted to check that an annotation existed on the containing method, one would need a custom predicate to ask for that. DMS provides complete facilities for coding such a predicate, including complete access to the AST, so the predicate can climb up or down in its search for a matching annotation.
If you're running on top of the KRuntime (-> ASP.NET 5) you can hook into the compilation by implementing the ICompileModule assembly neutral interface.
I'd recommend looking at:
the aop example in the repo
this nice writeup

Why use const (or Readonly)?

While I understand the function of these 2 keywords, I do not understand why do we use them.
I did a lot of research but most of my findings only talk about WHAT and WHEN to use const or readonly or the difference between each, but none of them explain WHY. Let's take the example below:
const decimal pi = 3.142m;
decimal circumference = 2 * pi * r;
as opposed to
decimal pi = 3.142m;
decimal circumference = 2 * pi * r;
The purpose of const/readonly is to prevent people from changing the value, but it is not like the user has the chance to change the value of decimal pi, so why bother using const (or readonly)?
Please note: My question is WHY do we use const/readonly, but NOT "what are const/readonly".
Additional info: I need to clarify this one more time. I don't think the question is under-researched. I clearly understand the functionality of each keyword, but I just don't know why we even bother using them. Does it actually improve performance? Or is it just a "decorative" way to emphasize: Hey - please don't change me?
Compiler optimizations and to tell fellow Developers that they shouldn't be modified.
"Readonly" is an expression of your intention as a programmer, and a safeguard. It makes your life easier (and anyone who has to maintain your code in the future) if a read-only constraint can be enforced. For example, if you have a "readonly" member that is initialized in the constructor, you will never have to check it for a null reference.
"Const" is similar in that its value cannot be changed, but also quite different in that its value is applied at compile time: the compiler substitutes the value wherever the constant is used, so no field needs to be read at runtime. Note however that, in contrast to "readonly", "const" only supports the built-in numeric types, bool, char, string, and enums -- any other reference type can only be a "const" with the value null.
There is one interesting implication of the difference between "readonly" and "const", when writing class libraries. If you use a "const", then any applications that use your library must be re-compiled if you distribute a new version of the library with a different value for the "const". By contrast, if you use a "readonly" member, then applications will pick up a modified value without needing to be re-compiled (as you can imagine, this would simplify your life if you had to distribute a patch or hotfix).
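To make the difference concrete, here is a minimal sketch. The Limits and Order types and their members are hypothetical names invented for this illustration.

```csharp
using System;

// Hypothetical library class contrasting const and static readonly.
public static class Limits
{
    // Compile-time constant: the literal 10 is baked into every assembly
    // that references Limits.MaxRetries, so callers must be recompiled
    // to see a changed value.
    public const int MaxRetries = 10;

    // Runtime value: callers read the field when they run, so a patched
    // library with a different value is picked up without recompiling.
    public static readonly int MaxConnections = 100;
}

// Hypothetical class showing a readonly instance member set in the constructor.
public class Order
{
    private readonly string _id;

    public Order(string id)
    {
        if (id == null) throw new ArgumentNullException("id");
        _id = id; // assignment is only allowed here or in a field initializer
    }

    // _id can never be reassigned after construction, so no null check is needed.
    public string Id { get { return _id; } }
}
```

A common rule of thumb that follows from this: use const only for values that will truly never change (mathematical constants, for instance), and static readonly for values that are fixed today but might be revised in a later release of the library.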
It's not for the user of your program. It is for other programmers. It makes it abundantly clear that this value should not be changed. Pi should never change. It may seem a bit silly in your small example, but when projects span thousands of lines of code and get split into functions it can be different.
Also, that value could get passed by reference under a different name. How does the programmer know that it should not be changed any more? Perhaps he receives it under the name calculationValue and thinks, "I wouldn't mind changing this to 50.0 for my uses." Next thing he knows, he has changed the value of pi for tons of other methods.
There are a few reasons. The first would be if the variable would be accessible by outside code, you wouldn't want someone else changing the definition of PI, also it makes it clear that this variable should never change, which does provide the ability for the compiler to make some optimizations. Then there's also the fact that it can prevent you from making a mistake in your own code and accidentally changing a constant value.
It's not only about the user but also about the developer I would say. Half a year and 20,000 lines of code later you - or anyone else working on the code - might have simply forgotten about this.
Plus, there could be performance improvements when using constants, I would assume.
Two reasons:
Indicating to other developers that this is a value that should never change. It can help to distinguish between values like pi (which will always be 3.1415...), versus values that may some day be based on a configuration, or a user's input, or some other situational condition.
Along the same lines, you can help to prevent other developers doing something stupid like trying to assign a new value to the pi variable, because the compiler will yell at them. In a simple two-line method like this, that's less likely to be an issue, but as your code base grows more complex it can save people a lot of time to be prevented from doing things they're not supposed to do.
Allowing compilers to make optimizations. Both the initial compilation and the JIT compilation can take advantage of information about values that you know are not going to change. In the example you've given, the compiler will generate the equivalent of the following code when you use the const keyword:
decimal circumference = 6.284m * r;
Notice how the CPU doesn't need to multiply 2 * pi every time you call the method, because that's a value which is known at compile time.

Alternative to inlining in C#?

I use the following statement to print the name of the current function (for example, an event handler) to the Output window in Visual Studio (2010):
Debug.Write(MethodBase.GetCurrentMethod().Name);
If I put this inside a utility function such as DisplayFunctionName(), instead of the parent function that calls it, what is displayed each time is "DisplayFunctionName" - no surprises there!
I know there is no inlining in C#, but is there another solution for this situation, short of using 'snippets', so as not to have to duplicate such statements?
You can use the CallerMemberNameAttribute to display the caller's name.
public void WriteDebug(string message, [CallerMemberName] string memberName = "")
{
Debug.Write("[" + memberName + "] " + message);
}
There is also CallerLineNumberAttribute and CallerFilePathAttribute which you can use to include this information for more diagnostics. These attributes are described in detail on MSDN. Combined with [Conditional("DEBUG")] on the method, you have the capability to provide a lot of information during debugging that is completely eliminated in a release build.
I know there is no inlining in C#, but is there another solution for this situation, short of using 'snippets', so as not to have to duplicate such statements?
Note that this really has nothing to do directly with "inlining", as much as with getting the calling member's information for diagnostics. That being said, the JIT definitely performs inlining when your code is run, which is one of the major reasons C# has decent performance.
This requires Visual Studio 2012, as it uses new compiler features in that release to function.
If you are using an older compiler, the other alternative is to use StackTrace to pull out the stack trace information. This has a fairly significant performance impact, however, so it's not something I'd use in a tight loop, at least not in a production environment, though it's still functional for diagnostics during debugging.
string callingMethodName = (new StackTrace()).GetFrame(1).GetMethod().Name;
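The two approaches can be put side by side in a small sketch. CallerNames and Handlers are hypothetical names used only for illustration. One caveat worth labeling as an assumption about JIT behavior: the stack-based variant depends on which frames actually exist at run time, so inlining or tail-call optimization in an optimized build may change what it reports, whereas the attribute-based variant is fixed by the compiler.

```csharp
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical helper class comparing the two techniques.
public static class CallerNames
{
    // The compiler fills in memberName at each call site, so no stack walk happens.
    public static string FromAttribute([CallerMemberName] string memberName = "")
    {
        return memberName;
    }

    // Walks the stack at run time. NoInlining keeps this method's own frame
    // in place so that frame 1 is normally the immediate caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string FromStackTrace()
    {
        return new StackTrace().GetFrame(1).GetMethod().Name;
    }
}

// Hypothetical event-handler-like methods that ask for their own names.
public static class Handlers
{
    public static string ButtonClick()
    {
        return CallerNames.FromAttribute();
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string ButtonClickViaStack()
    {
        return CallerNames.FromStackTrace();
    }
}
```

Handlers.ButtonClick() yields "ButtonClick" regardless of build configuration, which is one reason to prefer the attribute whenever the compiler supports it.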

How will you use the C# 4 dynamic type?

C# 4 will contain a new dynamic keyword that will bring dynamic language features into C#.
How do you plan to use it in your own code, what pattern would you propose ? In which part of your current project will it make your code cleaner or simpler, or enable things you could simply not do (outside of the obvious interop with dynamic languages like IronRuby or IronPython)?
PS: If you don't like this C# 4 addition, please avoid filling the comments with negativity.
Edit : refocussing the question.
The classic usages of dynamic are well known by most of stackoverflow C# users. What I want to know is if you think of specific new C# patterns where dynamic can be usefully leveraged without losing too much of C# spirit.
Wherever old-fashioned reflection is used now and code readability has been impaired. And, as you say, some Interop scenarios (I occasionally work with COM).
That's pretty much it. If dynamic usage can be avoided, it should be avoided. Compile time checking, performance, etc.
A few weeks ago, I remembered this article. When I first read it, I was frankly appalled. But what I hadn't realised is that I didn't know how to even use an operator on some unknown type. I started wondering what the generated code would be for something like this:
dynamic c = 10;
int b = c * c;
Using regular reflection, you can't use defined operators. It generated quite a bit of code, using some stuff from a Microsoft namespace. Let's just say the above code is a lot easier to read :) It's nice that it works, but it was also very slow: about 10,000 times slower than a regular multiplication (doh), and about 100 times slower than an ICalculator interface with a Multiply method.
Edit - generated code, for those interested:
if (<Test>o__SiteContainer0.<>p__Sitea == null)
<Test>o__SiteContainer0.<>p__Sitea =
CallSite<Func<CallSite, object, object, object>>.Create(
new CSharpBinaryOperationBinder(ExpressionType.Multiply,
false, false, new CSharpArgumentInfo[] {
new CSharpArgumentInfo(CSharpArgumentInfoFlags.None, null),
new CSharpArgumentInfo(CSharpArgumentInfoFlags.None, null) }));
b = <Test>o__SiteContainer0.<>p__Site9.Target(
<Test>o__SiteContainer0.<>p__Site9,
<Test>o__SiteContainer0.<>p__Sitea.Target(
<Test>o__SiteContainer0.<>p__Sitea, c, c));
The dynamic keyword is all about simplifying the code required for two scenarios:
C# to COM interop
C# to dynamic language (JavaScript, etc.) interop
While it could be used outside of those scenarios, it probably shouldn't be.
Recently I have blogged about dynamic types in C# 4.0 and among others I mentioned some of its potential uses as well as some of its pitfalls. The article itself is a bit too big to fit in here, but you can read it in full at this address.
As a summary, here are a few useful use cases (except the obvious one of interoping with COM libraries and dynamic languages like IronPython):
reading a random XML or JSON into a dynamic C# object. The .Net framework contains classes and attributes for easily deserializing XML and JSON documents into C# objects, but only if their structure is static. If they are dynamic and you need to discover their fields at runtime, they can only be deserialized into dynamic objects. .Net does not offer this functionality by default, but it can be done by 3rd party tools like jsonfx or DynamicJson
return anonymous types from methods. Anonymous types have their scope constrained to the method where they are defined, but that can be overcome with the help of dynamic. Of course, this is a dangerous thing to do, since you will be exposing objects with a dynamic structure (with no compile time checking), but it might be useful in some cases. For example the following method reads only two columns from a DB table using Linq to SQL and returns the result:
public static List<dynamic> GetEmployees()
{
    List<Employee> source = GenerateEmployeeCollection();
    var queryResult = from employee in source
                      where employee.Age > 20
                      select new { employee.FirstName, employee.Age };
    return queryResult.ToList<dynamic>();
}
create REST WCF services that returns dynamic data. That might be useful in the following scenario. Consider that you have a web method that returns user related data. However, your service exposes quite a lot of info about users and it will not be efficient to just return all of them all of the time. It would be better if you would be able to allow consumers to specify the fields that they actually need, like with the following URL
http://api.example.com/users?userId=xxxx&fields=firstName,lastName,age
The problem then comes from the fact that WCF will only return to clients responses made out of serialized objects. If the objects are static then there would be no way to return dynamic responses so dynamic types need to be used. There is however one last problem in here and that is that by default dynamic types are not serializable. In the article there is a code sample that shows how to overcome this (again, I am not posting it here because of its size).
In the end, you might notice that two of the use cases I mentioned require some workarounds or 3rd party tools. This makes me think that while the .Net team has added a very cool feature to the framework, they might have only added it with COM and dynamic languages interop in mind. That would be a shame because dynamic languages have some strong advantages and providing them on a platform that combines them with the strengths of strong typed languages would probably put .Net and C# ahead of the other development platforms.
Miguel de Icaza presented a very cool use case on his blog, here (source included):
dynamic d = new PInvoke ("libc");
d.printf ("I have been clicked %d times", times);
If it is possible to do this in a safe and reliable way, that would be awesome for native code interop.
This will also allow us to avoid the visitor pattern in certain cases, as multiple dispatch will now be possible:
public class MySpecialFunctions
{
public void Execute(int x) {...}
public void Execute(string x) {...}
public void Execute(long x) {...}
}
dynamic x = getx();
var myFunc = new MySpecialFunctions();
myFunc.Execute(x);
...will call the best method match at runtime, instead of being worked out at compile time
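A runnable version of that idea might look like the sketch below. The Execute bodies and the DispatchDemo wrapper are assumptions added for illustration, since the original leaves them elided.

```csharp
using System;

public class MySpecialFunctions
{
    // Each overload reports which one was chosen (bodies are illustrative only).
    public string Execute(int x) { return "int"; }
    public string Execute(string x) { return "string"; }
    public string Execute(long x) { return "long"; }
}

// Hypothetical wrapper that receives a value whose static type is only object.
public static class DispatchDemo
{
    public static string Describe(object value)
    {
        var myFunc = new MySpecialFunctions();

        // Assigning to dynamic defers overload resolution to run time,
        // where the binder uses the actual type of the value.
        dynamic x = value;
        return myFunc.Execute(x);
    }
}
```

With this sketch, DispatchDemo.Describe(42) selects the int overload, Describe("hi") the string overload, and Describe(42L) the long overload. A value with no applicable overload, such as a double, fails at run time with a RuntimeBinderException rather than at compile time, which is the trade-off compared with the visitor pattern.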
I will use it to simplify my code which deals with COM/Interop where before I had to specify the member to invoke, its parameters etc. (basically where the compiler didn't know about the existence of a function and I needed to describe it at compile time). With dynamic this gets less cumbersome and the code gets leaner.

Overhead for not simplifying names in C#

I am working with a C# Windows Forms application (I doubt the project type affects the answer, but there it is anyway) and everything is going well. The code works fine. However, Visual Studio likes to tell me that 'Name can be simplified' when I do things like using this in some methods where the this may not be needed. Here is an example:
public class xyz
{
    string startdate;
    string enddate;
    decimal elapsedtime; // assigned below; the declaration was missing from the snippet

    private void calculateElapsedTime()
    {
        var endDate = Convert.ToDateTime(this.enddate).ToUniversalTime();
        var startDate = Convert.ToDateTime(this.startdate).ToUniversalTime();
        elapsedtime = (decimal)(endDate - startDate).TotalSeconds;
    }
}
The names that can be simplified are this.startdate and this.enddate
The code runs fine without the this keyword but personally I like using the 'this' as for me it makes it more clear what is being done.
I tried running tests on memory usage and time if I go through and simplify all places where VS says I should and I ran the same test without simplifying names and got the same results.
So this has led me to the question: is there any actual performance hit for not simplifying names, or is the hit so small that I just don't see the difference because my program isn't big enough, or some third option?
EDIT
Since this is starting to get into a discussion on naming conventions, I figured I would add this. The above is just an example of code that has a name that can be simplified, not the actual way I write code. The 'Name can be simplified' message would also show up if you use namespaceX.class.functionname in code that is already inside namespaceX.
Is there any actual performance hit for not simplifying names or is the hit so small that I just don't see the difference because my program isn't big enough or some third option?
Not in the slightest. This is just a style choice that has no impact on the compiled code.
I would pick a different name for your local variables, though. Having the same name with just different casing on one letter makes it hard to distinguish between the local variable and the member name.
There will not be a difference in the performance or memory footprint of the application. Your C# code is translated into IL by the compiler, and that is the code that is executed - since the compiler understands both the version with this and without, the resulting IL will be identical in both cases, and as such so will the performance of the program.
Whether or not you like qualifying class member access with this, it is avoidable in C#, and one of the most important premises in programming is to keep things as simple as possible. While it's just a coding style issue, it's still worth understanding that this in C# is needed when you have to disambiguate between a local variable and a class member.
There's no performance hit; it's just a matter of getting used to C# coding style. For example, you use camel-casing on method identifiers while C# coding style says that they should be pascal-cased.
Like any other convention and guideline, it's just there to make your code more predictable to others rather than to yourself (and once you get used to these guidelines, they're predictable for you too ;)).
BTW - the reason not to use this is that it makes you think you don't need a naming convention for member variables.
Many people use a convention like
string _enddate
string enddate_
string m_enddate
that way you can tell by looking at the code that this is a member variable.
Using this.enddate also says it is a member variable, but the same code compiles if you just say enddate. Now you have code that compiles, but you cannot tell at a glance if it's a member or not.
