I am reading the book "CLR via C#", and the Generics chapter says:
Source code protection
The developer using a generic algorithm doesn't need to have access to the algorithm's source code. With C++ templates or Java's generics, however, the algorithm's source code must be available to the developer who is using the algorithm.
Can anyone explain what exactly is meant by this?
Generic classes are distributed in compiled form, unlike C++, where templates need to be distributed as full source code. So you do not need to distribute the C# source code of a library that contains generic classes.
This does not prevent the recipient of your class from disassembling it, though, as it is compiled to IL, which can be decompiled rather easily. To really protect the code, additional measures, such as obfuscation, are required.
Behind the scenes: this distribution in compiled form is the reason why C# generics and C++ templates also differ in the way they need to be written. A C# generic class and its methods need to be fully defined at the time the generic itself is compiled. Any error in the definition of the generic class or its methods, or any operation on a type parameter that cannot be verified at compile time (for example, through a constraint), directly produces a compile error. In C++ the template is only compiled at the point of use, and only the methods actually used are compiled. If a template definition contains an undefined operation on a template parameter, you will only see the error when that function is actually instantiated and used.
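To illustrate the definition-time checking described above, here is a minimal sketch. The names Algorithms and Max are hypothetical and not taken from the book. The constraint on T is what allows the compiler to verify the CompareTo call when the generic method itself is compiled, before any caller exists.

```csharp
using System;

public static class Algorithms
{
    // The constraint lets the compiler verify CompareTo when this method is compiled.
    // Without "where T : IComparable<T>" the call below would be a compile error,
    // even if no code ever called Max.
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }
}

public static class Program
{
    public static void Main()
    {
        // Callers only need the compiled assembly, not the source of Max.
        Console.WriteLine(Algorithms.Max(3, 7));
        Console.WriteLine(Algorithms.Max(2.5, 1.5));
    }
}
```

A comparable C++ template would compile without any constraint and only report a missing comparison operator once it is instantiated with a type that lacks one.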
For C#, XAML is transpiled to .cs (*.g.cs) files and needs no IDL files.
Similarly, in C++, why can't XAML be transpiled to .cpp (*.g.cpp) files without needing any IDL files at all?
I don't understand.
There's a fair bit of confusion in the question as to how the individual pieces fit together. The main driver for the translation is the IDL file. Irrespective of whether it is authored by a developer or synthesized by the IDE, it is the IDL that produces WINMD (Windows Metadata) files describing the interfaces and runtime classes in a language-agnostic fashion.
The WINMD files are used by all tooling that needs to look up types, query for members (such as properties, events, delegates), and produce application packages.
XAML, on the other hand, isn't part of the compilation process at all. While some of its contents are verified at compile time, it usually gets translated into a compact binary representation (XBF) that's loaded and parsed at runtime to instantiate types.
The question as to why IDL files are required with C++/WinRT (and not with C# or C++/CX) is easily answered: it's simply not possible to derive enough information from C++ class definitions to unambiguously deduce the required metadata.
As an easy example, consider properties. While both C# and C++/CX have dedicated language constructs to describe properties, this is not the case in C++. C++/WinRT projects properties to member functions that take zero or one argument (for getters and setters, respectively). A tool that automatically deduced metadata from a C++ class definition would have to disambiguate between a property and a method. There are other challenges, too, and while Kenny Kerr has repeatedly voiced the desire to get better IDE support for C++/WinRT, the Visual Studio team doesn't seem to care much (see this comment, for example).
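The property ambiguity can be seen from the C# side with a small sketch. WithProperty and WithMethods are hypothetical types. The second one only imitates the shape that C++/WinRT gives to a projected property (a zero-argument getter and a one-argument setter) and is not generated by any tool.

```csharp
using System;
using System.Reflection;

// A dedicated language construct: the metadata records "Title" as a property.
public class WithProperty
{
    public string Title { get; set; } = "";
}

// The same idea expressed as two overloaded methods, similar in shape to what
// C++/WinRT uses. Nothing here marks the pair as a property.
public class WithMethods
{
    private string _title = "";
    public string Title() => _title;
    public void Title(string value) => _title = value;
}

public static class Program
{
    public static int CountProperties(Type t) =>
        t.GetProperties(BindingFlags.Public | BindingFlags.Instance).Length;

    public static void Main()
    {
        Console.WriteLine(CountProperties(typeof(WithProperty))); // 1
        Console.WriteLine(CountProperties(typeof(WithMethods)));  // 0
    }
}
```

A tool looking only at the second class cannot tell whether the author meant a property or two unrelated methods, which is the information the IDL file supplies.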
For the foreseeable future you should be prepared to author IDL files if you choose to use C++/WinRT.
This question is related to "How to detect static code dependencies in C# code in the presence of constants?"
If type X depends on a constant defined in type Y, this dependency is not captured in the binary code, because the constant is inlined. Yet the dependency is there: try compiling X without Y and the compilation fails. So it is a compile-time dependency, but not a run-time one.
I need to be able to discover such dependencies and scanning all the source code is prohibitively expensive. However, I have full control over the build and if there is a way to instruct the C# compiler not to inline constants - that is good enough for me.
Is there a way to compile C# code without inlining the constants?
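The inlining described above can be sketched as follows. X and Y match the names used in the question, while Limit, SoftLimit and IsCompileTimeConstant are hypothetical. The static readonly field is shown for contrast only and is not a fix, since it requires a source change.

```csharp
using System;
using System.Reflection;

public static class Y
{
    public const int Limit = 42;               // value is copied into every use site
    public static readonly int SoftLimit = 42; // value is read from the field at run time
}

public static class X
{
    // Compiles to a load of the literal 42; the IL carries no reference to Y.
    public static int UseConst() => Y.Limit;

    // Compiles to a static field load, so the IL does reference Y.
    public static int UseReadonly() => Y.SoftLimit;
}

public static class Program
{
    // Reflection marks const fields as literals, which is why they can be inlined.
    public static bool IsCompileTimeConstant(Type type, string fieldName)
    {
        var field = type.GetField(fieldName, BindingFlags.Public | BindingFlags.Static);
        return field != null && field.IsLiteral;
    }

    public static void Main()
    {
        Console.WriteLine(IsCompileTimeConstant(typeof(Y), "Limit"));     // True
        Console.WriteLine(IsCompileTimeConstant(typeof(Y), "SoftLimit")); // False
    }
}
```

A binary inspector such as Mono.Cecil sees the dependency of UseReadonly on Y, but has nothing to go on for UseConst.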
EDIT 1
I would like to respond to all the comments so far:
I cannot modify the source code. This is not a toy project. I am analysing a big code base - millions of lines of C# code.
I am already using the Roslyn API to examine the source code. However, I only do it when binary code inspection (I use Mono.Cecil) of a method indicates the use of dynamic types. Analysing methods that use dynamic with Roslyn is useful, because not all dynamic usages are as bad as reflection. However, there is no general way to tell from the binary code that a method uses a constant. Using a Roslyn analyser for that takes a really long time because of the code base size. Hence my "prohibitively expensive" statement.
I have an NDepend license and I used it at first. However, it only processes binary code. It does NOT see any dependencies introduced through constants. My analysis is better, because I drill down to dynamic users and employ Roslyn API to harvest as much as I can from such methods. NDepend does nothing of the kind. Moreover, it has bugs. For example, the latest version does not inspect generic method constraints and thus does not recognise any dependencies introduced by them.
Microsoft.CSharp is required to use the dynamic feature.
I understand there are binders, evaluators and helpers in the assembly.
But why does it have to be language-specific?
Why Microsoft.CSharp and not Microsoft.Dynamic or System.Dynamic?
Please, explain.
Let's say we have d.x where d is dynamic.
C# compiler
1. applies C# language rules
2. gets "property or field access"
3. emits (figuratively) Binder.GetPropertyOrField(d, "x")
Now, being asked to reference Microsoft.CSharp may make one think that a language-agnostic binder can't handle this case, and that something C#-only made its way through compilation and requires a special library.
Compiler had a bad day?
To your first question, it is language-specific because it needs to be.
In C# you call a method with too many arguments and you get an error. In JavaScript, the extra arguments are simply ignored. In C# you access a member that doesn't exist and get an error, while in JavaScript you get undefined. Even if you discovered all these varying feature sets and put them all into System.Core, the next language fad of the month is sure to have some super neat feature that it wouldn't support. It's better to be flexible.
There is common code in the core .NET libraries, under the System.Dynamic and System.Runtime.CompilerServices namespaces. It just can't all be common.
And as for your second question, the need for the "special C# library" could of course be removed by emitting these language-specific behaviors inline, but why? That would needlessly bloat your IL code size. It is the same reason you do not write your own Int32.Parse every time you need to read in a number.
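The split between common and language-specific code can be seen in a hand-written call site for d.x. This is a rough sketch of the kind of code the compiler generates, not its exact output. Point and GetX are hypothetical names. CallSite and CallSiteBinder come from System.Runtime.CompilerServices (the common part), while Binder.GetMember comes from Microsoft.CSharp.RuntimeBinder and carries the C# rules.

```csharp
using System;
using System.Runtime.CompilerServices;
using Microsoft.CSharp.RuntimeBinder;

public class Point
{
    public int x = 5;
}

public static class Program
{
    public static object GetX(object d)
    {
        // Language-specific part: a binder that resolves "x" using C# semantics,
        // evaluated as if the access were written inside Program.
        CallSiteBinder binder = Binder.GetMember(
            CSharpBinderFlags.None,
            "x",
            typeof(Program),
            new[] { CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null) });

        // Language-agnostic part: the call site that caches and dispatches.
        // The compiler stores the site in a static field; it is rebuilt here for brevity.
        var site = CallSite<Func<CallSite, object, object>>.Create(binder);
        return site.Target(site, d);
    }

    public static void Main()
    {
        Console.WriteLine(GetX(new Point())); // 5
    }
}
```

Another language could plug its own CallSiteBinder subclass into the same CallSite machinery and get different lookup rules.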
One reason I can think of: Visual Basic .NET has had late binding from day one, primarily oriented around how it interoperates with COM IDispatch interfaces. So if they had wanted a language-agnostic binder, they would have had to adopt the Visual Basic rules, which include that member lookup only works with Public members.
Apparently, the C# designers didn't want to be so strict. You can call this class' DoStuff method from C# via a dynamic reference:
    using System;

    public class Class1
    {
        internal void DoStuff()
        {
            Console.WriteLine("Hello");
        }
    }
Whereas attempting to call the same via Visual Basic's Object results in a MissingMemberException at runtime.
So because the C# designers weren't the first to arrive at the late-binding party, they could either follow Visual Basic's lead or they could say "each language will have its own rules" - they went with the latter.
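A minimal sketch of the C# side of this behaviour. It uses a variant of Class1 whose DoStuff returns a string so the result can be checked, and adds a hypothetical private method Secret for contrast. It assumes the caller lives in the same assembly as Class1, which is what makes the internal member accessible.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public class Class1
{
    internal string DoStuff() => "Hello";
    private string Secret() => "hidden";
}

public static class Program
{
    // The C# binder applies normal C# accessibility rules from the calling
    // context, so an internal member is reachable from the same assembly.
    public static string CallInternal()
    {
        dynamic d = new Class1();
        return d.DoStuff();
    }

    // A private member is not accessible from Program, so the binder refuses
    // the call at run time, just as the compiler would at compile time.
    public static bool PrivateCallFails()
    {
        dynamic d = new Class1();
        try
        {
            d.Secret();
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CallInternal());
        Console.WriteLine(PrivateCallFails());
    }
}
```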
Is there any built-in way to use Roslyn to perform the same compile-time transformations that the C# compiler does, e.g. for transforming iterators, initializers, lambdas, LINQ, etc. into basic C# code?
The Roslyn compiler API is designed to (in addition to translating source code to IL) let you build source code analysis and transformations tools.
However, lambdas and iterators do not have translations that can always be specified using source. They are modeled using the internal bound node abstraction that includes additional compiler specific rules that can only be represented using IL.
It would be possible to translate LINQ to source in C#, since it is specified as a source code translation (whether the compiler actually does it that way or not). Yet there is no compiler API that does this specifically. If there were, it would probably show up as a services layer API and not a compiler API.
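As an illustration of why LINQ is the easy case, here is a minimal sketch of a query expression next to the method-call form that the language specification defines it to mean. The translation is written by hand here, not produced by Roslyn, and the names QuerySyntax and MethodSyntax are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Query expression as written by the developer.
    public static IEnumerable<int> QuerySyntax(IEnumerable<int> numbers) =>
        from n in numbers
        where n % 2 == 0
        select n * n;

    // The equivalent method calls that the query expression is specified to mean.
    public static IEnumerable<int> MethodSyntax(IEnumerable<int> numbers) =>
        numbers.Where(n => n % 2 == 0).Select(n => n * n);

    public static void Main()
    {
        var numbers = new[] { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(string.Join(", ", QuerySyntax(numbers)));
        Console.WriteLine(string.Join(", ", MethodSyntax(numbers)));
    }
}
```

Both forms are ordinary C#, which is what distinguishes LINQ from lambdas and iterators, whose lowered forms rely on compiler-generated constructs.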
AFAIK, no, there is no such thing exposed in Roslyn. But the compiler has to do these transformations somehow, so it's possible you will be able to do this by accessing some internal method.
Of course, you could use Roslyn to make these transformations yourself, but that's not what you're asking.
If you want to use a COM type in your C# code, the process is straightforward, right? You just need to use the type library importer and that's fine, but what if you don't have a type library and you can't take a look at the IDL file? You just have the COM DLL server.
As an example, try using the IActiveDesktop interface.
What's the approach used to solve this kind of problem?
There are two kinds of COM interfaces. The ones you are familiar with are those that restrict themselves to a subset of the COM spec known as "OLE Automation". Also known as ActiveX before that term became associated with security disasters.
Automation compatible interfaces are easy to use from just about any language. They typically inherit from IDispatch, allowing them to be used from scripting languages. And limit themselves to using only automation compatible types for their method arguments. The simple stuff, comparable to the .NET value types, BSTR for strings, SAFEARRAY for arrays, VARIANT for untyped arguments, quite similar to .NET's System.Object.
Another feature they support well is type libraries, the equivalent of .NET metadata. Used by a compiler to know how to call the interface methods. The IDE uses a type library to automatically generate the interop library so you can directly create the wrapper class and call the methods from .NET code.
Well, that's the good news. The bad news is that there are lots of COM interfaces around that do not use the Automation restrictions. They typically inherit from IUnknown and use function arguments that don't marshal well. Like structures. One very large and visible component in Windows that is like this is the shell. Windows Explorer.
That's where IActiveDesktop fits in as well: it is a shell interface and inherits from IUnknown. It is declared in the ShlObj.h SDK header file; there is not even an IDL file for it, and consequently no way to get a type library with its definition. It uses incompatible argument types, like LPCWSTR (a raw pointer to a string) instead of BSTR, and structure pointers like LPCCOMPONENT and LPWALLPAPEROPT. The CLR interop support is powerless to marshal that properly.
Using the interface in C# is technically not impossible, but you have to redeclare the interface. Very carefully, getting it wrong is very easy to do. The fact that source code that already does this is very hard to find is a hint how difficult it is. This squarely falls in the 'not impossible, but what sane programmer wants to maintain code like this' category. The shell is the domain of unmanaged C++ code. And a crew of hardy programmers, because debugging shell extensions is quite painful.
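For a sense of what "redeclaring the interface" involves, here is a sketch of the general pattern only. IHypotheticalShellInterface, its methods, the HypotheticalOptions struct and the all-zero GUID are placeholders invented for illustration. This is not the IActiveDesktop declaration, whose IID, method order and structure layouts would have to be copied exactly from ShlObj.h.

```csharp
using System;
using System.Runtime.InteropServices;

// Placeholder struct standing in for the kind of structure a shell interface passes by pointer.
[StructLayout(LayoutKind.Sequential)]
public struct HypotheticalOptions
{
    public uint Size;   // many shell structs start with their own size in bytes
    public uint Style;
}

[ComImport]
[Guid("00000000-0000-0000-0000-000000000000")] // placeholder, not a real IID
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IHypotheticalShellInterface
{
    // Methods must appear in exactly the same order as in the native vtable.
    [PreserveSig]
    int SetPath([MarshalAs(UnmanagedType.LPWStr)] string path, uint reserved);

    [PreserveSig]
    int GetOptions(ref HypotheticalOptions options, uint reserved);
}

public static class Program
{
    public static void Main()
    {
        // Only inspects the declaration; no COM object is created here.
        Console.WriteLine(typeof(IHypotheticalShellInterface).IsImport);
        Console.WriteLine(Marshal.SizeOf(typeof(HypotheticalOptions)));
    }
}
```

The attributes tell the CLR how to marshal each argument by hand, which is the information a type library would otherwise provide. A wrong method order or struct layout compiles cleanly and fails only at run time, which is why this is so easy to get wrong.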