Disable generics in c# for Xamarin - c#

I have a C# app that gets ahead of time compiled to the native iOS code using monotouch (Xamarin)
Some of the libraries I link in use generics. However, it turns out that this method of compiling causes significant code bloat because it uses the C++ style of template code generation generating functions for List<int>, List<string> etc.
What I want is the Java style of generics where generics are used for compile time checking but at runtime the code only contains functions for List and not for each of the templated types.
Note: This is not an issue with using C# in the .Net CLR, as explained here. The issue arises because code is compiled AOT to the native binary instead of intermediate language. Moreover runtime type checking for generic methods is fairly useless since the binary is native.
Question: How do I disable generics, i.e. replace all occurrences of List<T> with List, during compilation? Is this even possible?

This is not possible.
On Java it's possible because they don't have value types, only classes (you can emulate this behavior by not using value types yourself, only use List<object> (or an object subclass), in which case the AOT compiler will only generate one instantiation of List).
You're also not entirely correct saying that it's not an issue with the .NET CLR; the difference between the .NET CLR and Xamarin's AOT compiler is that the AOT compiler can't wait until execution time to determine if a particular instantiation is needed or not (because iOS doesn't allow executable code to generated on the device), it needs to make sure every possible instantiation is available. If your app on the .NET CLR happened to need every possible generic instantiation at runtime, then you'd have a similar problem (only it would show up as runtime memory usage, not executable size, and on a desktop that's usually not a problem anyway).
The supported way of solving your problem is to enable the managed linker for all assemblies (in the project's iOS Build options, set "Linker behavior" to "All assemblies"). This will remove all the managed code from your app you're not using, which will in most cases significantly reduce the app size.
You can find more information about the managed linker here: http://developer.xamarin.com/guides/ios/advanced_topics/linker/
If your app is still too big, please file a bug (http://bugzilla.xamarin.com) attaching your project, and we'll have a look and see if we can improve the AOT compiler somehow (for instance we already optimize value types with the same size, so List<int> and List<uint> generate only one instantiation).

Related

Do C# optimizers perform copy elision?

I'm looking to optimize some C# code and I'm curious if my optimizer is going to perform any copy elision, and if so which kinds. I haven't been able to find any information on this by googling. If not I may want to switch my methods to take struct arguments by reference. The environment I care about is Unity 3D, which is odd as it uses either the Mono 2.0 Runtime or Unity's IL2CPP transpiler (which I should probably ask them about directly as it's closed-source.) But it would be interesting to know for Microsoft's optimizer as well, and if this type of optimization is generally allowed by the standard.
Side note: If this is not supported by the optimizer, it would be awfully nice if I could get the equivalent of the C++ const ref, but it appears this doesn't exist.
I can speak for IL2CPP, and say that it does not do anything to take struct arguments by reference. Even without access to the IL2CPP source code, you can see this by inspecting the generated C++ code.
Note that a C# struct is represented by a C++ struct, and that C++ struct is passed by value. We've discussed the possibility of using const references in this case, but we've not implemented yet (and we may not ever).

What is the difference between runtime and compile-time? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
So what is a runtime? Is it a virtual machine that executes half-compiled code that cannot run on a specific processor. If so, then what's a virtual machine? Is it another software that further translates the half-compiled code to machine specific code? So what if we are talking about one of those languages that don't compile to intermediate code but rather translate/compile directly to machine code. What's a runtime in that situation? is it the hardware (CPU and RAM)?
Also, what's the difference between compile-time and runtime? Are they stages of a software lifecycle. I mean a program is originally a bunch of text files, right? So you compile or translate those to a form of data that then either can be loaded to memory and executed by the processor or if it's a "managed" language, then it would need further compilation before it can run on hardware.
What exactly is a managed language?
Lastly, is there such a thing as debug-time and what is it?
I'm in my first term studying computer science, and it really confuses me how illogically things are taught. "Information" is being shoved down my throat, but whenever I try to make sense out of everything by organizing everything related into a single system of well defined components and relations, I get stuck.
Thanks in advance,
Garrett
The kind of code suitable for reasoning by human beings (let's call it "source code") needs to pass through several stages of translation before it can be physically executed by the underlying hardware (such as CPU or GPU):
Source code.
[Optionally] intermediate code (such as .NET MSIL or Java bytecode).
Machine code conformant to the target instruction set architecture.
The microcode that actually flips the logical gates in silicon.
These translations can be done in various phases of the program's "lifecycle". For example, a particular programming language or tool might choose to translate from 1 to 2 when the developer "builds" the program and translate from 2 to 3 when the user "runs" it (which is typically done by a piece of software called "virtual machine"1 that needs to be pre-installed on user's computer). This scenario is typical for "managed" languages such as C# and Java.
Or it could translate from 1 to 3 directly at build time, as common for "native" languages such as C and C++.
The translation between 3 and 4 is almost always done by the underlying hardware. It's technically a part of the "run time" but is typically abstracted away and largely invisible to the developer.
The term "compile time" typically denotes the translation from 1 to 2 (or 3). There are certain checks that can be done at compile time before the program is actually run, such as making sure the types of arguments passed to a method match the declared types of method parameters (assuming the language is "statically typed"). The earlier the error is caught, the easier it is to fix, but this has to be balanced with the flexibility, which is why some "scripting" languages lack comprehensive compile-time checks.
The term "run-time" typically denotes the translation from 2 (or 3) all the way down to 4. It is even possible to translate directly from 1 at run-time, as done by so called "interpreted languages".
There are certain kinds of problems that can't be caught at compile time, and you'll have to use appropriate debugging techniques (such debuggers, logging, profilers etc...) to identify them at run-time. The typical example of a run-time error is when you try to access an element of a collection that is not there, which could then manifest at run-time as an exception and is a consequence of the flow of execution too complex for the compiler to "predict" at compile time.
The "debug time" is simply a run-time while the debugger is attached to the running program (or you are monitoring the debug log etc.).
1 Don't confuse this with virtual machines that are designed to run native code, such as VMware or Oracle VirtualBox.
Compile-time and run-time usually refers to when checks occur or when errors can happen. For example, in a statically typed language like C# the static type checks are made at compile time. That means that you cannot compile the application if you for example try to assign a string to an int-variable. Run-time on the other hand refers to the time when the code is actually executed. For example exceptions are always thrown at run-time.
As for virtual machines and such; C# is a language that compiles into the Common Intermediate Language (CIL, or IL). The result is a code which is the same regardless of which .NET language you use (C# and VB.NET both produce IL). The .NET Framework then executes this language at run-time using Just-in-time compilation. So yeah, you can see the .NET Framework as a virtual machine that runs a special sublanguage against the target machine code.
As for debug-time, I don’t think there is such a thing, as you are still running the program when debugging. So if anything, debug-time would be run-time with an attached debugger. But you wouldn’t use a term like that.
Compile-time - The period at which a compiler will attempt to compile some code. Example: "The compiler found 3 type errors at compile-time which prevented the program from being compiled."
Runtime - The period during which a program is executing. Example: "We did not spot the error until runtime because it was a logic error."
Run-time and virtual machines are two separate ideas - your first question doesn't make sense to me.
Virtual Machines are indeed software programs that translate "object" [Java, C#, etc.] code into byte-code that can be run on a machine. If a language uses a virtual machine, it also often uses Just In Time compiling - which means that compile-time and run-time are in essence happening at the same time.
Conversely, compiled languages like C, C++ are usually compiled into byte-code before being executed on a machine and therefore compile-time and run-time are completely separate.
Generally "managed" languages have garbage collection (you don't directly manipulate memory with allocations and de-allocations [Java and C# are both examples]) and run on some type of virtual machine.

If C# is not interpreted, then why is a VM needed?

I have read a lot of controversy about C#, where some say it's interpreted, some say it's not. I do know it's compiled into the MSIL and then JITed when run, depending on the processor etc...but isn't it still interpreted in the way it needs a VM (.NET) to run?
The VM is an abstraction of a microprocessor. It is just a definition and does not really exist. I.e. you cannot run code on the VM; however, you can generate IL code for it. The advantage is that language compilers do not need to know details about different kinds of real processors. Since different .NET languages like C# or VB (and many more) produce IL, they are compatible on this level. This, together with other conventions like a common type system, allows you to use a DLL generated from VB code in a C# program, for instance.
The IL is compiled just in time on Windows when you run a .NET application and can also be compiled ahead of time in Mono. In both cases, native machine code for the actual processor is generated. This fully compiled code is executed on the REAL microprocessor!
A different aspect is the number of compliers you have to write. If you have n languages and you want to run them on m processor architectures, you need n language-to-IL compliers + m IL-to-native-code compliers. Without this intermediate abstraction layer you would need to have n × m compliers and this can be a much higher number than just n + m!
The short answer is no, the requirement for the VM does not indicate that it's interpreted.
The VM contains the JIT compiler that translates IL to native machine code. It also contains the .NET class library, upon which C# programs depend. It also contains some other mechanisms involved in dynamic linking and such (definitely built on top of Windows DLL mechanism, but .NET has features above and beyond what Windows provides on its own, which are implemented in the VM).
You are probably refering to CLR (an implementation of the specification CLI).
The CLI defines a specific type system, semantics of all the operations on these types, a memory model, and run-time metadata.
In order to provide all of the above, some instrumentation of the generated code must happen. One simple example is to ensure that larger-than-32-bit numerals are supported and that floating-point operations behave as per specification on every architecture.
In addition, to ensure correct behaviour of memory allocation, correct management of metadata, static initialisation, generic type instantiation and similar some additional processes must be present during the execution of your CLR code. This is all taken care of by the VM and is not readily provided by the CPU.
A quote from Wikipedia, for example:
The CLR provides additional services including memory management, type safety and exception handling.

Help me understand Resharper background compilation

So Jeff Atwood rightly complained about Visual Studio not performing background compilation see: http://www.codinghorror.com/blog/2007/05/c-and-the-compilation-tax.html
The solution from most sources seems to be Reshaper which will incrementally perform background compilation as you write. This leads to their great realtime re-factoring tips and error detection.
But what I don't understand is with R# continually compiling my code, why does it take so long when executing a compilation via VS (i.e. Ctrl + Shift + B or similar). What I mean by this is, if R# has already compiled my code then why would I need a recompilation?
My assumption is of course that R# is not overriding the assemblies in my bin directories but instead holding the compilation results in memory. In which case, is it possible to tell R# to simply override my assemblies when compilation is successful?
I don't know about "rightly complained" - that's an opinion I happen to disagree with:)
However, the VB.NET (and probably Resharper c#) background compilers do not actually compile full assemblies - they cannot! If you think about it, the natural state of your code while you are working is not compilable! Almost every keystroke puts your code in an invalid state. Think of this line:
var x = new Something();
As you type this, from the key "v" to the key ")", your code is "wrong". Or what if you are referencing a method you haven't defined yet? And if this code is in an assembly that another assembly requires, how would you compile that second assembly at all, background or not?
The background compilers get around this by compiling small chunks of your code into multiple transient "assemblies" that are actually just metadata holders - really, they don't care about the actual effects of the code as much as the symbols defined, used, etc. When you finally hit build, the actual full assemblies still need to be built.
So no, I don't believe it's possible because they're not built to do actual full compilation - they are built to check your code and interpret symbols on the fly.
Reshaper which will incrementally perform background compilation as you write
It doesn't, it just parses the source code. The exact same thing Visual Studio already does if you don't have Resharper, that's how it implements IntelliSense, its own refactoring features and commands like GoTo Definition and Find All References. Visual Studio also parses in the background, updating its data while you type. Resharper just implements more bells and whistles with that parsing data.
Going from parsing the code to actually generating the assembly is a pretty major step. The internal format of an assembly is too convoluted to allow this to happen in the background without affecting the responsiveness of the machine.
And the C# compiler is still a large chunk of unmanaged C++ code that is independent from the IDE. An inevitable consequence of having to have the compiler first. It is however a stated goal for the next version of C# to provide compile-on-demand services. Getting true background compilation is a possibility.
I don't really have an answer but I just wanted to say that I have been using Eclipse and Java for 4 months now and I love the automatic compilation. I have a very large java code base and compilation happens constantly as I save code changes. When I hit Run everything is ready to go! It's just awesome. It also deploys to the local web server instance (Tomcat in my case) automatically as I make code changes. All this is setup by default in Eclipse.
I hope Microsoft does something similar with .net in the near future.

What makes the Java compiler so fast?

I was wondering about what makes the primary Java compiler (javac by sun) so fast at compilation?
..as well as the C# .NET compiler from Microsoft.
I am comparing them with C++ compilers (such as G++), so maybe my question should have been, what makes C++ compilers so slow :)
That question was nicely answered in this one: Why does C++ compilation take so long? (as jalf pointed out in the comments section)
Basically it's the missing modules concept of C++, and the aggressive optimization done by the compiler.
I think the most difficult part is not the need to compile the header files (unless they are really big, but you can use precompiled headers in that case). The worst part is always the fact that C++'s grammar is too wildly context-sensitive. Despite the fact I like C++, I feel sorry for anybody who has to write a C++ parser.
There are a couple of things that make the C++ compiler slower than those of Java/C#. The grammar is much more complex, generic programming support is much more powerfull in C++, but at the same time it is more expensive to compile. Inclusion of files work in a different way than importing modules.
Inclussion of header files
First, whenever you include a file in C++ the contents of the file (.h usually) are injected in the current compilation unit (include guards avoid reinjecting the same header twice), and this is transitive. That is, if you include header a.h, that in turns includes b.h, your compilation unit will include all code in a.h and all code in b.h.
Java (or C#, I will talk about Java, but they are similar in this) don't have include files, they depend on the binaries from the compilation of the used classes. This means that whenever you compile a.java that uses an object B defined in b.java, it just checks the binary b.class, it does not need to go deeper to check the dependencies of B, so it can cut the process earlier (with just one level of checking).
At the same time, including files only includes the language definitions, and processing it requires time. When the Java/C# compiler reads a binary it has the same information but already processed by the compilation step that generated it.
So at the end, in C/C++ more files are included and at the same time processing of those includes is more expensive than processing of binary modules.
Templates
Templates are special in their own way. They can be precompiled, but they are usually not (for a good set of reasons). This means that in all compilation units that use std::vector the whole set of vector methods used (unused template methods don't get compiled) is processed and the binary code generated by the compiler. At a later step, during linking, redundant definitions of the same method will get dropped, but during compilation they must be processed.
Support in Java for generics is more limited in many ways. At the end, for example, there is only one Vector class binary, and whenever the compiler sees Vector in java what it does is generating type checking code before delegating to the real Vector implementation (that stores plain Object) and that is not generic. The compiler does provide the type warranties, but does not compile Vector for each type.
In C# it is, once again, different. C# support for generics is more complex than that of Java, and at the end generic classes are different than plain classes but anyway they get compiled only once as the binary format has all required information.
Because they do something quite different, C++ compiler produces optimized native code whereas C#, VB .Net and Java compiler produce an intermidiate language than when you first execute the application is turned into native code, and that is why you get slow loading of application in Java etc. the first time you execute the application.
The C++ compiler has to do the full optimization where the JITed languages optimize when you execute the application.
Someone would argue that you have to measure C++ compile time = Java compile time + time for JITing the first time you load the application if you want to be correct, but i don't think that would be right and fair because you are comparing native languages to JITed, or else oranges to apples.
The C++ compiler must repeatedly compile all the header files and there are lots of them, so this is one thing that slows it down.
One of the more time consuming tasks when compiling is code optimization.
Javac does very little optimization on the code when doing the compilation. Optimization is instead done by the JVM when running the application.
A C/C++ needs to be optimized when compiling since optimization of compiled machine code is hard.
You got it right in your last sentence: it's not java or C# that's fast to compile, it's C++ that is exceptionally slow to compile, due to its complex grammar and features, most importantly templates
If you think javac is fast try Jikes.... (see http://jikes.sourceforge.net/)
It is a Java Compiler written in C++. Unfortunately they haven't kept up with the latest Java Compiler specs but if you want to see fast this is it.
Tony
I think part of it is the complexity of the languages. C++ is incredibly mutable, with the ability to override pretty much any operator or piece of syntax (like overriding the () operator). This means the compiler has to do a lot more work just to determine what operations to actually run, even for simple things. Java and C# don't have this issue, as the syntax is fixed, and they're generally much simpler to parse.
It's a bit difficult comparing bytecode languages like java with natively compiled languages like C++. A better comparison is Delphi vs C++, where Delphi is much faster to compile. Since this has nothing to do with optimization or byte code, it must be due to differences in language syntax and the relative performance of includes vs. modules/units.
Is Java compiler fast?
The Java to class translation shall be blindingly fast since it is just a glorified zip with some syntax checking so to be fair if compared to a real compiler that is doing optimization and object code generation the "translation" from Java to class is trivial.
Did a comparison with fairly small program "hello world" and-and compare to GCC (C/C++/Ada) and found that javac was 30 times slower, and it got even worse in runtime?

Categories

Resources