Will the compiler only compile code that can get executed? - C#

I have a class library and am using only part of it. Is there a need to delete what isn't being used in order to shrink the size of the created code (in release configuration)?
As far as I've seen, the compiler takes care of that, and removing the code doesn't change the EXE file size. Will this always be true? Removing all unneeded code would take very long, so I want to know if there's need for that.
More information: there are methods and classes in the class library that aren't called from the executing code, but are referenced by other parts of code in the class library (which themselves are never called).

No, the compiler includes the "dead" code as well. A simple reason for this is that it's not always possible to know exactly what code will and won't be executed. For example, even a private method that is never referenced could be called via reflection, and public methods could be referenced by external assemblies.
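As a minimal, hypothetical illustration of that reflection point (the class and method names below are made up):

using System;
using System.Reflection;

class Worker
{
    // No call site anywhere in the code base, yet it can still be run.
    private void Cleanup()
    {
        Console.WriteLine("Cleanup ran");
    }
}

class Program
{
    static void Main()
    {
        var worker = new Worker();
        typeof(Worker)
            .GetMethod("Cleanup", BindingFlags.Instance | BindingFlags.NonPublic)
            .Invoke(worker, null);   // found and invoked purely by name
    }
}

Because nothing in the source ties "Cleanup" to this call until runtime, the compiler has no safe way to strip it out.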
You can use a tool to help you find and remove unused methods (including ones only called by other unused methods). Try What tools and techniques do you use to find dead code? and Find unused code to get you started.

It all gets compiled. Regardless of whether it is called or not. The code may be called by an external library.
The only way to make the compiler ignore code is by using Compiler Preprocessor Directives. More about those here.
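For example, a small hedged sketch (the EXPERIMENTAL symbol is invented for illustration): the code inside the #if block is stripped before compilation unless that symbol is defined for the build.

using System;

class FeatureFlags
{
    public static void Run()
    {
#if EXPERIMENTAL
        // Compiled only when the build defines EXPERIMENTAL
        // (e.g. via <DefineConstants> in the project file or /define:EXPERIMENTAL).
        Console.WriteLine("Experimental code path");
#else
        Console.WriteLine("Stable code path");
#endif
    }
}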

I doubt the compiler will remove anything. The fact is, the compiler can't tell what is used and what is not, as types can be instantiated and methods called by name, thanks to reflection.

Let's suppose there is a class library called Utility. You created a new project and added this class library to that project. Even if your EXE calls only 1-2 methods from the class library, it's never a good idea to delete the unreferenced code.
It would go against the principle of reusability. Even though some classes in the library are never referenced from the EXE, this has no meaningful negative impact on the performance or size of the program.

In most languages, determining all and only the dead code is recursively undecidable, even under the idealization of a "math world"-like language. (A few rare languages, such as Blaise, are decidable.)

To the question of whether there is a "need to delete what isn't being used in order to shrink the size of the created code": I think this would only be useful to save network bandwidth; removing unused code is crucial in web applications to improve loading speeds, etc.
If your code is an EXE or a library, the only reason I see to remove dead code is to improve your code quality, so that someone looking at your code two years down the line won't scratch their head wondering what it does.

Related

C# Attribute to detect unused methods

Is it possible to write an Attribute that can track methods to detect if those methods are never called?
[Track]
void MyMethod(){
}
output:
warning: method "MyMethod" in "MyClass" has no references in code.
It is not strictly necessary to have it run at compile time, but it should work when the application is initialized (better at compile time anyway).
This tag will be put on methods in our audio library to track them. Since the audio code is refactored very frequently and we usually search for audio methods with 0 references in code, we want to mark these methods so we can quickly detect and remove unused audio assets.
Basically, each time we add a new sound effect, we may later stop triggering it (i.e. calling its method), and the audio file and playback code can then remain in the application for a long time.
Maybe this is the answer you're looking for?
Finding all references to a method with Roslyn
You can use the code there to automate something of your own with reflection, I'd say.
A partial answer is found here:
C# reflection and finding all references
I can use that info to get references to methods marked with a particular attribute; however, that is a run-time script (but better than nothing).
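For what it's worth, here is a hedged run-time sketch along those lines (not taken from the linked answers; the TrackAttribute and TrackScanner names are illustrative only): it defines a [Track] attribute and uses reflection to list every method carrying it, which you could then cross-check against a reference search.

using System;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Method)]
public sealed class TrackAttribute : Attribute { }

public static class TrackScanner
{
    public static void ReportTrackedMethods(Assembly assembly)
    {
        var tracked =
            from type in assembly.GetTypes()
            from method in type.GetMethods(BindingFlags.Instance | BindingFlags.Static |
                                           BindingFlags.Public | BindingFlags.NonPublic)
            where method.GetCustomAttribute<TrackAttribute>() != null
            select type.Name + "." + method.Name;

        foreach (var name in tracked)
            Console.WriteLine("tracked method: " + name);
    }
}

Calling TrackScanner.ReportTrackedMethods(Assembly.GetExecutingAssembly()) at startup would list the tracked methods; whether each one is actually referenced still has to be checked with something like the Roslyn approach or a text search.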
A little late, but better late than never: I use GNU grep for Windows (WinGrep) to search all folders and files for the name of the method:
C:\>grep -irw "method_name" * --include=*.cs --include=*.sql --include=*.txt
You can include as many, or as few, file name extensions as makes sense for you. In the example above, I show the top directory as C:, but you can start the search at any directory that makes sense.
The huge advantage of using grep over IDE based searches is that it will search across multiple projects and solutions.

Is there an equivalent to Java's ClassFileTransformer in .NET? (a way to replace a class)

I've been searching for this for quite a while with no luck so far. Is there an equivalent to Java's ClassFileTransformer in .NET? Basically, I want to create a class CustomClassFileTransformer (which in Java would implement the interface ClassFileTransformer) that gets called whenever a class is loaded, and is allowed to tweak it and replace it with the tweaked version.
I know there are frameworks that do similar things, but I was looking for something more straightforward, like implementing my own ClassFileTransformer. Is it possible?
EDIT #1. More details about why I need this:
Basically, I have a C# application and I need to monitor the instructions it wants to run in order to detect read or write operations to fields (operations Ldfld and Stfld) and insert some instructions before the read/write takes place.
I know how to do this (except for the part where I need to be invoked to replace the class): for every method whose code I want to monitor, I must:
Get the method's MethodBody using MethodBase.GetMethodBody()
Transform it to byte array with MethodBody.GetILAsByteArray(). The byte[] it returns contains the bytecode.
Analyse the bytecode as explained here, possibly inserting new instructions or deleting/modifying existing ones by changing the contents of the array.
Create a new method and use the new bytecode to create its body, with MethodBuilder.CreateMethodBody(byte[] il, int count), where il is the array with the bytecode.
I put all these tweaked methods in a new class and use the new class to replace the one that was originally going to be loaded.
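As a rough sketch of steps 1-2 above (the Add method is just a sample to inspect; analysing and rewriting the bytes in the later steps is the genuinely hard part):

using System;
using System.Reflection;

class IlDumpDemo
{
    public static int Add(int a, int b) { return a + b; }   // sample method to inspect

    static void Main()
    {
        MethodInfo method = typeof(IlDumpDemo).GetMethod("Add");
        MethodBody body = method.GetMethodBody();        // step 1: get the method body
        byte[] il = body.GetILAsByteArray();             // step 2: raw IL bytecode

        Console.WriteLine(method.Name + ": " + il.Length + " IL bytes");
        Console.WriteLine(BitConverter.ToString(il));    // e.g. 02-03-58-2A (ldarg.0, ldarg.1, add, ret) in a release build
    }
}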
An alternative to replacing classes would be somehow getting notified whenever a method is invoked. Then I'd replace the call to that method with a call to my own tweaked method, which I would tweak only the first time it is invoked, and then put it in a dictionary for future uses to reduce overhead (for future calls I'll just look up the method and invoke it; I won't need to analyse the bytecode again). I'm currently investigating ways to do this and LinFu looks pretty interesting, but if there were something like a ClassFileTransformer it would be much simpler: I'd just rewrite the class, replace it, and let the code run without monitoring anything.
An additional note: the classes may be sealed. I want to be able to replace any kind of class, I cannot impose restrictions on their attributes.
EDIT #2. Why I need to do this at runtime.
I need to monitor everything that is going on so that I can detect every access to data. This applies to the code of library classes as well. However, I cannot know in advance which classes are going to be used, and even if I knew every possible class that may get loaded it would be a huge performance hit to tweak all of them instead of waiting to see whether they actually get invoked or not.
POSSIBLE (BUT PRETTY HARDCORE) SOLUTION. In case anyone is interested (and I see the question has been favorited, so I guess someone is), this is what I'm looking at right now. Basically, I'd have to implement the profiling API and register for the events I'm interested in, in my case whenever a JIT compilation starts. An extract from the blog post:
In your ICorProfilerCallback2::ModuleLoadFinished callback, you call ICorProfilerInfo2::GetModuleMetadata to get a pointer to a metadata interface on that module.
QI for the metadata interface you want. Search MSDN for "IMetaDataImport", and grope through the table of contents to find topics on the metadata interfaces.
Once you're in metadata-land, you have access to all the types in the module, including their fields and function prototypes. You may need to parse metadata signatures and this signature parser may be of use to you.
In your ICorProfilerCallback2::JITCompilationStarted callback, you may use ICorProfilerInfo2::GetILFunctionBody to inspect the original IL, and ICorProfilerInfo2::GetILFunctionBodyAllocator and then ICorProfilerInfo2::SetILFunctionBody to replace that IL with your own.
The great news: I get notified when a JIT compilation starts and I can replace the bytecode right there, without having to worry about replacing the class, etc. The not-so-great news: you cannot invoke managed code from the API's callback methods, which makes sense but means I'm on my own parsing the IL code, etc., as opposed to being able to use Cecil, which would've been a breeze.
I don't think there's a simpler way to do this without using AOP frameworks (such as PostSharp). If anyone has any other idea please let me know. I'm not marking the question as answered yet.
I don't know of a direct equivalent in .NET for this.
However, there are some ways to implement similar functionality, such as using Reflection.Emit to generate assemblies and types on demand, or using RealProxy to create proxy objects for interfaces and MarshalByRefObject-derived objects. To advise what to use, though, it would be important to know more about the actual use case.
After quite some research I'm answering my own question: there isn't an equivalent to the ClassFileTransformer in .NET, or any straightforward way to replace classes.
It's possible to gain control over the class-loading process by hosting the CLR, but this is pretty low-level, you have to be careful with it, and it's not possible in every scenario. For example if you're running on a server you may not have the rights to host the CLR. Also if you're running an ASP.NET application you cannot do this because ASP.NET already provides a CLR host.
It's a pity .NET doesn't support this; it would be so easy for them to do: they would just have to notify you before a class is loaded and give you the chance to modify it before passing it on to the CLR to load.

Checking for the existence of a reference/type at compile time in .NET

I've recently found the need to check at compile-time whether either: a) a certain assembly reference exists and can be successfully resolved, or b) a certain class (whose fully qualified name is known) is defined. These two situations are equivalent for my purposes, so being able to check for one of them would be good enough. Is there any way to do this in .NET/C#? Preprocessor directives initially struck me as something that might help, but it seems it doesn't have the necessary capability.
Of course, checking for the existence of a type at runtime can be done easily enough, but unfortunately that won't resolve my particular problem in this situation. (I need to be able to ignore the fact that a certain reference is missing and thus fall back to another approach in code.)
Is there a reason you can't add a reference and then use a typeof expression on a type from the assembly to verify it's available?
var x = typeof(SomeTypeInSomeAssembly);
If the assembly containing SomeTypeInSomeAssembly is not referenced and available this will not compile.
It sounds like you want the compiler to ignore one branch of code, which is really only doable by hiding it behind an #if block. Would defining a compiler constant and using #if work for your purposes?
#if MyConstant
.... code here that uses the type ....
#else
.... workaround code ....
#endif
Another option would be to not depend on the other class at compile-time at all, and use reflection or the .NET 4.0 dynamic keyword to use it. If it'll be called repeatedly in a perf-critical scenario in .NET 3.5 or earlier, you could use DynamicMethod to build your code on first use instead of using reflection every time.
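To make that last suggestion concrete, a hedged sketch of the reflection/dynamic fallback (the type name "Optional.Feature", the assembly name "OptionalLib" and the DoWork method are all made up; substitute the real ones):

using System;

static class OptionalFeature
{
    public static void InvokeIfAvailable()
    {
        // Look the type up at run time instead of referencing it at compile time.
        Type type = Type.GetType("Optional.Feature, OptionalLib", throwOnError: false);
        if (type == null)
        {
            Console.WriteLine("Optional assembly not present - using the fallback approach.");
            return;
        }

        dynamic instance = Activator.CreateInstance(type);
        instance.DoWork();   // late-bound call; no compile-time dependency on OptionalLib
    }
}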
I seem to have found a solution here, albeit not precisely the one I was initially hoping for.
My Solution:
What I ended up doing is creating a new build configuration and then defining a precompiler constant, which I used in code to determine whether to use the reference, or to fall back to the alternative (guaranteed to work) approach. It's not fully automatic, but it's relatively simple and seems quite elegant - good enough for my purposes.
Alternative:
If you wanted to fully automate this, it could be done using a pre-build command that runs a batch script or small program to check the availability of a given reference on the machine and then updates a file containing the precompiler constants. However, I considered this more effort than it was worth, though it might have been more useful if I had multiple independent references that I needed to check for availability.

Why remove unused using directives in C#?

I'm wondering if there are any reasons (apart from tidying up source code) why developers use the "Remove Unused Usings" feature in Visual Studio 2008?
There are a few reasons you'd want to take them out.
It's pointless. They add no value.
It's confusing. What is being used from that namespace?
If you don't, then you'll gradually accumulate pointless using statements as your code changes over time.
Static analysis is slower.
Code compilation is slower.
On the other hand, there aren't many reasons to leave them in. I suppose you save yourself the effort of having to delete them. But if you're that lazy, you've got bigger problems!
I would say quite the contrary - it's extremely helpful to remove unneeded, unnecessary using statements.
Imagine you have to go back to your code in 3, 6, 9 months - or someone else has to take over your code and maintain it.
If you have a huge, long laundry list of using statements that aren't really needed, looking at the code could be quite confusing. Why is that using in there, if nothing is used from that namespace?
I guess in terms of long-term maintainability in a professional environment, I'd strongly suggest to keep your code as clean as possible - and that includes dumping unnecessary stuff from it. Less clutter equals less confusion and thus higher maintainability.
Marc
In addition to the reasons already given, it prevents unnecessary naming conflicts. Consider this file:
using System.IO;
using System.Windows.Shapes;

namespace LicenseTester
{
    public static class Example
    {
        private static string temporaryPath = Path.GetTempFileName();
    }
}
This code doesn't compile, because the namespaces System.IO and System.Windows.Shapes each contain a class called Path. We could fix it by using the fully qualified class name,
private static string temporaryPath = System.IO.Path.GetTempFileName();
or we could simply remove the line using System.Windows.Shapes;.
This seems to me to be a very sensible question, which is being treated in quite a flippant way by the people responding.
I'd say that any change to source code needs to be justified. These changes can have hidden costs, and the person posing the question wanted to be made aware of this. They didn't ask to be called "lazy", as one person intimated.
I have just started using ReSharper, and it is starting to give warnings and style hints on the project I am responsible for. Amongst them is the removal of redundant using directives, but also redundant qualifiers, capitalisation and many more. My gut instinct is to tidy the code and resolve all hints, but my business head warns me against unjustified changes.
We use an automated build process, and therefore any change to our SVN repository would generate changes that we couldn't link to projects/bugs/issues, and would trigger automated builds and releases which delivered no functional change to previous versions.
If we look at the removal of redundant qualifiers, this could possibly cause confusion to developers as classes in our Domain and Data layers are only differentiated by the qualifiers.
If I look at the proper capitalisation of acronyms (i.e. ABCD -> Abcd), then I have to take into account that ReSharper doesn't refactor any of the XML files we use that reference class names.
So, following these hints is not as straightforward as it appears, and they should be treated with respect.
Fewer options in the IntelliSense popup (particularly if the namespaces contain lots of extension methods).
Theoretically IntelliSense should be faster too.
Remove them. Less code to look at and wonder about saves time and confusion. I wish more people would KEEP THINGS SIMPLE, NEAT and TIDY.
It's like having dirty shirts and pants in your room. It's ugly and you have to wonder why it's there.
Code compiles quicker.
Recently I came across another reason why deleting unused imports is quite helpful and important.
Imagine you have two assemblies, where one references the other (for now let's call the first one A and the referenced one B). Now, when you have code in A that depends on B, everything is fine. However, at some stage in your development process you notice that you don't actually need that code any more, but you leave the using directive where it was. Now you not only have a meaningless using directive but also an assembly reference to B which is not used anywhere except in the obsolete directive. This firstly increases the amount of time needed to compile A, as B has to be loaded as well.
So this is not only an issue of cleaner and easier-to-read code, but also of maintaining assembly references in production code, where not all of those referenced assemblies even exist.
Finally, in our example we had to ship B and A together, although B is not used anywhere in A except in the using section. This will massively affect the runtime performance of A when loading the assembly.
It also helps prevent false circular dependencies, assuming you are also able to remove some dll/project references from your project after removing the unused usings.
At least in theory, if you were given a C# .cs file (or any single program source code file), you should be able to look at the code and create an environment that simulates everything it needs. With some compiling/parsing techniques, you could even create a tool to do it automatically. If you do this, at least in your mind, you can ensure you understand everything that code file says.
Now consider being given a .cs file with 1000 using directives, of which only 10 are actually used. Whenever you look at a newly introduced symbol in the code that references the outside world, you will have to go through those 1000 lines to figure out what it is. This obviously slows down the above procedure. So if you can reduce them to 10, it will help!
In my opinion, the C# using directive is very, very weak, since you cannot import a single generic symbol without losing its genericity, and you cannot use a using alias directive to bring in extension methods. This is not the case in other languages like Java, Python and Haskell, where you can specify (almost) exactly what you want from the outside world. Even so, I suggest using a using alias whenever possible.
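For illustration, a minimal sketch of such an alias (the names are arbitrary): the alias imports exactly one symbol rather than a whole namespace, but note that its type arguments are fixed, which is the loss of genericity complained about above.

using System;
// Imports a single closed generic type; you cannot alias the open Dictionary<,>.
using ScoreTable = System.Collections.Generic.Dictionary<string, int>;

class AliasDemo
{
    static void Main()
    {
        var scores = new ScoreTable();
        scores["alice"] = 3;
        Console.WriteLine(scores["alice"]);
    }
}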

Why should you remove unnecessary C# using directives?

For example, I rarely need:
using System.Text;
but it's always there by default. I assume the application will use more memory if your code contains unnecessary using directives. But is there anything else I should be aware of?
Also, does it make any difference whatsoever if the same using directive is used in only one file vs. most/all files?
Edit: Note that this question is not about the unrelated concept called a using statement, designed to help one manage resources by ensuring that when an object goes out of scope, its IDisposable.Dispose method is called. See Uses of "using" in C#.
There are a few reasons for removing unused usings/namespaces, besides coding preference:
removing the unused using directives in a project can make compilation faster, because the compiler has fewer namespaces to search when resolving types (this is especially true for C# 3.0 because of extension methods, where the compiler must search all imported namespaces for possible better matches, generic type inference, and lambda expressions involving generic types)
can potentially help avoid name collisions in future builds, when new types with the same name as types in the used namespaces are added to the currently unused namespaces
will reduce the number of items in the editor's auto-completion list when coding, possibly leading to faster typing (in C# 3.0 this can also reduce the list of extension methods shown)
What removing the unused namespaces won't do:
alter in any way the output of the compiler.
alter in any way the execution of the compiled program (faster loading, or better performance).
The resulting assembly is the same with or without unused using(s) removed.
It won't change anything when your program runs. Everything that's needed is loaded on demand. So even if you have that using directive, unless you actually use a type from that namespace/assembly, the assembly the directive corresponds to won't be loaded.
Mainly, it's just to clean up for personal preference.
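If you want to see this for yourself, here is a hedged sketch (behaviour can vary with NGen/ReadyToRun images, the debugger, or non-Framework runtimes, so treat it as illustrative): on .NET Framework, System.Xml.dll is typically loaded only when a method that actually uses one of its types is JIT-compiled, not merely because of the using directive.

using System;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Xml;   // the directive alone does not load System.Xml.dll

class LazyLoadDemo
{
    static void PrintXmlLoaded(string label)
    {
        bool loaded = AppDomain.CurrentDomain.GetAssemblies()
                               .Any(a => a.GetName().Name == "System.Xml");
        Console.WriteLine(label + ": System.Xml loaded = " + loaded);
    }

    // Kept out of Main (and not inlined) so the JIT doesn't touch System.Xml
    // until this method is actually called.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void UseXml()
    {
        var doc = new XmlDocument();
        Console.WriteLine("Created a " + doc.GetType().Name);
    }

    static void Main()
    {
        PrintXmlLoaded("Before first use");   // typically False
        UseXml();
        PrintXmlLoaded("After first use");    // True
    }
}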
Code cleanliness is important.
One starts to get the feeling that the code may be unmaintained and on the brownfield path when one sees superfluous usings. In essence, when I see some unused using directives, a little yellow flag goes up in the back of my brain telling me to "proceed with caution." And reading production code should never give you that feeling.
So clean up your usings. Don't be sloppy. Inspire confidence. Make your code pretty. Give another dev that warm-fuzzy feeling.
There's no IL construct that corresponds to using. Thus, the using statements do not increase your application memory, as there is no code or data that is generated for it.
The using directive is used at compile time only, for the purpose of resolving short type names to fully qualified type names. Thus, the only negative effect an unnecessary using can have is slowing compilation a little and taking a bit more memory during compilation. I wouldn't be worried about that, though.
Beyond that, the only real negative effect of having using directives you don't need is on IntelliSense, as the list of potential matches for completion while you type increases.
You may get name clashes if you name your own classes like the (unused) classes in the namespace. In the case of System.Text, you'll have a problem if you define a class named "Encoder".
Anyway, this is usually a minor problem, and it is detected by the compiler.
Leaving extra using directives is fine. There is a little value in removing them, but not much. For example, it makes my IntelliSense completion lists shorter, and therefore easier to navigate.
The compiled assemblies are not affected by extraneous using directives.
Sometimes I put them inside a #region, and leave it collapsed; this makes viewing the file a little cleaner. IMO, this is one of the few good uses of #region.
If you want to keep your code clean, unused using directives should be removed from the file. The benefits become very clear when you work in a collaborative team that needs to understand your code; remember that all your code must be maintained. Less code = less work; the benefits are long term.
Your application will not use more memory. The directives are only there for the compiler to find the classes you use in the code files. It really doesn't hurt beyond not being clean.
It’s personal preference mainly. I clean them up myself (ReSharper does a good job of telling me when there’s unneeded using statements).
One could say that it might decrease the time to compile, but with computer and compiler speeds these days, it just wouldn’t make any perceptible impact.
They are just used as a shortcut. For example, you'd have to write System.Int32 every time if you did not have a "using System;" at the top.
Removing unused ones just makes your code look cleaner.
The using directive just keeps you from having to fully qualify the types you use. I personally like to clean them up. Really, it depends on how a LOC (lines of code) metric is used.
Having only the namespaces that you actually use allows you to keep your code documented.
You can easily find what parts of your code are calling one another by any search tool.
If you have unused namespaces, a search like that means nothing.
I'm working on cleaning up namespaces now, because I'm constantly asked what parts of the application are accessing the same data one way or another.
I know which parts are accessing data each way due to the data access being separated by namespaces e.g. directly through a database and in-directly through a web service.
I can't think of a simpler way to do this all at once.
If you just want your code to be a black box (to the developers), then yes it doesn't matter. But if you need to maintain it over time it is valuable documentation like all other code.
The 'using' directive does not affect performance, as it is merely a helper in qualifying the names of your identifiers. So instead of having to type System.IO.Path.Combine(...), you can simply type Path.Combine(...) if you have using System.IO.
Do not forget that the compiler does a lot of work to optimize everything when building your project. A using directive that is used in many places, or in only one, shouldn't make any difference once compiled.
