Assembly.ReflectionOnlyLoadFrom(assemblyPath).GetName() VS Assembly.LoadFrom(assemblyPath).GetName()

We have the same DLLs in two separate folders.
When we load the second copy using
Assembly.ReflectionOnlyLoadFrom(assemblyPath)
we get this error:
"API restriction: The assembly 'file.dll' has already loaded from a different location. It cannot be loaded from a new location within the same appdomain."
That is understandable, but when we call
Assembly.LoadFrom(assemblyPath);
it works fine.
Why? What changes when using the "ReflectionOnly" method?
In our case the only use would be calling GetName() on the result, and I assume the result should be exactly the same in both cases?
Thanks

When you load an assembly for ReflectionOnly, only the metadata gets loaded. This allows you to inspect its types, but not instantiate or execute any of them.
There's also a property, Assembly.ReflectionOnly, that indicates whether an assembly was loaded for reflection only.
So per AppDomain, an assembly can be loaded once: either fully, or for reflection only. Given it's already fully loaded, you can't load it again for reflection only.
Assembly.LoadFrom(), even when given two different paths, loads the same assembly only once as long as both files have the same identity: the second call returns the assembly that is already loaded. See also Side effects of calling Assembly.Load multiple times.
See also: MSDN: How to: Load Assemblies into the Reflection-Only Context.
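Since the question only needs GetName(), one option that sidesteps both load contexts is AssemblyName.GetAssemblyName, which reads the identity from the file without adding the assembly to the AppDomain. A minimal sketch of that idea follows; the AssemblyNameReader class and its method names are hypothetical and not part of the original posts.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class AssemblyNameReader
{
    // Reads the assembly identity from the file's manifest.
    // The assembly is not loaded into the current AppDomain, so reading
    // the same DLL from two different folders does not conflict.
    public static AssemblyName ReadName(string assemblyPath)
    {
        return AssemblyName.GetAssemblyName(assemblyPath);
    }

    // True when both files declare the same full identity
    // (name, version, culture, public key token).
    public static bool HaveSameIdentity(string firstPath, string secondPath)
    {
        return string.Equals(
            ReadName(firstPath).FullName,
            ReadName(secondPath).FullName,
            StringComparison.OrdinalIgnoreCase);
    }
}
```

Calling AssemblyNameReader.ReadName on each copy of the DLL should work regardless of which copy, if any, is already loaded.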

Related

How to find a type that is not loaded yet in AppDomain?

I'm developing modular application using WPF and Prism.
All my UserControls have separate assemblies and implement IUserControl interface.
I would like to list all types that implement the IUserControl interface from a loaded module library, in this way:
//ModuleA.cs
var interfaceType = typeof(IUserControl);
var userControlTypes = AppDomain.CurrentDomain.GetAssemblies()
    .SelectMany(s => s.GetTypes())
    .Where(p => interfaceType.IsAssignableFrom(p) && p.IsClass);
But I cannot see all the UserControl types that implement IUserControl in the userControlTypes list.
When I reference all the classes that implement IUserControl in Bootstrapper.cs, as in the following:
var userControlTypes = new List<Type>()
{
    {typeof(HastaKayitControl)},
    {typeof(ViziteUserControl)},
    {typeof(DenemeUserControl)},
    ...
};
then I do get all the desired UserControls from the query I wrote above (userControlTypes).
What is the reason behind this?
FYI:
All assemblies target the same .NET framework version.
My Prism version is 6.1.0
I will use userControlTypes to show all UserControl types inside the application to the end-user.
IUserControl interface contains nothing.

This behavior is by design. The .NET CLR does not load an assembly until code that uses it is first entered, which forces the load. Imagine the startup cost if every .dll file in the directory were loaded into memory when the application started, rather than when a type is first referenced at run time: apps with large libraries could take minutes to start. It would also not be realistic, because some types resolve to libraries outside the execution folder, such as assemblies in the GAC.
In your first example, AppDomain.CurrentDomain.GetAssemblies() returns only the assemblies that are already loaded in that application domain, not every assembly the application could load. To see this, add a typeof(ViziteUserControl) expression (taken from your second code fragment) right above that query. This forces the type and its containing assembly to be loaded by the CLR, and that assembly will then also be returned by AppDomain.CurrentDomain.GetAssemblies().
In your second code fragment, your code explicitly references these types, so their assemblies are loaded. I do not think this requires further explanation.
So if you want AppDomain.CurrentDomain.GetAssemblies() to cover all the types across your application, you need to force each assembly to load into memory if it has not been loaded already. Depending on your structure, you could do this in a few ways.
1. Iterate through the .dll files on disk (using a reference location like Assembly.GetExecutingAssembly().Location) and call Assembly.LoadFrom. Use wildcards to ensure you load only your own assemblies and not every .dll library you encounter.
2. Reference the types you are interested in in a configuration file and load them from there. You can use Type t = Type.GetType(yourConfigType); when creating your list of types from your list of configuration strings.
3. Reference the assemblies you are interested in in a configuration file and load each DLL in the same manner as option 1.
4. Just hard-code the list as you did in your last example.
If you choose option 1 or 3, check that the assembly is not already loaded in memory before you call Assembly.LoadFrom. You can do this by again checking what is already loaded with AppDomain.CurrentDomain.GetAssemblies().Any(x => your search query).
Also note that once you load an assembly into your application domain, you cannot unload it for the life of that application domain. If you do not want this but still want to find all your types dynamically, you will have to create a second application domain, find the types there, and return them as an array or list of fully qualified type names (strings). You can then unload that application domain. Also, as correctly noted by @Peter in the comments, use ReflectionOnlyLoadFrom if you go with this approach, since it incurs much less overhead.
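Options 1 and 3, together with the "already loaded" check, could look roughly like the sketch below. The AssemblyPreloader class and the search pattern are hypothetical. Comparing full names is just one possible "search query", and skipping files that raise BadImageFormatException is an assumption about how native DLLs should be handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class AssemblyPreloader
{
    // Loads every assembly in 'directory' matching 'searchPattern'
    // (for example "MyApp.*.dll") that is not already loaded,
    // and returns the newly loaded assemblies.
    public static List<Assembly> LoadMissing(string directory, string searchPattern)
    {
        var newlyLoaded = new List<Assembly>();

        foreach (string file in Directory.GetFiles(directory, searchPattern))
        {
            AssemblyName name;
            try
            {
                // Reads the identity without loading the assembly.
                name = AssemblyName.GetAssemblyName(file);
            }
            catch (BadImageFormatException)
            {
                continue; // Not a managed assembly (for example, a native DLL).
            }

            // The "search query": is an assembly with this identity loaded already?
            bool alreadyLoaded = AppDomain.CurrentDomain.GetAssemblies()
                .Any(a => string.Equals(a.FullName, name.FullName,
                                        StringComparison.OrdinalIgnoreCase));

            if (!alreadyLoaded)
            {
                newlyLoaded.Add(Assembly.LoadFrom(file));
            }
        }

        return newlyLoaded;
    }
}
```

After a call such as AssemblyPreloader.LoadMissing(folder, "MyApp.*.dll"), the GetAssemblies() query from the question should also see the types in those assemblies.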

AppDomain.GetAssemblies() tells you the loaded assemblies, not the referenced ones. I can't speak to the Prism aspect of your question, and I agree with the comments that there is probably a better way to design this. But…
If you really want to enumerate all of the types that might get loaded in your AppDomain, you can approximate this by starting from the assemblies already loaded (as you've done here with AppDomain.CurrentDomain.GetAssemblies()) and calling GetReferencedAssemblies() on each one. That returns an array of AssemblyName values that you can use to load additional assemblies. For each of those, you can in turn inspect all of their types (to find the implementors of IUserControl) and call GetReferencedAssemblies() again to continue the recursive search.
Note that this still will not necessarily return all implementors of the IUserControl interface that your process might load. Assemblies can be loaded by means other than being referenced in your AppDomain's assemblies, such as by code searching a directory for candidates, or even the user explicitly naming an assembly to load. This is why using mechanisms directly supported by whatever API you're using is a much better approach, to make sure that you find exactly those assemblies that that API would find.
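The recursive search described above can be sketched as follows. The ReferenceWalker class is hypothetical, and silently skipping references that fail to load is an assumption made for brevity. As noted, the result is only an approximation of what the process might load.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class ReferenceWalker
{
    // Returns the root assembly plus every assembly reachable from it through
    // GetReferencedAssemblies(), loading references as they are discovered.
    public static List<Assembly> LoadReferenceClosure(Assembly root)
    {
        var visited = new HashSet<Assembly>();
        var pending = new Stack<Assembly>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            Assembly current = pending.Pop();
            if (!visited.Add(current))
            {
                continue; // Already processed.
            }

            foreach (AssemblyName reference in current.GetReferencedAssemblies())
            {
                Assembly loaded;
                try
                {
                    // Returns the already loaded assembly when there is one.
                    loaded = Assembly.Load(reference);
                }
                catch (FileNotFoundException) { continue; }
                catch (FileLoadException) { continue; }
                catch (BadImageFormatException) { continue; }

                if (!visited.Contains(loaded))
                {
                    pending.Push(loaded);
                }
            }
        }

        return new List<Assembly>(visited);
    }
}
```

The returned assemblies can then be filtered with the same IsAssignableFrom query used in the question.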

Is it safe to call Type.GetType with an untrusted type name?

I came across the following in a code review:
Type type = Type.GetType(typeName);
if (type == typeof(SomeKnownType))
    DoSomething(...); // does not use type or typeName
typeName originates from an AJAX request and is not validated. Does this pose any potential security issues? For example, is it possible for unexpected code to be executed, or for the entire application to crash (denial of service), as the result of loading arbitrary types from arbitrary assemblies?
(I suppose some joker could attempt to exhaust available memory by loading every type from every assembly in the GAC. Anything worse?)
Notes:
This is an ASP.NET application running under Full Trust.
The resulting type is only used as shown above. No attempt is made to instantiate the type.

No, this is not safe at all. Type.GetType will load an assembly if it has not been loaded before:
"GetType causes loading of the assembly specified in typeName."
So what's wrong with loading an assembly? Aside from it using additional memory as Daniel points out, .NET assemblies can execute code when they load, even though this functionality is not exposed to normal compilers like C# and VB.NET. These are called module initializers.
"The module’s initializer method is executed at, or sometime before, first access to any types, methods, or data defined in the module"
Just the fact that you are loading an assembly and examining its types is enough to get the module initializer to run.
Someone with a cleverly written assembly (say by using ilasm and writing raw MSIL) can execute code just by getting the assembly loaded and you examining the types. That's why we have Assembly.ReflectionOnlyLoad, so we can safely load the assembly in a non-executable environment.
I did a little more thinking about this and thought of a few more cases.
Consider that your Application Pool is set to run 64-bit. Now imagine that your attacker uses the AJAX service to attempt to load an assembly that is strictly x86 only. For example, there is one in my GAC called Microsoft.SqlServer.Replication that is x86 only; there is no AMD64 counterpart. If I ask your service to load that assembly, you'd get a BadImageFormatException. Depending on what guard clauses you have in place around loading the assembly, an unhandled exception could bring down your AppPool completely.

It could potentially eat up memory if the libraries aren't already loaded.
I would keep a Dictionary<string, Type> as an allowed list.
var whitelist = new Dictionary<string, Type>();
whitelist.Add("MyType", typeof(MyType));
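A slightly fuller sketch of the allowed-list idea, where the untrusted name is only ever used as a dictionary key and never passed to Type.GetType. The TypeWhitelist class and the two example entries are hypothetical; a real application would list its own known types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class TypeWhitelist
{
    // Only names listed here can be resolved; the entries are placeholder examples.
    private static readonly Dictionary<string, Type> Allowed =
        new Dictionary<string, Type>(StringComparer.Ordinal)
        {
            { "String", typeof(string) },
            { "Guid", typeof(Guid) },
        };

    // Looks the name up in the dictionary and never calls Type.GetType,
    // so untrusted input cannot trigger an assembly load.
    public static bool TryResolve(string typeName, out Type type)
    {
        type = null;
        return typeName != null && Allowed.TryGetValue(typeName, out type);
    }
}
```

The code under review would then call TryResolve(typeName, out type) and compare the result with typeof(SomeKnownType); any name outside the list simply fails to resolve.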

There is no inherent danger in referring to the type itself. Trust in .NET is at the assembly level. If no available assembly contains the specified type, your call will just return null. Accordingly, somebody has to make the assembly available to the code; assemblies don't just appear out of thin air.

Should I rely on dynamically loaded assemblies?

I have the following situation.
Assembly D contains class Data.
Assembly F1 contains class, which creates, fills and returns Data.
Assembly F2 contains class, which accepts Data as input.
The trick is that all of these assemblies are plugins and are loaded dynamically. Of course both F1 and F2 reference D, but at runtime all three are loaded by the host application.
Now what happens if someone replaces D binary file with newer version, which has a different interface?
I wrote a test application, which did something like that, with the following results:
Adding a new field to class Data causes no exception.
Replacing an existing field with another one results in a TargetInvocationException stating that the requested field does not exist.
If .NET keeps track of the interface calls, I'm fine. That's because accessing the unchanged part of library will simply work and if that part changes, I'll get an exception simply telling me that. So it will either work (on the interface level) or not - no undefined behavior.
My questions:
How are the types resolved in the runtime - especially in case of non-matching assembly versions? Does .NET keep track of field/property/parameter/return value types and names?
Is there a way to force the referenced assembly to be required in some specific version?

Your second question: In Visual Studio there is an easy way to force your application to use a specific version of a referenced assembly. Click the assembly under References and look at its properties. There is a property called "Specific Version". If you set it to true and a different version is loaded at run time, you will get an exception.
Your first question: I don't know exactly how .NET decides whether it can use an assembly other than the one your code was compiled against. I think .NET simply tries to use the new assembly and throws an exception if a method or property of a class or interface in the new assembly has a modified signature.
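Independently of the Visual Studio setting, a plugin host could compare the versions each plugin was compiled against with the versions actually loaded, and refuse to continue on a mismatch. This is only a sketch of such a runtime check, not something the answer above describes; the ReferenceVersionCheck class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class ReferenceVersionCheck
{
    // Compares the versions 'assembly' was compiled against with the versions of
    // the same-named assemblies currently loaded, and describes each difference.
    public static List<string> FindMismatches(Assembly assembly)
    {
        Assembly[] loaded = AppDomain.CurrentDomain.GetAssemblies();
        var mismatches = new List<string>();

        foreach (AssemblyName reference in assembly.GetReferencedAssemblies())
        {
            // Match by simple name only; references that are not loaded yet are skipped.
            Assembly match = loaded.FirstOrDefault(a =>
                string.Equals(a.GetName().Name, reference.Name,
                              StringComparison.OrdinalIgnoreCase));

            if (match != null && match.GetName().Version != reference.Version)
            {
                mismatches.Add(reference.Name + ": compiled against " + reference.Version +
                               ", loaded " + match.GetName().Version);
            }
        }

        return mismatches;
    }
}
```

In the scenario from the question, the host could call FindMismatches on F1 and F2 after loading D and report any entry that mentions D.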

Using reflection on files already loaded in an app domain instead of loading every file (again)

I'm using reflection to scan all of the assemblies in a folder for types that implement a certain interface and derive from a certain base class. The code looks like this:
foreach (string file in Directory.GetFiles(folder, "*.dll"))
{
    Assembly assembly = Assembly.LoadFile(file);
    Type foundType = (from type in assembly.GetTypes()
                      where type.GetInterfaces().Contains(typeof(TInterface))
                      && type.BaseType.Name.LeftOf('`') == baseClass.Name.LeftOf('`')
                      select type).FirstOrDefault();
    if (foundType == default(Type)) { continue; }
    // Register our type so we don't need to use reflection on subsequent requests.
    DependencyContainer.Register(typeof(TInterface), foundType);
    return CreateInstance<TInterface>(foundType);
}
During a code review, two concerns were brought up concerning this piece of code. First, we can't shortcut the loop once we find a matching type; we need to loop through every file and throw an exception if we find more than one matching type. That brings me to the real issue here...
The code reviewer wondered if there was a better way to load every file. For performance reasons, we're wondering if we can loop through files already loaded in the app domain, instead of calling Assembly.LoadFile(file) for every file. We thought, why load every file if it's already loaded? Is this a valid concern? Is loading a file this way the same as how files get loaded into an app domain? What would be an efficient way to loop through every file so we're not wasting processing time?
Note: The documentation for Assembly.LoadFile() isn't quite helpful:
Loads the contents of an assembly file on the specified path.
I'm not sure if that equates to how files are loaded into an app domain, or if that's a different scenario altogether.

If you use LoadFrom instead of LoadFile, you don't have to worry about that - if the DLL is already loaded, it will not be loaded again - see http://msdn.microsoft.com/en-us/library/1009fa28.aspx. However, note that this is based on the assembly identity, not path, so if you're concerned that there could be two assemblies, each with the same identity but a different path, you're stuck with loading them explicitly each time.
If you really want to dig deeper into this, you can get all the assemblies loaded in your application domain using AppDomain.CurrentDomain.GetAssemblies, building a dictionary or some such structure and skipping those already loaded. However, as I said, in a typical scenario and using LoadFrom, it's unnecessary.
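Combining the LoadFrom suggestion with the review requirement from the question (scan every file and fail when more than one type matches) might look like the sketch below. The PluginScanner class is hypothetical. It matches on the interface only and leaves out the base-class name comparison from the original code, and skipping files that are not managed assemblies is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class PluginScanner
{
    // Loads every DLL in 'folder' with LoadFrom. When an assembly with the same
    // identity is already loaded, LoadFrom returns that one instead of loading again.
    public static List<Assembly> LoadFolder(string folder)
    {
        var assemblies = new List<Assembly>();
        foreach (string file in Directory.GetFiles(folder, "*.dll"))
        {
            try
            {
                assemblies.Add(Assembly.LoadFrom(file));
            }
            catch (BadImageFormatException)
            {
                // Not a managed assembly; ignore it.
            }
        }
        return assemblies;
    }

    // Returns the single concrete class in 'assemblies' assignable to 'contract',
    // null when there is none, and throws when more than one class matches.
    public static Type FindSingleImplementation(IEnumerable<Assembly> assemblies, Type contract)
    {
        List<Type> matches = assemblies
            .Distinct() // Two files may resolve to the same loaded assembly.
            .SelectMany(a => a.GetTypes())
            .Where(t => t.IsClass && !t.IsAbstract && contract.IsAssignableFrom(t))
            .ToList();

        if (matches.Count > 1)
        {
            throw new InvalidOperationException(
                "Expected one implementation of " + contract.Name +
                " but found " + matches.Count + ".");
        }

        return matches.Count == 1 ? matches[0] : null;
    }
}
```

A caller would pass the result of LoadFolder(folder) and typeof(TInterface), then register and instantiate the returned type as in the original loop.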

I don't know exactly how Assembly.LoadFile(file) behaves. But you can always check each individual assembly against the already loaded ones via AppDomain.CurrentDomain.GetAssemblies().

Another approach is to add the folder to the private probing path and then use Assembly.Load(string) which loads the assemblies in the Load context. This is the recommended way as far as I know. Check the MSDN blog of Suzanne Cook for more info and advice on assembly loading.

How does Assembly.Load(byte[]) work?

I was just wondering what happens if I was to load the same assembly bytes twice within a web app.
For example I have this code
byte[] assem = System.IO.File.ReadAllBytes(appRoot + "/Plugins/Plugin.dll");
var loadedAssem = Assembly.Load(assem);
var plugin = loadedAssem.CreateInstance("Plugin.ThePlugin") as IPlugin;
I ran this code, and on the first request I assume it loads the assembly into RAM (or the HTTP runtime AppDomain?), after which I can create instances of whatever is in there.
If I run this code again, say on the second request, what happens to the assembly from the first request?
Would it still exist in RAM? If so, how does the runtime differentiate between the two assemblies? Or does it overwrite the previously declared classes?
This is for my understanding, since it is not just a case of "require_once" as it would be in PHP.

This will load two distinct copies of the assembly, each of which can be used from your application. The types in each copy are distinct types and will not interoperate with one another. For instance, if you take a Widget from Copy1 and try to pass it to a method that takes a Widget in Copy2, this will cause a runtime failure. It is not possible to unload assemblies once they have been loaded in this way (i.e. into your main AppDomain).
Regarding instantiation:
If you use Assembly.CreateInstance (as shown in your post), this will create it from the Assembly instance you used to make the call.
If you use an Activator.CreateInstance that takes a string, you need to specify the assembly name. Since both loaded assemblies will have the same name in this case, it will use assembly resolution rules, which, I think by default, will favor the first match (so the assembly you loaded first.) I'm not certain of this. You can hook the AppDomain.AssemblyResolve event to provide your own prioritization and make it use your most-recently-loaded assembly.
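The AssemblyResolve idea can be sketched as below: remember the most recently loaded copy per assembly name and hand it back from the event handler. The LatestPluginResolver class is hypothetical. Note that AssemblyResolve is only raised when normal resolution fails, so this sketch assumes the plugin cannot be found by regular probing.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class LatestPluginResolver
{
    // Most recently tracked assembly per simple name, for example "Plugin".
    private static readonly Dictionary<string, Assembly> Latest =
        new Dictionary<string, Assembly>(StringComparer.OrdinalIgnoreCase);

    // Loads a new copy from raw bytes and remembers it as the newest one for its name.
    public static Assembly LoadAndTrack(byte[] rawAssembly)
    {
        Assembly assembly = Assembly.Load(rawAssembly);
        Track(assembly);
        return assembly;
    }

    // Records 'assembly' as the newest copy for its simple name.
    public static void Track(Assembly assembly)
    {
        lock (Latest)
        {
            Latest[assembly.GetName().Name] = assembly;
        }
    }

    // AssemblyResolve handler: returns the newest tracked copy for the requested
    // name, or null so that resolution fails in the usual way.
    public static Assembly Resolve(object sender, ResolveEventArgs args)
    {
        string simpleName = new AssemblyName(args.Name).Name;
        lock (Latest)
        {
            Assembly assembly;
            return Latest.TryGetValue(simpleName, out assembly) ? assembly : null;
        }
    }

    // Hooks the handler into the current AppDomain; call once at startup.
    public static void Register()
    {
        AppDomain.CurrentDomain.AssemblyResolve += Resolve;
    }
}
```

With this in place, the code from the question would call LatestPluginResolver.LoadAndTrack(assem) instead of Assembly.Load(assem) on each request.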
