How does Assembly.Load(byte[]) work?

I was wondering what happens if I load the same assembly bytes twice within a web app.
For example, I have this code:
byte[] assem = System.IO.File.ReadAllBytes(appRoot + "/Plugins/Plugin.dll");
var loadedAssem = Assembly.Load(assem);
var plugin = loadedAssem.CreateInstance("Plugin.ThePlugin") as IPlugin;
I ran this code, and I assume that on the first request it loads the assembly into RAM (or into the HTTP runtime AppDomain?), after which I can create instances of whatever is in there.
If I run this code again, say on the second request, what happens to the assembly loaded on the first request?
Would it still exist in RAM? If so, how does the runtime differentiate between the two assemblies? Or does the second load overwrite the previously declared classes?
This is for my own understanding; unlike in PHP, it is not just a case of "require_once".

This will load two distinct copies of the assembly, each of which can be used from your application. The types in each copy are distinct types and will not interoperate with one another. For instance, if you take a Widget from Copy1 and try to pass it to a method that takes a Widget on Copy2, this will cause a runtime failure. It is not possible to unload assemblies once they have been loaded in this way (i.e. into your main AppDomain).
Regarding instantiation:
If you use Assembly.CreateInstance (as shown in your post), this will create it from the Assembly instance you used to make the call.
If you use an Activator.CreateInstance that takes a string, you need to specify the assembly name. Since both loaded assemblies will have the same name in this case, it will use assembly resolution rules, which, I think by default, will favor the first match (so the assembly you loaded first.) I'm not certain of this. You can hook the AppDomain.AssemblyResolve event to provide your own prioritization and make it use your most-recently-loaded assembly.
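To make the "two distinct copies" point concrete, here is a minimal sketch. The class and method names (DuplicateLoadDemo, LoadTwice, TypesAreDistinct) are hypothetical and not part of the original question; only Assembly.Load(byte[]) and Assembly.GetType come from the framework.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper: shows that loading the same bytes twice
// produces two separate Assembly objects with separate types.
public static class DuplicateLoadDemo
{
    // Reads the file once and loads the same image twice from raw bytes.
    public static Assembly[] LoadTwice(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return new[] { Assembly.Load(bytes), Assembly.Load(bytes) };
    }

    // Returns true when the named type resolves to two different Type objects,
    // one per loaded copy. Throws if the type is missing from either copy.
    public static bool TypesAreDistinct(Assembly first, Assembly second, string typeName)
    {
        Type a = first.GetType(typeName, true);
        Type b = second.GetType(typeName, true);
        return a != b;
    }
}
```

With the code from the question, LoadTwice(appRoot + "/Plugins/Plugin.dll") followed by TypesAreDistinct(copies[0], copies[1], "Plugin.ThePlugin") should report that the two plugin types differ, which is why an instance from one copy cannot be cast to the type from the other. Casting both to a shared IPlugin interface still works as long as IPlugin lives in an assembly that is loaded only once.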

Related

Assembly.ReflectionOnlyLoadFrom(assemblyPath).GetName() VS Assembly.LoadFrom(assemblyPath).GetName()

We have the same DLLs in two separate folders.
When we load the second set of DLLs using
Assembly.ReflectionOnlyLoadFrom(assemblyPath)
we get an error :
"API restriction: The assembly 'file.dll' has already loaded from a different location. It cannot be loaded from a new location within the same appdomain."
Which is understandable but when we do :
Assembly.LoadFrom(assemblyPath);
it works fine.
Why? What would change when using the "ReflectionOnly" method?
In our case the only usage would be to call the GetName() method on the result, and I guess that in this case the result should be strictly the same?
Thanks

When you load an assembly for ReflectionOnly, only the metadata gets loaded. This allows you to inspect its types, but not instantiate or execute any of them.
There's also a property indicating whether an assembly was loaded for reflection only.
So per AppDomain, an assembly can be loaded once: either fully, or for reflection only. Given it's already fully loaded, you can't load it again for reflection only.
The call to Assembly.LoadFrom(), even when provided two different paths, will only load the same assembly once, as long as they match in version. See also Side effects of calling Assembly.Load multiple times.
See also: MSDN: How to: Load Assemblies into the Reflection-Only Context.
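Since the question only needs GetName(), a different technique from the one discussed above may be enough: AssemblyName.GetAssemblyName reads the identity from the file's manifest without adding the assembly to the AppDomain, so the "already loaded from a different location" restriction should not apply. A minimal sketch, with hypothetical wrapper names (AssemblyIdentity, ReadName, SameIdentity):

```csharp
using System;
using System.Reflection;

// Hypothetical helper around AssemblyName.GetAssemblyName.
public static class AssemblyIdentity
{
    // Reads the assembly's identity from its manifest; the file is opened
    // and closed, but the assembly is not loaded into the current domain.
    public static AssemblyName ReadName(string assemblyPath)
    {
        return AssemblyName.GetAssemblyName(assemblyPath);
    }

    // Compares two files by full assembly name (name, version, culture, key token).
    public static bool SameIdentity(string pathA, string pathB)
    {
        return ReadName(pathA).FullName == ReadName(pathB).FullName;
    }
}
```

For example, SameIdentity could be called with the paths of file.dll in each of the two folders to check whether they really hold the same assembly version.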

How to find a type that is not loaded yet in AppDomain?

I'm developing modular application using WPF and Prism.
All my UserControls have separate assemblies and implement IUserControl interface.
I would like to list all types that implement the IUserControl interface from a loaded module library in this way:
//ModuleA.cs
var interfaceType = typeof(IUserControl);
var userControlTypes = AppDomain.CurrentDomain.GetAssemblies()
    .SelectMany(s => s.GetTypes())
    .Where(p => interfaceType.IsAssignableFrom(p) && p.IsClass);
But I cannot see all the UserControl types that implement IUserControl in the userControlTypes list.
When I explicitly reference all the classes that implement IUserControl in Bootstrapper.cs, as in the following:
var userControlTypes = new List<Type>()
{
    {typeof(HastaKayitControl)},
    {typeof(ViziteUserControl)},
    {typeof(DenemeUserControl)},
    ...
};
then I can get all the desired UserControls from the query I wrote above (userControlTypes).
What is the reason behind this?
FYI:
All assemblies target the same .NET framework version.
My Prism version is 6.1.0
I will use userControlTypes to show all UserControl types inside the application to the end-user.
IUserControl interface contains nothing.

This behavior is by design. The .NET CLR does not load an assembly until code that uses it is first executed. Imagine the startup cost if every .dll file in the directory were loaded into memory when the application started, rather than when a type is first referenced at run time: apps with large libraries could have load times of minutes (maybe even more). It would also be unrealistic, because some types resolve to libraries outside the execution folder, such as assemblies in the GAC.
In your first example, AppDomain.CurrentDomain.GetAssemblies returns only the assemblies already loaded into that application domain, not every assembly the application could use. To see this, add a reference such as typeof(ViziteUserControl) (taken from your second code fragment) right above the query. This forces the CLR to load the type and its containing assembly, and that assembly will then be returned by AppDomain.CurrentDomain.GetAssemblies as well.
In your second code fragment, your code explicitly references these types, which loads their assemblies. I do not think this requires any further explanation.
So if you want AppDomain.CurrentDomain.GetAssemblies to cover all the types across your application, you need to force each assembly to load into memory if it has not been loaded already. Depending on your structure, you could do this in a few ways:
1.) Iterate through the .dll files on disk (using a reference location like Assembly.GetExecutingAssembly().Location) and call Assembly.LoadFrom. Use wildcards to ensure you are only loading your own assemblies and not every .dll library you encounter.
2.) Reference the types you are interested in from a configuration file and load them from there. You can use Type t = Type.GetType(yourConfigType); when creating your list of types from your configuration string list.
3.) Reference the assemblies you are interested in from a configuration file and load each DLL in the same manner as option 1.
4.) Just hard-code the list as you did in your last example.
If you choose option 1 or 3, you will have to check that the assembly is not already loaded before you call Assembly.LoadFrom. You can do this by checking what is already loaded with AppDomain.CurrentDomain.GetAssemblies().Any(x => your search query).
Also note that once you load an assembly into your application domain, you cannot unload it for the life of that application domain. If you do not want this but still want to find all your types dynamically, you will have to create a second application domain, find all the types there, and return them as an array or list of fully qualified type names (strings). You can then unload that second application domain. Also, as correctly noted by @Peter below in the comments, use ReflectionOnlyLoadFrom if you go with this approach. This incurs much less overhead.
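Option 1, together with the "already loaded" check, could be sketched as follows. PluginScanner and its members are hypothetical names. The sketch assumes that each file name matches the simple name of the assembly it contains, and it additionally skips abstract classes, which the query in the question does not do.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

// Hypothetical helper for the "scan a directory" option.
public static class PluginScanner
{
    // Loads every file in 'directory' matching 'pattern' (e.g. "MyApp.Modules.*.dll")
    // unless an assembly with the same simple name is already loaded.
    // Assumption: the file name without extension equals the assembly's simple name.
    public static void LoadMissing(string directory, string pattern)
    {
        var loaded = new HashSet<string>(
            AppDomain.CurrentDomain.GetAssemblies().Select(a => a.GetName().Name),
            StringComparer.OrdinalIgnoreCase);

        foreach (string file in Directory.GetFiles(directory, pattern))
        {
            string simpleName = Path.GetFileNameWithoutExtension(file);
            if (loaded.Add(simpleName))
            {
                Assembly.LoadFrom(file);
            }
        }
    }

    // Returns the concrete classes in the given assemblies that implement 'interfaceType'.
    public static List<Type> FindImplementors(Type interfaceType, IEnumerable<Assembly> assemblies)
    {
        return assemblies
            .SelectMany(SafeGetTypes)
            .Where(t => t.IsClass && !t.IsAbstract && interfaceType.IsAssignableFrom(t))
            .ToList();
    }

    // GetTypes throws if some types cannot be loaded; keep the ones that could.
    private static IEnumerable<Type> SafeGetTypes(Assembly assembly)
    {
        try
        {
            return assembly.GetTypes();
        }
        catch (ReflectionTypeLoadException ex)
        {
            return ex.Types.Where(t => t != null);
        }
    }
}
```

A caller would run LoadMissing on the module folder first and then pass AppDomain.CurrentDomain.GetAssemblies() and typeof(IUserControl) to FindImplementors.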

AppDomain.GetAssemblies() tells you the loaded assemblies, not the referenced ones. I can't speak to the Prism aspect of your question, and I agree with the comments that there is probably a better way to design this. But…
If you really want to enumerate all of the types that might get loaded in your AppDomain, you can approximate this by starting from the loaded assemblies (as you've done here with AppDomain.CurrentDomain.GetAssemblies()) and calling GetReferencedAssemblies() on each one. That returns an array of AssemblyName values that you can use to load additional assemblies. For each of those, you can in turn inspect all of its types (to find the implementors of IUserControl) and call GetReferencedAssemblies() again to continue the recursive search.
Note that this still will not necessarily return all implementors of the IUserControl interface that your process might load. Assemblies can be loaded by means other than being referenced in your AppDomain's assemblies, such as by code searching a directory for candidates, or even the user explicitly naming an assembly to load. This is why using mechanisms directly supported by whatever API you're using is a much better approach, to make sure that you find exactly those assemblies that that API would find.
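The recursive search over referenced assemblies might look like the sketch below. ReferenceWalker and LoadClosure are hypothetical names. References that cannot be resolved are skipped silently, which is a simplification, and every assembly the walk loads stays loaded in the current AppDomain.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

// Hypothetical helper: walks GetReferencedAssemblies() breadth-first.
public static class ReferenceWalker
{
    // Returns 'root' plus every assembly reachable through references
    // that could actually be loaded. Each assembly appears once.
    public static List<Assembly> LoadClosure(Assembly root)
    {
        var visited = new HashSet<Assembly> { root };
        var attempted = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var result = new List<Assembly>();
        var pending = new Queue<Assembly>();
        pending.Enqueue(root);

        while (pending.Count > 0)
        {
            Assembly current = pending.Dequeue();
            result.Add(current);

            foreach (AssemblyName reference in current.GetReferencedAssemblies())
            {
                // Try each distinct reference name only once.
                if (!attempted.Add(reference.FullName)) continue;

                Assembly loaded;
                try
                {
                    loaded = Assembly.Load(reference);
                }
                catch (Exception ex) when (ex is FileNotFoundException
                                           || ex is FileLoadException
                                           || ex is BadImageFormatException)
                {
                    continue; // unresolved reference: skip it in this sketch
                }

                // Different reference names can resolve to the same assembly.
                if (visited.Add(loaded)) pending.Enqueue(loaded);
            }
        }

        return result;
    }
}
```

The resulting list can then be searched for implementors of IUserControl in the same way as the loaded assemblies in the question.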

Is it safe to call Type.GetType with an untrusted type name?

I came across the following in a code review:
Type type = Type.GetType(typeName);
if (type == typeof(SomeKnownType))
DoSomething(...); // does not use type or typeName
typeName originates from an AJAX request and is not validated. Does this pose any potential security issues? For example, is it possible for unexpected code to be executed, or for the entire application to crash (denial of service), as the result of loading arbitrary types from arbitrary assemblies?
(I suppose some joker could attempt to exhaust available memory by loading every type from every assembly in the GAC. Anything worse?)
Notes:
This is an ASP.NET application running under Full Trust.
The resulting type is only used as shown above. No attempt is made to instantiate the type.

No, this is not safe at all. Type.GetType will load an assembly if it has not been loaded before:
GetType causes loading of the assembly specified in typeName.
So what's wrong with loading an assembly? Aside from it using additional memory as Daniel points out, .NET assemblies can execute code when they load, even though this functionality is not exposed to normal compilers like C# and VB.NET. These are called module initializers.
The module’s initializer method is executed at, or sometime before, first access to any types, methods, or data defined in the module
Just the fact that you are loading an assembly and examining its types is enough to get the module initializer to run.
Someone with a cleverly written assembly (say, by using ilasm and writing raw MSIL) can execute code just by getting you to load the assembly and examine its types. That's why we have Assembly.ReflectionOnlyLoad, so we can safely load the assembly in a non-executable environment.
I did a little more thinking about this and thought of a few more cases.
Consider that your Application Pool is set to run 64-bit. Now imagine that your attacker uses the AJAX service to attempt to load an assembly that is strictly for the x86 architecture. For example, there is one in my GAC called Microsoft.SqlServer.Replication that is x86 only; there is no AMD64 counterpart. If I ask your service to load that assembly, you'd get a BadImageFormatException. Depending on what guard clauses you have in place around loading the assembly, unhandled exceptions could completely bring down your AppPool.

It could eat up memory potentially if the libraries aren't in memory.
I would have a Dictionary<string, Type> as an allowed list.
var whitelist = new Dictionary<string, Type>();
whitelist.Add("MyType", typeof(MyType));
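A slightly fuller sketch of the allow-list idea follows. TypeAllowList, Resolve, and the two sample entries are hypothetical; the point is that the untrusted string is only ever used as a dictionary key and never reaches Type.GetType.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical allow list: untrusted names are looked up, never resolved by the runtime.
public static class TypeAllowList
{
    // Sample entries only; a real list would hold the types the service expects.
    private static readonly Dictionary<string, Type> Allowed =
        new Dictionary<string, Type>(StringComparer.Ordinal)
        {
            { "Uri", typeof(Uri) },
            { "Version", typeof(Version) },
        };

    // Returns the mapped type, or null when the name is missing or not allowed.
    // No assembly is loaded as a result of this call.
    public static Type Resolve(string typeName)
    {
        if (typeName == null) return null;

        Type type;
        return Allowed.TryGetValue(typeName, out type) ? type : null;
    }
}
```

The check from the question then becomes TypeAllowList.Resolve(typeName) == typeof(SomeKnownType), with SomeKnownType added to the dictionary.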

There is no inherent danger in referring to the type itself. Trust in .NET is at the assembly level. If there's no available assembly that contains the specified type, your call will just return null. Accordingly, somebody has to make the assembly available to the code. Assemblies don't just appear out of thin air.

Marshalling Assembly from another AppDomain

Is it possible to hold a reference to an Assembly from another appdomain without having that assembly loaded into the current appdomain?
I'm working on fixing a memory leak in a Windows Service that dynamically generates Assemblies and runs the dynamically generated code. The problem is the generated Assemblies are loaded into the Current app domain and can never be unloaded.
There is a method in one of the Windows Service libraries that has the following signature:
public Assembly CreateMethod(ObservableCollection<Field> sourceFields, Field destinationField)
This method creates the code for the assembly and loads it with the CSScript library LoadMethod function:
result = CSScript.LoadMethod(scriptFunction.ToString());
Later this Assembly reference from CreateMethod is used to run a function inside the generated assembly.
public object Run(Field destinationField, ObservableCollection<LinkField> sourceLinkFields, DataRow mainRow, Assembly script) {
...
var method = script.GetStaticMethodWithArgs("*.a" + Id.ToString().Replace("-", String.Empty), argumentTypes.ToArray());
return method(arguments.ToArray());
}
I'm wondering if it is possible to load the dynamically generated assembly into another app domain and run them through some type of proxy without having it loaded into the current app domain.
Edit:
I want to know if I can use an Assembly class reference in one AppDomain when the assembly is loaded in another AppDomain. Looking at the MSDN documentation they show how to use MarshalByRefObject. Basically I am trying to avoid changing the signature to my CreateMethod function, however I may need to change it to return MarshalByRefObject if this is not possible.
Update:
I ended up putting the call to CSScript.LoadMethod in the other app domain, where I keep a Dictionary. I then made CreateMethod return a Guid instead of an Assembly, and I pass this Guid around until the Run call. The Run call now takes a Guid as an argument instead of an Assembly. Inside the Run call I pass the Guid to the other app domain, run the method, and return the result object through a class that inherits MarshalByRefObject.

If you don't want the dynamic assembly in your main AppDomain, you have to move CreateMethod to another AppDomain, because as soon as you have an instance of Assembly, it's been loaded. In other words, no it is not possible to hold a reference to an assembly in another application domain, only to call into that assembly across application domains.
Without changing the signature and a bunch of your code, it seems like you need to move the minimum amount: 1) assembly creation and 2) Run. Then have the implementation of Run marshal the results.
As far as CreateMethod I think you want a method in the other assembly to "wrap" CreateMethod and return some sort of token that can be passed to Run. It's almost like changing the signature in a way...
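The token idea could be sketched as below. ScriptHost, Register, and the delegate shape are hypothetical stand-ins and are not taken from the asker's code. In the real service the compile step (the CSScript.LoadMethod call) would run inside the second AppDomain, and the host object would be created there, for example with AppDomain.CreateInstanceAndUnwrap on .NET Framework. Here Register simply accepts an already compiled delegate so the sketch stays self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical host object. Deriving from MarshalByRefObject lets callers in
// another AppDomain hold a proxy to it instead of a copy.
public sealed class ScriptHost : MarshalByRefObject
{
    // Token -> compiled script. The generated assemblies are only ever
    // referenced from inside the domain that owns this object.
    private readonly Dictionary<Guid, Func<object[], object>> _scripts =
        new Dictionary<Guid, Func<object[], object>>();

    // Stand-in for "compile the script here and remember it"; returns the
    // token that callers pass around instead of an Assembly reference.
    public Guid Register(Func<object[], object> compiled)
    {
        if (compiled == null) throw new ArgumentNullException("compiled");

        Guid token = Guid.NewGuid();
        _scripts.Add(token, compiled);
        return token;
    }

    // Runs the script identified by the token. Across AppDomains, arguments and
    // results must be serializable or derive from MarshalByRefObject.
    public object Run(Guid token, object[] arguments)
    {
        Func<object[], object> script;
        if (!_scripts.TryGetValue(token, out script))
        {
            throw new KeyNotFoundException("Unknown script token: " + token);
        }
        return script(arguments);
    }
}
```

Because only a Guid crosses the boundary, the main AppDomain never holds an Assembly instance, and unloading the second domain releases every generated assembly at once.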

This is one of the major features of an AppDomain! Just go look at the documentation

How Do I Load an Assembly and All of its Dependencies at Runtime in C# for Reflection?

I'm writing a utility for myself, partly as an exercise in learning C# Reflection and partly because I actually want the resulting tool for my own use.
What I'm after is basically pointing the application at an assembly and choosing a given class from which to select properties that should be included in an exported HTML form as fields. That form will be then used in my ASP.NET MVC app as the beginning of a View.
As I'm using Subsonic objects for the applications where I want to use, this should be reasonable and I figured that, by wanting to include things like differing output HTML depending on data type, Reflection was the way to get this done.
What I'm looking for, however, seems to be elusive. I'm trying to take the DLL/EXE that's chosen through the OpenFileDialog as the starting point and load it:
String FilePath = Path.GetDirectoryName(FileName);
System.Reflection.Assembly o = System.Reflection.Assembly.LoadFile(FileName);
That works fine, but because Subsonic-generated objects actually are full of object types that are defined in Subsonic.dll, etc., those dependent objects aren't loaded. Enter:
AssemblyName[] ReferencedAssemblies = o.GetReferencedAssemblies();
That, too, contains exactly what I would expect it to. However, what I'm trying to figure out is how to load those assemblies so that my digging into my objects will work properly. I understand that if those assemblies were in the GAC or in the directory of the running executable, I could just load them by their name, but that isn't likely to be the case for this use case and it's my primary use case.
So, what it boils down to is: how do I load a given assembly and all of its arbitrary dependencies, starting with a filename and resulting in a completely Reflection-browsable tree of types, properties, methods, etc.?
I know that tools like Reflector do this, I just can't find the syntax for getting at it.

A couple of options here:
1.) Attach to AppDomain.AssemblyResolve and do another LoadFile based on the requested assembly.
2.) Spin up another AppDomain with the directory as its base and load the assemblies in that AppDomain.
I'd highly recommend pursuing option 2, since that will likely be cleaner and allow you to unload all those assemblies after. Also, consider loading assemblies in the reflection-only context if you only need to reflect over them (see Assembly.ReflectionOnlyLoad).
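Option 1 is the shorter one to sketch. DirectoryResolver and Create are hypothetical names, and the sketch assumes each dependency sits next to the chosen file under its simple name with a .dll or .exe extension.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper: builds an AssemblyResolve handler bound to one directory.
public static class DirectoryResolver
{
    public static ResolveEventHandler Create(string directory)
    {
        // LoadFile requires an absolute path.
        string baseDirectory = Path.GetFullPath(directory);

        return (sender, args) =>
        {
            // args.Name is the full display name; keep only the simple name.
            string simpleName = new AssemblyName(args.Name).Name;

            foreach (string extension in new[] { ".dll", ".exe" })
            {
                string candidate = Path.Combine(baseDirectory, simpleName + extension);
                if (File.Exists(candidate))
                {
                    return Assembly.LoadFile(candidate);
                }
            }

            // Returning null lets other handlers try, or the load fail normally.
            return null;
        };
    }
}
```

With the variables from the question, the handler would be attached before the first load: AppDomain.CurrentDomain.AssemblyResolve += DirectoryResolver.Create(FilePath);. Everything resolved this way stays loaded in the current AppDomain, which is the drawback option 2 avoids.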

I worked out Kent Boogaart's second option.
Essentially I had to:
1.) Implement the ResolveEventHandler in a separate class, inheriting from MarshalByRefObject and adding the Serializable attribute.
2.) Add the current ApplicationBase, essentially where the event handler's dll is, to the AppDomain PrivateBinPath.
You can find the code on github.
