Checking if a file is a .NET assembly

I've seen some methods of checking if a PEFile is a .NET assembly by examining the binary structure.
Is that the fastest method to test multiple files? I assume that trying to load each file (e.g. via Assembly.ReflectionOnlyLoad) might be pretty slow, since it will load the file's type information.
Note: I'm looking for a way to check files programmatically.

I guess Stormenet's answer isn't technically programmatic, so I'll separate my response into an answer.
For best performance, nothing is going to beat opening the file(s) as a binary stream (for example with a FileStream or BinaryReader rather than a text-oriented StreamReader), reading the first (n) bytes and checking for the .NET file signature data structures in the byte stream.
Pretty much the same way you'd verify something is a DOS executable:
http://en.wikipedia.org/wiki/DOS_executable
Look for the "MZ" header bytes, which also happen to be the initials of Mark Zbikowski, one of the developers of MS-DOS.

Maybe this helps
from https://web.archive.org/web/20110930194955/http://www.grimes.demon.co.uk/dotnet/vistaAndDotnet.htm
Next, I check to see if it is a .NET assembly. To do this I check to see if the file contains the CLR header. This header contains important information about the location of the .NET code in the file and the version of the framework that was used to write that code. The location of this header is given in the file's Data Directory table. If the data directory item has zero values then the file is unmanaged, if it has non-zero values then the file is a .NET assembly.
You can test this yourself using the dumpbin utility with the /headers switch. This utility will print the various headers in a file on the command line. At the end of the Optional Header Values you'll see a list of the Data Directories (there will always be 16 of them) and if the COM Descriptor Directory has a non-zero location it indicates that the file is a .NET assembly. The contents of the CLR header can also be listed using the /clrheader switch (if the file is unmanaged this will show no values). XP tests for the CLR header when it executes a file and if the CLR header is present it will initialize the runtime and pass the entry point of the assembly to the runtime, so that the file runs totally within the runtime.
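The data directory check described in the quote can be sketched in C# roughly as follows. This is a minimal sketch, not a full PE parser: the class and method names (ManagedFileCheck, HasClrHeader, IsManagedFile) are made up for illustration, and it assumes a well-formed PE32 or PE32+ header. Note that a non-zero CLR header entry shows the file contains managed metadata; it does not by itself prove the file is a complete, loadable assembly.

```csharp
using System;
using System.IO;
using System.Text;

public static class ManagedFileCheck
{
    // Returns true when data directory entry 14 (the CLR header, which
    // dumpbin lists as "COM Descriptor Directory") has a non-zero RVA and size.
    public static bool HasClrHeader(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            if (stream.Length < 0x40) return false;
            stream.Position = 0;
            if (reader.ReadUInt16() != 0x5A4D) return false;      // "MZ"

            stream.Position = 0x3C;                                // e_lfanew
            uint peOffset = reader.ReadUInt32();
            if (peOffset + 26 > stream.Length) return false;

            stream.Position = peOffset;
            if (reader.ReadUInt32() != 0x00004550) return false;  // "PE\0\0"

            stream.Position = peOffset + 20;                       // SizeOfOptionalHeader
            ushort optionalHeaderSize = reader.ReadUInt16();

            long optionalStart = peOffset + 24;
            stream.Position = optionalStart;
            ushort magic = reader.ReadUInt16();

            int directoriesOffset;
            if (magic == 0x10B) directoriesOffset = 96;            // PE32
            else if (magic == 0x20B) directoriesOffset = 112;      // PE32+
            else return false;

            long clrEntry = optionalStart + directoriesOffset + 14 * 8;
            if (clrEntry + 8 > optionalStart + optionalHeaderSize) return false;
            if (clrEntry + 8 > stream.Length) return false;

            stream.Position = optionalStart + directoriesOffset - 4;
            uint directoryCount = reader.ReadUInt32();             // NumberOfRvaAndSizes
            if (directoryCount <= 14) return false;

            stream.Position = clrEntry;
            uint rva = reader.ReadUInt32();
            uint size = reader.ReadUInt32();
            return rva != 0 && size != 0;
        }
    }

    // Convenience wrapper for checking a file on disk.
    public static bool IsManagedFile(string path)
    {
        using (FileStream file = File.OpenRead(path))
        {
            return HasClrHeader(file);
        }
    }
}
```

Only a few hundred bytes of each file are read, which is why this approach tends to scale well when scanning many files.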

In the past I've used AssemblyName.GetAssemblyName(), which throws an exception if it's not a managed assembly. However, I've never performance tested it, so I can't say how fast it is.
Official Documentation
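A minimal sketch of that approach, assuming only that BadImageFormatException is the signal for "not a managed assembly". The class and method names are made up for illustration. Other exceptions (missing file, access denied) are deliberately left to propagate.

```csharp
using System;
using System.Reflection;

public static class AssemblyProbe
{
    // Returns true if the file at 'path' has an assembly manifest.
    public static bool IsManagedAssembly(string path)
    {
        try
        {
            AssemblyName.GetAssemblyName(path);
            return true;
        }
        catch (BadImageFormatException)
        {
            // Not a valid managed assembly (native binary, text file, ...).
            return false;
        }
    }
}
```

Because this relies on an exception for the common "not managed" case, it is likely to be slower than a header check when most of the files being scanned are unmanaged.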

The first link there is going to be the fastest and simplest method of checking (the PE file header one). You're correct in assuming that calling Assembly.ReflectionOnlyLoad is going to be pretty slow.

Related

How can I programmatically add content to new/existing executable using c# and possibly make the solution work on a mac?

I've seen a number of variations on this question and I'm not sure whether it has been completely duplicated.
I would like to be able to at run-time run an existing executable (SOURCE exe) and have it:
1) take an existing TARGET exe at run time and add content of any size and type to the TARGET exe (pdf, image, word, excel file type, etc)
2) be able to run the modified TARGET exe so that when the TARGET exe is run, it will find the embedded content inside of itself and copy the content to the hard drive and then run the program associated with the content (for example, run Excel on a copied xls file)
I've seen examples where you embed resources at compile time in visual studio but I want to do this at run-time in code (c#, java, whatever works). Either the host TARGET exe needs to already exist and content should be added to it OR the exe will need to be generated from scratch at run-time and content again added to it.
I also would prefer not to use any of the cmd-line tools that visual studio or any other tool would run behind the scenes (if possible) to create an exe to minimize the enduser needing to download any more libraries/sdks than necessary.
This product is in line with what I want to do
http://www.boomeranglistbuilder.com/instructions/usingsoftware.php
(I want to improve upon it) :)
Lastly, it'd be great if the solution could be cross-platform compatible (I doubt it, though).
Could this be done in Java?
I've seen the Windows API UpdateResource function mentioned in my searches, but I'm not sure whether it would completely fit my situation. Can anyone comment?
I hope my question is clear. Please let me know.
Any help would be greatly appreciated.
Thank you,
Carlos

I think it's true for most binary file formats (including executables) that appending data to a well-formed file will not affect how the file is typically interpreted by most programs. You could, maybe, take advantage of this.
To embed, you'll need to take your (existing) target executable and simply append some binary data to it. That data will have two parts:
A magic word (to denote the presence of an appended resource)
The resource itself.
So, this:
[target executable data]
Becomes this:
[target executable data]
[magic word]
[resource]
To read the resource from the target executable, simply have that executable open itself, search for the magic word and, if it's present, start reading the resource appended after it.
This is what WinRAR does (or at least did four years ago, when I last checked) to recognize the archives inside of its self-extracting files.
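The append-and-scan scheme can be sketched in C# along these lines. The marker string and the class name AppendedResource are hypothetical. The sketch scans backwards for the last occurrence of the marker, on the assumption that the resource itself never contains the marker bytes.

```csharp
using System;
using System.IO;
using System.Text;

public static class AppendedResource
{
    // Hypothetical marker; any byte sequence unlikely to occur by chance works.
    private static readonly byte[] Magic =
        Encoding.ASCII.GetBytes("##APPENDED-RESOURCE-V1##");

    // Appends [magic word][resource] to the end of the target file.
    public static void Append(string targetPath, byte[] resource)
    {
        using (var fs = new FileStream(targetPath, FileMode.Append, FileAccess.Write))
        {
            fs.Write(Magic, 0, Magic.Length);
            fs.Write(resource, 0, resource.Length);
        }
    }

    // Returns the bytes after the last magic word, or null if none is present.
    public static byte[] Extract(string targetPath)
    {
        byte[] data = File.ReadAllBytes(targetPath);
        for (int start = data.Length - Magic.Length; start >= 0; start--)
        {
            if (!MatchesAt(data, start)) continue;
            int resourceStart = start + Magic.Length;
            byte[] resource = new byte[data.Length - resourceStart];
            Array.Copy(data, resourceStart, resource, 0, resource.Length);
            return resource;
        }
        return null;
    }

    private static bool MatchesAt(byte[] data, int start)
    {
        for (int i = 0; i < Magic.Length; i++)
        {
            if (data[start + i] != Magic[i]) return false;
        }
        return true;
    }
}
```

The target would call Extract on its own path at startup. Keep in mind that appending bytes to a digitally signed executable invalidates its signature.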

How to include source code in dll?

Short version:
I want my program to be able to (read-only-)access its own source code during runtime. Is there a way to automatically package the source code into the dll during compilation?
Long version:
The background is that when an exception occurs, I want to automatically create a file containing some details of what happened. This file should, among other things, include the source code of the function that caused the problem. I want to send this file to other people by email, and the receiver will most likely not have (or not want to install) Visual Studio, so anything using symbol servers and the like is out of the question. It needs to be a plain text file.
Ideally I would somewhere find the related source code files and just copy out the relevant lines. You can safely assume that as long as I have a folder containing the entire source code, I will be able to locate the file and lines I want.
The best idea I could come up with so far -- and I have not looked into it in much detail because it seems messy to no end -- is to modify the MSBuild files to create a .zip of the source during compilation, and require .dll and .zip to reside in the same folder.
Most of the similar-sounding questions on stackoverflow that I found seem to deal with decompiling .dll files, which is not what I want to do. I have the source code, and I want to ship it together with the .dll in a convenient way.
Update: The really long version
Seems some people are seriously questioning why I would want to do that, so here's the answer: The main purpose of my software is testing some other software. That other software has a GUI. For an easy example, let's say the other software were the standard Windows calculator, then my testcase might look something like this:
calculator.Open();
calculator.EnterValue(13);
calculator.PressButtonPlus();
calculator.EnterValue(38);
calculator.PressButtonEnter();
int value = calculator.GetDisplayedValue();
Assert.That(value == 51);
calculator.Close();
These tests are intentionally written in a very human-readable way.
What I want to do when a problem occurs is to give the developer of the calculator a detailed description of how to reproduce the problem, in a way that he could reproduce by hand, without my software. (In this example, he would open the calculator, enter 13, press plus, and so on.)
Maybe another option would be to have each function calculator.Something() write out an information line to a log, but that would a) be a lot more work, b) only include the test run up to the point where it aborted, and c) bear some risk that writing the line is forgotten in one function, thereby giving an incorrect representation of what was done. But I'm open to other solutions than copying source code.

Take a look at this question: Do __LINE__ __FILE__ equivalents exist in C#?
C++ offers macros (__LINE__, __FILE__, and so on) that are replaced with the corresponding information at compile time. This means if you write something like this:
throw new CException(__FILE__);
it will compile to something like this:
throw new CException("test.cpp");
resulting in a hardcoded value. The C# compiler does not offer those macros, so you are forced to use reflection to get information about where the exception was thrown. The way to do it is described in the question linked above.
If you are not able to supply .pdb symbols, then the default behaviour of Exception.ToString() (or StackTrace.ToString()) will not return the line number, only the MSIL offsets of the operation that failed. As far as I can remember, you can use the Stack Trace Explorer of ReSharper to navigate to the corresponding code (not 100% sure about that, but there was also a question here on Stack Overflow that mentioned this).
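Newer compilers narrow this gap somewhat: C# 5 introduced caller info attributes, which the compiler fills in at the call site much like __FILE__ and __LINE__. A minimal sketch, with a made-up helper name (CallSite.Describe):

```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;

public static class CallSite
{
    // The compiler substitutes the caller's member name, source file path
    // and line number for the optional parameters when they are omitted.
    public static string Describe(
        [CallerMemberName] string member = "",
        [CallerFilePath] string file = "",
        [CallerLineNumber] int line = 0)
    {
        return member + " (" + Path.GetFileName(file) + ":" + line + ")";
    }
}
```

A call such as `throw new InvalidOperationException("Failed in " + CallSite.Describe());` then carries the location without needing .pdb files. It only describes the place where Describe() was called, not deeper frames of the stack.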

You can include copies of the source files as resources.
In the project folder, create a subfolder named Resources. Copy the source files there.
Create in the project a resource file, and then include the source copies you made into it.
Setup a pre-build event to copy the actual source files to Resources folder, so you always have updated copies. In the example I created, this worked well:
copy "$(ProjectDir)*.cs" "$(ProjectDir)Resources"
In your code, you can now get the content of the files like this (I suppose the name of the resources file is Resource1.resx):
// Get the source of the Program.cs file.
string contents = Resource1.Program;

Yes, I also recommend packing up the sources in a .zip or whatever during MSBuild, and packaging that .zip with your application/dll. At runtime, when an exception occurs, you get the file and method name like Aschratt describes, extract the file from the .zip and find the method in it.
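The runtime half of that idea might look like the sketch below. It assumes the build has already produced a .zip of the sources next to the DLL; the class name SourceArchive and the "line number: text" output format are made up for illustration. On .NET Framework, ZipFile needs a reference to System.IO.Compression.FileSystem.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class SourceArchive
{
    // Reads lines firstLine..lastLine (1-based, inclusive) of one source
    // file stored in the zip. Returns an empty list if the entry is missing.
    public static List<string> ReadLines(
        string zipPath, string entryName, int firstLine, int lastLine)
    {
        var result = new List<string>();
        using (ZipArchive zip = ZipFile.OpenRead(zipPath))
        {
            ZipArchiveEntry entry = zip.GetEntry(entryName);
            if (entry == null) return result;

            using (var reader = new StreamReader(entry.Open()))
            {
                string line;
                int number = 0;
                while ((line = reader.ReadLine()) != null)
                {
                    number++;
                    if (number > lastLine) break;
                    if (number >= firstLine) result.Add(number + ": " + line);
                }
            }
        }
        return result;
    }
}
```

The file name and line range would come from the stack trace, which only contains them when .pdb files are deployed alongside the DLL.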

process.start() embedded exe without extracting to file first c#

I have an executable embedded into my app resources. At the moment I use assembly reflection to extract the executable to its own file and then start the executable using Process.Start(). Is it possible to run the embedded executable straight from a stream instead of writing it to a file first? Could someone please show me the most efficient way to do this?

Here's what I gather from your question, and your comments:
You want to know if it is possible to execute an executable embedded into your program, without extracting it to disk first
Your program is a .NET program
The executable you want to execute is not a .NET program
The answer to that is: yes
However, the answer is also that it is very, very hard
What you have to do is, and note that I do not know all the details about this since I don't do this, but anyway:
1. Load the executable code into memory
2. Remap all addresses in the binary image so that they're correct in relation to the base address you loaded the executable at
3. Possibly load external references, i.e. other DLLs that the executable needs
4. Remap the addresses of those references
5. Possibly load references needed by the just-loaded referenced DLLs
6. Remap those DLLs
7. Repeat 3 through 6 until done
8. Call the code
I'm assuming your question is "can I do 1 and 8", and the answer to that is no.

If it's a .NET executable, you should be able to load it into an AppDomain and start it that way:
http://msdn.microsoft.com/en-us/library/system.reflection.assembly.load.aspx

Very simple actually:
byte[] bytes = File.ReadAllBytes(path);
Assembly a = Assembly.Load(bytes);
Now instead of reading the bytes from a file, read it from the resource and you're done. Actually there is a very good article on that: Dynamically Loading Embedded Resource Assemblies
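Reading the bytes from an embedded resource and starting the loaded assembly could be sketched like this. It only applies when the embedded executable is itself a .NET assembly. The class name EmbeddedAssemblyRunner is made up, and the resource name passed in (something like "MyApp.Tool.exe") is hypothetical and depends on the project's default namespace.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class EmbeddedAssemblyRunner
{
    // Copies an embedded manifest resource of 'host' into a byte array.
    public static byte[] ReadResource(Assembly host, string resourceName)
    {
        using (Stream stream = host.GetManifestResourceStream(resourceName))
        {
            if (stream == null)
                throw new FileNotFoundException("Embedded resource not found.", resourceName);

            using (var buffer = new MemoryStream())
            {
                stream.CopyTo(buffer);
                return buffer.ToArray();
            }
        }
    }

    // Loads a managed executable from memory and invokes its entry point.
    public static object Run(byte[] rawAssembly, string[] args)
    {
        Assembly assembly = Assembly.Load(rawAssembly);
        MethodInfo entry = assembly.EntryPoint;
        if (entry == null)
            throw new InvalidOperationException("The assembly has no entry point.");

        // Main may be declared with or without a string[] parameter.
        object[] parameters = entry.GetParameters().Length == 0
            ? null
            : new object[] { args };
        return entry.Invoke(null, parameters);
    }
}
```

Note that the code runs inside the current process rather than as a separate process, so a call to Environment.Exit in the loaded program ends the host as well.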

If you don't want it on a hard drive, you could possibly look at saving it to a RAM drive and then running it from there.

It can be done without your native EXE having to touch the disk.
See the link below; it shows an example of a "process" image being embedded as a resource. It's read into memory, and then CreateProcess and a number of other things are done to build a valid running "process".
http://www.rohitab.com/discuss/topic/31681-c-run-program-from-memory-and-not-file/

Plugin compiler structure?

I'm making a modular program, and it supports dynamic compilation of source files in the plugin directory.
What I would like to do, to speed up loading times, is save compiled assemblies to a separate folder.
When my program loads and comes across a source file to compile, I would like it to check whether there is an already compiled assembly, and use it IF the source file has not been changed since then. If the source file has changed, then re-compile and overwrite the saved assembly.
My question to you is: what would be an effective way to track which source file belongs to which assembly, and an effective way to track whether a source file has been changed since the last load or not?

Change tracking: keep MD5/CRC hashes of the source files on record along with their last-modified dates, and compare both against the current files to determine whether they have changed.
As for source->assembly, I suggest convention over configuration.
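A rough sketch of both ideas together. The convention assumed here is hypothetical: a source file Foo.cs maps to Foo.dll in the cache folder, with its hash stored beside it as Foo.hash. SHA-256 is swapped in for the MD5/CRC mentioned above; any stable hash works for change detection. The last-modified date is left out for brevity.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class PluginCache
{
    // Convention: plugins/Foo.cs -> <cacheDir>/Foo.dll
    public static string AssemblyPathFor(string sourcePath, string cacheDir)
    {
        string name = Path.GetFileNameWithoutExtension(sourcePath) + ".dll";
        return Path.Combine(cacheDir, name);
    }

    public static string HashFile(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }

    // True when there is no cached assembly or the source hash differs.
    public static bool NeedsRecompile(string sourcePath, string cacheDir)
    {
        string dll = AssemblyPathFor(sourcePath, cacheDir);
        string hashFile = Path.ChangeExtension(dll, ".hash");
        if (!File.Exists(dll) || !File.Exists(hashFile)) return true;
        return File.ReadAllText(hashFile).Trim() != HashFile(sourcePath);
    }

    // Call after a successful compile to remember the source hash.
    public static void RecordCompiled(string sourcePath, string cacheDir)
    {
        string dll = AssemblyPathFor(sourcePath, cacheDir);
        File.WriteAllText(Path.ChangeExtension(dll, ".hash"), HashFile(sourcePath));
    }
}
```

With a naming convention like this, no separate lookup table is needed: the source file name alone determines where the cached assembly and its hash live.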

Why would you want to do that exactly? Anyone who is going to be using C# to write a plugin for you is going to know how to use Visual Studio and build a DLL. You'd be much better off defining a DLL interface for plugins to use instead. Then you wouldn't have to worry about loading times of any sort.
If the "plugin" is supposed to be changed from inside your program itself, you should probably just compile when the plugin is changed in your program rather than attempting to see when things get changed.

The only way (that I can think of) to guarantee this 100% is to keep a copy of the original source file along with the compiled version. Then you can do a file comparison between the original and the new one to see if they've changed, and if so, perform your compilation.

My question to you is, what would be an effective way, to track which source file belongs to which assembly, and an effective way to track whether a source file has been changed since last load or not
One way could be to:
Use FileSystemWatcher to track file changes
Keep a list (or table) to maintain source-assembly file associations.

How to execute an executable embedded as resource

Is it possible to execute an exe file that is included in the project as a resource? Can I fetch the file as a byte array and execute it in memory?
I don't want to write the file to a temporary location and execute it there. I'm searching for a solution where I can execute it in memory. (It's not a .NET assembly.)

It's quite possible - I've done it myself - but it's fiddly and more so from managed code. There's no .NET API for it, nor is there a native API for it which you can P/Invoke. So you'll have to finagle the load by hand, which will require some knowledge of the PE (Portable Executable) file format used for modules such as DLLs and EXEs - http://msdn.microsoft.com/en-us/magazine/cc301805.aspx. There'll be a lot of pointer manipulation (mandating use of unsafe {} blocks) and P/Invoke.
First load the PE file into memory (or use MapViewOfFile). A PE file is internally made up of different sections containing code, data or resources. The offsets of each section in the file don't always match intended in-memory offsets, so some minor adjustments are required.
Every PE file assumes it'll be loaded at a certain base address in virtual memory. Unless you can ensure this you'll need to walk the PE file's relocation table to adjust pointers accordingly.
Each PE file also has an import table listing which other DLLs' functions it wants to call. You'll need to walk this table and call LoadLibrary() / GetProcAddress() to fill in each import.
Next, memory protection needs to be set correctly for each section. Each section's header notes the protection it wants, so it's just a matter of calling VirtualProtect() for each section with the correct flags. At a minimum you'll need to VirtualProtect the loaded module with PAGE_EXECUTE_READWRITE or you're unlikely to be able to execute any code.
Lastly for a DLL you need to call its entry point, whose address can be found in the PE header; you can then freely call exported functions.
Since you want to run an EXE, you've got some additional headaches. You can just spin up a new thread and call the EXE's entry point from it, but many EXEs may get upset since the process is set up for you, not the EXE. It also may well kill your process when it tries to exit. You might therefore want to spawn a new process - perhaps another copy of your main EXE with special arguments to tell it it's going to run some different code - in which case you'd have to finagle the EXE into its memory space. You'd probably want to do most of the above work in the new process, not the old. You could either create a named pipe and send the data across from one EXE to the other, or allocate a named shared memory area with MapViewOfFile. Of course the EXE may still get upset since the process it's running in still isn't its own.
All in all, it's far easier just to write to a temporary file and then use Process.Start().
If you still want to do it the hard way, take a look at this example in unmanaged code: http://www.joachim-bauch.de/tutorials/loading-a-dll-from-memory/. This doesn't cover executables, just DLLs, but if the code therein doesn't scare you you'd be fine extending the process to cover executables.

A much better way is to create a temporary DLL file with the FILE_FLAG_DELETE_ON_CLOSE flag. This way the file will be deleted automatically when it is no longer used.
I don't think there is a way to load a DLL from memory (rather than from a file).
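From managed code, FILE_FLAG_DELETE_ON_CLOSE is exposed as FileOptions.DeleteOnClose. The sketch below covers only the temporary file part; the class name and the .tmp extension are made up for illustration. Whether another component can open the file while this handle is held depends on the share flags both sides use, which is why FileShare.Delete is included here.

```csharp
using System;
using System.IO;

public static class SelfDeletingTempFile
{
    // Writes 'content' to a new temp file that the system removes
    // once the returned stream is closed.
    public static FileStream Create(byte[] content)
    {
        string path = Path.Combine(
            Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".tmp");

        var stream = new FileStream(
            path,
            FileMode.CreateNew,
            FileAccess.ReadWrite,
            FileShare.Read | FileShare.Delete,
            4096,
            FileOptions.DeleteOnClose);

        stream.Write(content, 0, content.Length);
        stream.Flush();
        return stream;
    }
}
```

The caller keeps the returned stream open for as long as the file is needed and disposes it afterwards; stream.Name gives the path to hand to whatever consumes the file.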

It's not very easy to create a new process from a memory image; all of the kernel functions are geared towards loading an image from disk. See the process section of the Windows NT/2000 native API reference for more information - page 161 has an example of manually forking a process.
If it is ok to run the code from within your own process then you can create a small DLL that will take a pointer to executable data and run it.
