Plugin compiler structure? - c#

I'm making a modular program that supports dynamic compilation of source files in the plugin directory.
To speed up loading times, I would like to save compiled assemblies to a separate folder.
When my program loads and comes across a source file to compile, I would like it to check whether there is an already compiled assembly, and use it IF the source file has not been changed since then. If the source file has changed, it should recompile and overwrite the saved assembly.
My question to you is: what would be an effective way to track which source file belongs to which assembly, and an effective way to track whether a source file has been changed since the last load?

Change tracking: keep MD5/CRC hashes of the source files on record along with the last-modified date, and compare both against the current file to determine whether it has changed.
As for source->assembly, I suggest convention over configuration (for example, Foo.cs always compiles to Foo.dll in the cache folder).
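A minimal sketch of that idea, assuming a naming convention where Foo.cs maps to Foo.dll in a cache folder and the source hash is stored next to it in Foo.dll.hash. The class name PluginCache, the ".hash" sidecar file and the method names are all hypothetical, not part of any existing API.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class PluginCache
{
    // Assumed convention: "Foo.cs" compiles to "<cacheDir>/Foo.dll".
    public static string AssemblyPathFor(string sourcePath, string cacheDir)
    {
        return Path.Combine(cacheDir, Path.GetFileNameWithoutExtension(sourcePath) + ".dll");
    }

    // Hex-encoded MD5 of the file contents.
    public static string HashFile(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(md5.ComputeHash(stream)).Replace("-", "");
        }
    }

    // True when there is no cached assembly, no recorded hash,
    // or the recorded hash differs from the current source file.
    public static bool NeedsRecompile(string sourcePath, string cacheDir)
    {
        string assemblyPath = AssemblyPathFor(sourcePath, cacheDir);
        string hashPath = assemblyPath + ".hash";
        if (!File.Exists(assemblyPath) || !File.Exists(hashPath))
            return true;
        return File.ReadAllText(hashPath).Trim() != HashFile(sourcePath);
    }

    // Call after a successful compile to record which source the assembly came from.
    public static void RecordCompiled(string sourcePath, string cacheDir)
    {
        File.WriteAllText(AssemblyPathFor(sourcePath, cacheDir) + ".hash", HashFile(sourcePath));
    }
}
```

Two source files with the same name in different subfolders would collide under this convention, so a real implementation would probably fold the relative path into the assembly name. Comparing the last-modified date first can serve as a cheap pre-check before hashing.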

Why would you want to do that exactly? Anyone who is going to be using C# to write a plugin for you is going to know how to use Visual Studio and build a DLL. You'd be much better off defining a DLL interface for plugins to use instead. Then you wouldn't have to worry about loading times of any sort.
If the "plugin" is supposed to be changed from inside your program itself, you should probably just compile when the plugin is changed in your program rather than attempting to see when things get changed.

The only way (that I can think of) to guarantee this 100% is to keep a copy of the original source file along with the compiled version. Then you can do a file comparison between the original and the new one to see if they've changed, and if so, perform your compilation.

My question to you is, what would be an effective way, to track which source file belongs to which assembly, and an effective way to track whether a source file has been changed since last load or not
One way could be to:
Use FileSystemWatcher to track file changes
Keep a list (or table) to maintain source-assembly file associations.
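A rough sketch of those two pieces together, assuming the plugin sources are .cs files under one directory. SourceAssemblyMap and its members are hypothetical names; only FileSystemWatcher and its events come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class SourceAssemblyMap
{
    private readonly Dictionary<string, string> _assemblyBySource =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    private readonly HashSet<string> _dirty =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private readonly object _lock = new object();

    // Remember which assembly was built from which source file.
    public void Register(string sourcePath, string assemblyPath)
    {
        lock (_lock) { _assemblyBySource[Path.GetFullPath(sourcePath)] = assemblyPath; }
    }

    public bool TryGetAssembly(string sourcePath, out string assemblyPath)
    {
        lock (_lock) { return _assemblyBySource.TryGetValue(Path.GetFullPath(sourcePath), out assemblyPath); }
    }

    // Flag a source file as changed since its assembly was built.
    public void MarkDirty(string sourcePath)
    {
        lock (_lock) { _dirty.Add(Path.GetFullPath(sourcePath)); }
    }

    public bool IsDirty(string sourcePath)
    {
        lock (_lock) { return _dirty.Contains(Path.GetFullPath(sourcePath)); }
    }

    // Watches *.cs files under pluginDir; the caller must keep the returned watcher alive.
    public FileSystemWatcher Watch(string pluginDir)
    {
        var watcher = new FileSystemWatcher(pluginDir, "*.cs");
        watcher.IncludeSubdirectories = true;
        watcher.Changed += (s, e) => MarkDirty(e.FullPath);
        watcher.Created += (s, e) => MarkDirty(e.FullPath);
        watcher.Renamed += (s, e) => MarkDirty(e.FullPath);
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

A watcher only sees changes made while the program is running, and this map lives in memory, so the associations would still need to be persisted and checked some other way at startup.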

Access other files in VS Language Service (Visual Studio Extensibility)

I'm writing a custom language service as described in
https://msdn.microsoft.com/en-us/library/bb166533.aspx
Now I'm writing code for AuthoringScope (https://msdn.microsoft.com/en-us/library/microsoft.visualstudio.package.authoringscope.aspx). My problem is in the GetDeclarations method.
I have access to the text of the current file via the ParseRequest.Text property.
It allows me to list all methods and variables in my file, but how can I access the content of other files? I need that content to build the AST of this file, but I don't know how to get it.
Personally I find the MPF "helper" classes (like AuthoringScope) to be a bit restrictive, and implement everything manually (which, I admit, does take more time, but is a lot more flexible in the end).
In any case, it sounds like your language (like most!) has dependencies between files at the semantic parsing level. This means you'll either have to:
a) reparse a lot of text all the time, which is likely too slow in large projects
or b) maintain a global aggregate parse of a project's files, and update it dynamically when the files (or the project's properties) change
b) is obviously a lot harder, but is almost certainly the best way to do it. A general outline would be to discover all projects after a solution is opened via EnvDTE, parse them all (discover all files in each project, again via EnvDTE), and store everything in some sort of indexable data structure so that you can do fast queries against it (for semantic syntax highlighting, go to definition, etc.). Then you need to listen for changes everywhere and reparse appropriately -- you'll need to check for solution open/close (IVsSolutionEvents), projects being added/removed/renamed/unloaded/loaded (IVsSolutionEvents/IVsSolutionEvents4), files being added/removed/renamed (IVsHierarchyEvents), files being edited (IVsTextViewCreationListener + ITextBuffer.Changed), and project configurations changing (IVsUpdateSolutionEvents, IVsHierarchyEvents).
Whether you choose a) or b), you still need to be able to check if a file is opened in the editor (potentially with unsaved changes) or not. You can check if a file is already open in the Running Document Table (but don't forget to normalize the path first using Path.GetFullPath()) via the IVsRunningDocumentTable service, which will return an IntPtr to the document data, which can be coaxed into yielding an ITextBuffer for the file, which contains the text (and entire buffer history!) of the file. Of course, if it's not open you'll have to read it from disk.

Best way to track files being moved (possibly between disks), VB.NET (or C#)

I am developing a "dynamic shortcutting" application which creates special shortcut files which point to a registry entry rather than an actual file/executable. The registry entry contains the path of the desired file. I want to have a daemon running which watches the linked-to files and updates their registry entries if they are moved or renamed. Renamed I can handle using System.IO.FileSystemWatcher, but what is the best way to handle moved files?
I know this is beyond the basic functions of FSW (despite being a low-level file-system operation). The question is, what is the best way of doing it?
1. Most posts/articles I have read suggest ways that feel altogether "hacky", which basically involve looking for a delete followed by a create of a file in a new place, and connecting the two by file size, meta-data, time between the delete/create triggers, hashes, etc. This may well be the method I have to resort to, setting up FSWs on all drives. However, I am hoping there might be a better way.
2. Is it possible to either:
2.1. Listen in to the shell and "hear" move operations?
2.2. Or (even more radical) replace or add something to the shell move operation that either triggers some sort of event or performs the registry-updating task itself, precluding the need for the daemon?
I have a feeling that everyone is going to tell me that 1. is the only course, but I look forward to your suggestions. (Answers in VB.NET preferred, but I can translate from C# if necessary.)
[I'm not sure if this should be appended as an "update" to my original post or posted as a separate answer]
To sum up (all two of) the answers plus my own experimenting (to try to give a definitive answer to this question):
It seems the only high-level (.NET) solution is to use the FileSystemWatcher, which does not detect "move" out of the box (despite it being a low-level command). The FSW approach is non-trivial, comparatively resource-expensive, sloppy in places (i.e. using timers) and has its limitations and caveats. Nor does it provide a true reflection of "move": it merely infers it from symptoms that are very likely to be a move (and have the same effect on the file-system in any case) but could theoretically be produced by non-move actions. Also, it appears you have to know which files you want to watch for moves in advance of the move happening; there's no way of telling as it occurs.
On a lower-level (which would involve C++), one could hook API calls to get a faithful picture of when "moves" are called. This has the advantage that you don't have to decide to watch files in advance, and is also less resource-expensive than listening to "deletes" and "creates" and trying to compare them.
On a systems-programming level (which would involve C++ and could easily break your computer if you didn't know what you were doing) one could build a filesystem filter driver: this would take the concept of detecting moves to a truly anal level, detecting re-allocation of filesystem resources performed even without the kernel.
After some experimenting, here is the general structure of how the FileSystemWatcher approach (or at least the most obvious one to me) works, its quirks and its limitations. [no code atm, it's all pretty integrated into my application and I'm yet to optimise it, but I might add some snippets in here later].
The FileSystemWatcher method (to detect when files are moved or renamed):
1. FileSystemWatchers.
You will need to create one FSW for each highest-level directory you want to monitor (for example, one for each writable logical drive).
2. Renamed.
Straightforward renaming of the file is trivially handled.
3. Moved.
This part is very far from trivial; it basically involves comparing files in three different scenarios.
3.0.1. Deciding if a deleted/moved-from file is the same as a created/moved-to file.
For determining whether a deleted and a created file are a match, filename is useless (can be changed during a move). You could use a mixture of file size and attributes like time created, or even a hash of the entire file. In my particular solution I only needed to watch the movement of specific files "registered" before load-time, so I was able to give these files a unique fingerprint as metadata that I could then use to compare files (this works fine in real-world scenarios, but is easy to break maliciously in testing, which disappoints me as a perfectionist.)
3.0.1.1. When to read filesize/attributes/take hash?
Before I came up with the static fingerprint idea, I was testing my code with a simple filesize + creation date validation check. I quickly realised though that I had to have a note of the filesize and creation date (or hash or whatever else you want to use) of the deleted file BEFORE it signals as "deleted", because you can't check the size of a file that doesn't exist. If (like me) you know the files you want to watch in advance, then you need to read in those values before you enable the FileSystemWatchers; you also need to listen for "change" events on those files to update the values of filesize and creation date, take a new hash etc. This then begs the question: what do you do if you DON'T know what files you are interested in watching to see if they move? What if you only know you are possibly interested in knowing if they've moved when they "delete"? That, unfortunately, is beyond me (it wasn't something I had to deal with.) Unless you can come up with a solution to this problem, there is zero point in continuing with the FileSystemWatcher approach. Furthermore, I would conjecture (though could very easily be wrong) that there is no high-level solution that will meet your needs. If you do however come up with a solution (please post it below/comment on this post/edit it in here on this post), I have made the rest of this compatible.
3.1. Scenario 1: Direct moving of the file itself.
Upon the "delete" of a specific file being detected, you need to start listening for a "create" of a congruous file. Rather than listening indefinitely for the matching "create" of a file that might just have been deleted (which in reality involves inspecting every file created in the directory), you can use a timer to start and stop a "listening" flag (practical, but from a purist point of view a little arbitrary), deciding that after e.g. 1000ms with no appropriately matching create it's likely there won't be one.
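The pairing logic described in 3.0.1 and 3.1 might be sketched as follows. This is only an illustration under assumptions: MoveCorrelator is a hypothetical class, the fingerprint is whatever identity you settled on (metadata tag, size plus creation date, hash), and the event timestamps are passed in explicitly instead of using a timer to flip a "listening" flag, which keeps the logic easy to test. The FileSystemWatcher Deleted and Created handlers would call into it.

```csharp
using System;
using System.Collections.Generic;

public sealed class MoveCorrelator
{
    private readonly TimeSpan _window;
    private readonly object _lock = new object();

    // fingerprint -> (old path, time the delete was seen)
    private readonly Dictionary<string, KeyValuePair<string, DateTime>> _pendingDeletes =
        new Dictionary<string, KeyValuePair<string, DateTime>>();

    public MoveCorrelator(TimeSpan window)
    {
        _window = window;
    }

    // Call from the Deleted handler, with the fingerprint recorded BEFORE the delete.
    public void OnDeleted(string fingerprint, string oldPath, DateTime when)
    {
        lock (_lock)
        {
            _pendingDeletes[fingerprint] = new KeyValuePair<string, DateTime>(oldPath, when);
        }
    }

    // Call from the Created handler. Returns the old path when this create
    // matches a delete seen within the window; otherwise returns null.
    public string OnCreated(string fingerprint, DateTime when)
    {
        lock (_lock)
        {
            KeyValuePair<string, DateTime> pending;
            if (!_pendingDeletes.TryGetValue(fingerprint, out pending))
                return null;
            _pendingDeletes.Remove(fingerprint);
            return (when - pending.Value) <= _window ? pending.Key : null;
        }
    }
}
```

With a window of 1000 ms, a create whose fingerprint matches a delete seen 200 ms earlier is reported as a move from the old path, while the same create five seconds later is treated as unrelated. The window length is as arbitrary here as the timer was in the description above.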
3.2.0. A common misconception.
A lot of people seem to be under the impression, after glancing at the docs, that moving or renaming a folder triggers a rename for all their subfiles and subfolders rather than a delete and a create. In actual fact what the docs say is:
If you cut and paste a folder with files into a folder being watched, the FileSystemWatcher object reports only the folder as new, but not its contents because they are essentially only renamed.
(i.e. only the top folder throws rename or create/delete and the subfiles/subfolders throw NOTHING). This means that if you want to know when and where a certain file is moved, you have to listen out for each and every one of its ancestor folders as well.
3.2.1. Scenario 2: Renaming of a containing folder.
In my solution, because I knew all the files I was watching, whenever one of my FileSystemWatchers reported a rename of a folder rather than a file (the portion of the string after the last "/" will contain no "."), I checked each of my watched files to see if its path was in that directory and, if so, changed the beginning of the filepath to the path of the new directory. Et voila! I knew where my files had been moved to. If you do not know in advance what files you are looking for, then you will have to recursively search through everything in every folder that throws a "rename".
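The path rewrite for Scenario 2 could look something like this. It is a sketch only: FolderRename.Rebase is a hypothetical helper, and it assumes Windows-style paths with case-insensitive comparison. The old and new directory would come from the OldFullPath and FullPath properties of the Renamed event.

```csharp
using System;
using System.Collections.Generic;

public static class FolderRename
{
    // Returns the watched paths with any path under oldDir moved under newDir.
    public static List<string> Rebase(IEnumerable<string> watchedFiles, string oldDir, string newDir)
    {
        // A trailing separator stops "C:\data" from matching "C:\database\...".
        string oldPrefix = oldDir.TrimEnd('\\', '/') + "\\";
        string newPrefix = newDir.TrimEnd('\\', '/') + "\\";

        var result = new List<string>();
        foreach (string file in watchedFiles)
        {
            result.Add(file.StartsWith(oldPrefix, StringComparison.OrdinalIgnoreCase)
                ? newPrefix + file.Substring(oldPrefix.Length)
                : file);
        }
        return result;
    }
}
```

Files in nested subfolders of the renamed directory are handled by the same prefix swap, since only the leading part of the path changes.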
3.2.2. Scenario 3: Moving of a containing folder.
This one feels like a slap in the face: in order to build your move-detection routine, you have to be able to detect moves. Here folders will throw a "delete" followed by a "create". In my case the solution just recycles the techniques in 3.1 and 3.2.1: when a folder "delete" is detected, I check to see if it contains any of my watched files. If it does, I set a "listen" flag (and a timer to snuff it) and check the subdirectory path of my file in the old folder against every new folder "create" that is detected to see if it points to a file with the desired fingerprint. If it does, I now have the old and new paths of the file and have detected the move. If you don't know what files to watch for, you may have to validate folder moves by comparing size on disk and number of subfiles/subfolders between "deleted" folder and "created" folders to confirm a folder has moved first, then search the folder recursively for the files you're interested in.
3.3. FURTHER COMPLICATION: Cross-drive moving of large files.
This is a problem I fortunately didn't run into (because I was only comparing fingerprint metadata, and didn't need access to files); however moving large files between drives (which transfer in stages, triggering a create event then a series of change events) can cause real headaches.
3.3.1. Headache 1: The "create" fires when the destination file is incomplete.
This means comparing its size to a "deleted" file will produce a false negative. You can't even take a hash of the first part of the file to indicate to your program that this "might" be the deleted file, because the move operation will have the file access permissions locked down. You just have to try and tell if the created file might still be moving and wait for it to finish.
3.3.2. Headache 2: No sure way to "tell" that the created file is still being moved.
Some have suggested checking the file access permissions on the created file, but they might be indistinguishable from those on a file created and still in use by any random application. Others have suggested setting short time-limited listen flags for "changes" on the file, but again this is indistinguishable from a file being modified by an application. In fact if the file happened to be a log file constantly and rapidly being updated by some process, then waiting for "changes" to the file to timeout might never end.
3.3.3. Headache 3: (UNTESTED) possibly these sort of moves "delete" the file after "creating" the destination file*.
It makes sense that this would be the case, though I haven't tested it. [if anyone does know, feel free to edit (or delete) this section appropriately]
3.4. A philosophical quandary: are two identical files the same?
This is a very pedantic and arbitrary thought-experiment, but say you have two drives, each with an identical copy of File.txt. You run a batch file that deletes the copy on the first drive then immediately makes a copy of the file on the second drive into the same folder on the second drive and names it Copy of File.txt. Unless you are using fingerprints, your code will identify a delete and then a create of an identical file and be unable to distinguish what happened from a move (with renaming) of the file from the first drive to the second. The final state of the filesystem is identical in both cases so it shouldn't cause your application to behave unexpectedly, but art thou really content to call that a "move" based purely on isomorphism? (especially when you know the kernel sees it differently)?
Using the high-level, unrestricted API provided by C#: no, you can't. Use FileSystemWatcher. On the same drive, moving a file is not a "delete and create"; it's a "rename".
If you can/want to go lower-level, you can hook MoveItem and MoveItems of the shell's IFileOperation interface, and MoveFile from Kernel32.dll. That will work with most apps, but it requires elevated security rights for your application, which is mostly unacceptable in a corporate environment.
The task has two flaws that make it hard to implement: (a) move operation across the disks is actually a sequence of read/write operations followed by deletion rather than move. And during those read/write operations there can be some transformation of data in place ; and (b) moving can be performed not by just a shell.
What you can do is employ a filesystem filter driver to intercept file operations right when they take place. Then you need to detect the sequence of read and write operations performed by the same process over your file. I.e. if your code detects that the file is read sequentially (NOTE: some copying tools can read the file in multiple threads in parallel) and similar blocks of data are then written to another file, AND after reading everything the source file is deleted, AND the complete file contents have been written to the other place, then you can guess that you have come across a file move operation.
Bump & update: This may well be against the rules of StackOverflow, but I would like to point out to the many people landing on this page (and the myriad similar questions on SO) that I have started a feature request on Microsoft UserVoice to add MOVE detection to FileSystemWatcher. The best solution in the long term, rather than trying to work around the problem, might be to petition Microsoft to fix it. If you have come here because you too need a solution to this problem, please consider clicking here and voting for this feature.

How can I programmatically add content to new/existing executable using c# and possibly make the solution work on a mac?

I've seen a number of variations on this question and I'm not sure whether this question is a complete duplicate.
I would like to be able to at run-time run an existing executable (SOURCE exe) and have it:
1) take an existing TARGET exe at run time and add content of any size and type to the TARGET exe (pdf, image, word, excel file type, etc)
2) be able to run the modified TARGET exe so that when the TARGET exe is run, it will find the embedded content inside of itself and copy the content to the hard drive and then run the program associated with the content (for example, run excel on a copied xls file)
I've seen examples where you embed resources at compile time in visual studio but I want to do this at run-time in code (c#, java, whatever works). Either the host TARGET exe needs to already exist and content should be added to it OR the exe will need to be generated from scratch at run-time and content again added to it.
I also would prefer not to use any of the cmd-line tools that visual studio or any other tool would run behind the scenes (if possible) to create an exe to minimize the enduser needing to download any more libraries/sdks than necessary.
This product is in line with what I want to do
http://www.boomeranglistbuilder.com/instructions/usingsoftware.php
(I want to improve upon it) :)
Lastly it'd be great if the solution could be cross platform compatible (doubt it though)
Could this be done in java?
I've seen the Windows UpdateResource function mentioned in my searches, but I'm not sure whether it would completely fit my situation. Can anyone comment?
I hope my question is clear. Please let me know.
Any help would be graciously appreciated.
Thank you,
Carlos
I think it's true for most binary file formats (including executables) that appending data to a well-formed file will not affect the way the file is typically interpreted by most programs. You could, maybe, take advantage of this.
To embed, you'll need to take your (existing) target executable and simply append some binary data to it. That data will have two parts:
A magic word (to denote the presence of an appended resource)
The resource itself.
So, this:
[target executable data]
Becomes this:
[target executable data]
[magic word]
[resource]
To read the resource from the target executable, simply have that executable open itself, search for the magic word and, if it's present, start reading the resource appended after it.
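A small sketch of the append-and-search scheme just described. The class name, method names and marker value are all made up for illustration, and the whole file is read into memory for simplicity, which would be wasteful for large payloads.

```csharp
using System;
using System.IO;
using System.Text;

public static class AppendedResource
{
    // Hypothetical marker; it must not occur anywhere earlier in the target file.
    private static readonly byte[] Magic = Encoding.ASCII.GetBytes("##APPENDED-RESOURCE-v1##");

    // Appends [magic word][resource] to the end of the target file.
    public static void Append(string targetPath, byte[] resource)
    {
        using (var stream = new FileStream(targetPath, FileMode.Append, FileAccess.Write))
        {
            stream.Write(Magic, 0, Magic.Length);
            stream.Write(resource, 0, resource.Length);
        }
    }

    // Returns everything after the first magic word, or null if none is present.
    public static byte[] Read(string targetPath)
    {
        byte[] data = File.ReadAllBytes(targetPath);
        int start = IndexOf(data, Magic);
        if (start < 0)
            return null;
        int offset = start + Magic.Length;
        byte[] resource = new byte[data.Length - offset];
        Buffer.BlockCopy(data, offset, resource, 0, resource.Length);
        return resource;
    }

    // Naive byte-sequence search; fine for a sketch.
    private static int IndexOf(byte[] haystack, byte[] needle)
    {
        for (int i = 0; i <= haystack.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && haystack[i + j] == needle[j]) j++;
            if (j == needle.Length) return i;
        }
        return -1;
    }
}
```

One thing to watch for: if the executable that reads itself also contains the marker bytes in its own data, the search would stop too early. Storing a length footer at the very end of the file, and reading backwards from there, is one possible way around that.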
This is what WinRAR does (or at least did four years ago, when I last checked) to recognize the archives inside of its self-extracting files.

How to include source code in dll?

Short version:
I want my program to be able to (read-only-)access its own source code during runtime. Is there a way to automatically package the source code into the dll during compilation?
Long version:
The background is that when an exception occurs, I want to automatically create a file containing some details of what happened. This file should, among other things, include the source code of the function that caused the problem. I want to send this file to other people by email, and the receiver will most likely not have (or not want to install) Visual Studio, so anything using symbol servers and the likes is out of question. It needs to be a plain text file.
Ideally I would somewhere find the related source code files and just copy out the relevant lines. You can safely assume that as long as I have a folder containing the entire source code, I will be able to locate the file and lines I want.
The best idea I could come up with so far -- and I have not looked into it in much detail because it seems messy to no end -- is to modify the MSBuild files to create a .zip of the source during compilation, and require .dll and .zip to reside in the same folder.
Most of the similar-sounding questions on stackoverflow that I found seem to deal with decompiling .dll files, which is not what I want to do. I have the source code, and I want to ship it together with the .dll in a convenient way.
Update: The really long version
Seems some people are seriously questioning why I would want to do that, so here's the answer: The main purpose of my software is testing some other software. That other software has a GUI. For an easy example, let's say the other software were the standard Windows calculator, then my testcase might look something like this:
calculator.Open();
calculator.EnterValue(13);
calculator.PressButtonPlus();
calculator.EnterValue(38);
calculator.PressButtonEnter();
int value = calculator.GetDisplayedValue();
Assert.That(value == 51);
calculator.Close();
These tests are intentionally written in a very human-readable way.
What I want to do when a problem occurs is to give the developer of the calculator a detailed description of how to reproduce the problem, in a way that he could reproduce by hand, without my software. (In this example, he would open the calculator, enter 13, press plus, and so on.)
Maybe another option would be to have each function calculator.Something() write out an information line to a log, but that would a) be a lot more work, b) only include the test run up to the point where it aborted, and c) bear some risk that writing the line is forgotten in one function, thereby giving an incorrect representation of what was done. But I'm open to other solutions than copying source code.
Take a look at this question: Do __LINE__ __FILE__ equivalents exist in C#?
C++ offers macros (__LINE__, __FILE__, and so on) that replace with the representing information during compile time. This means if you write something like this:
throw new CException(__FILE__);
it will compile to something like this:
throw new CException("test.cpp");
resulting in a hardcoded value. The C# compiler does not offer those macros, so traditionally you are forced to use reflection (via the stack trace) to get the information about where the exception has been thrown. The way you can do it is described in the question linked above. From C# 5 onwards, the caller info attributes ([CallerFilePath], [CallerLineNumber], [CallerMemberName]) provide a compile-time alternative.
If you are not able to supply .pdb symbols, then the default behaviour of Exception.ToString() (or StackTrace.ToString()) will not give you the line number, but the MSIL offsets of the operation that failed. As far as I can remember, you can use the Stack Trace Explorer of ReSharper to navigate to the corresponding code (not 100% sure about that, but there was also a question here on stackoverflow that mentioned this fact).
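For reference, the closest compile-time equivalent to __FILE__ and __LINE__ in C# 5 and later is the set of caller info attributes in System.Runtime.CompilerServices. A minimal sketch follows; the helper class SourceLocation and the method Here are hypothetical names, only the attributes are real.

```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;

public static class SourceLocation
{
    // When the optional arguments are omitted, the compiler fills in the
    // calling file, line number and member name at compile time.
    public static string Here(
        [CallerFilePath] string file = "",
        [CallerLineNumber] int line = 0,
        [CallerMemberName] string member = "")
    {
        return Path.GetFileName(file) + ":" + line + " in " + member;
    }
}
```

A call such as `throw new InvalidOperationException("Failed at " + SourceLocation.Here());` then carries the file name and line of the throw site without needing .pdb files at runtime, because the values are baked in as constants by the compiler.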
You can include copies of the source files as resources.
In the project folder, create a subfolder named Resources. Copy the source files there.
Create in the project a resource file, and then include the source copies you made into it.
Setup a pre-build event to copy the actual source files to Resources folder, so you always have updated copies. In the example I created, this worked well:
copy "$(ProjectDir)*.cs" "$(ProjectDir)Resources"
In your code, you can now get the content of the files like this (I suppose the name of the resources file is Resource1.resx):
// Get the source of the Program.cs file.
string contents = Resource1.Program;
Yes, I also recommend packing up the sources in a .zip or whatever during MSBuild, and packaging that .zip with your application/dll. In runtime, when an exception occurs, you get the file and method name like Aschratt describes, extract the file from the .zip and find the method in it.
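Pulling the relevant lines back out of such a .zip at runtime might look like the sketch below. It assumes .NET 4.5 or later (for System.IO.Compression.ZipArchive) and that you already know the entry name and line range you want; ZippedSource and ReadLines are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class ZippedSource
{
    // Returns lines firstLine..lastLine (1-based, inclusive) of the named entry,
    // or null when the archive has no such entry.
    public static List<string> ReadLines(Stream zipStream, string entryName, int firstLine, int lastLine)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, true))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName);
            if (entry == null)
                return null;

            var lines = new List<string>();
            using (var reader = new StreamReader(entry.Open()))
            {
                string line;
                int number = 0;
                // Stop as soon as the requested range has been passed.
                while ((line = reader.ReadLine()) != null && ++number <= lastLine)
                {
                    if (number >= firstLine)
                        lines.Add(line);
                }
            }
            return lines;
        }
    }
}
```

In practice the stream would be a FileStream over the .zip sitting next to the .dll. Locating the method by name rather than by line number would need some extra text searching on top of this.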

How do I remove unnecessary resources from my project?

I am working with a very big project (a solution that contains 16 projects and each project contains about 100 files).
It is written in C++/C# with Visual Studio 2005.
One of the projects has around 2000 resources out of which only 400 are actually used.
How do I remove those unused resources?
I tried to accomplish the task by searching for used ones.
It worked and I was able to build the solution, but it broke at runtime.
I guess because enums are used. (IMPORTANT)
How can I make sure that it doesn't break at runtime?
EDIT:
I think one method could be to generate the resource (that is not found) on the fly at runtime (somehow).
But I have no idea about ... anything.
NOTE: It's okay if a few unnecessary resources are still there.
What I would do is write a custom tool to search your source code.
If you remove a resource ID from a header file (i.e. possibly called resource.h) and then recompile and get no warnings: then that's a good thing.
Here is how I would go about writing the app. Take as input the resource file (resource.h) you want to scrutinize. Open the header file (*.h) and parse all the resource constants (or at least the ones you are interested in). Store those in a hash table for quick lookup later.
For each code file in your project, search the text for instances of each of your resource ID's. When a resource ID is used, increment the value in the hash table otherwise leave it at zero.
At the end, dump all the resource IDs that are zero to a log file or something. Then test that you can indeed remove those specified resource IDs safely. Once you have done that, write another tool that removes the specified resource IDs given the results of your log file.
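The scanning steps above might be sketched like this. It assumes resource.h uses plain "#define NAME value" lines and works on text already read from disk; ResourceUsage and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ResourceUsage
{
    // Collects identifiers from lines such as "#define IDD_ABOUTBOX 100",
    // each starting with a usage count of zero.
    public static Dictionary<string, int> ParseIds(string headerText)
    {
        var counts = new Dictionary<string, int>();
        foreach (Match m in Regex.Matches(headerText, @"^\s*#define\s+(\w+)\s+\d+", RegexOptions.Multiline))
            counts[m.Groups[1].Value] = 0;
        return counts;
    }

    // Adds the number of whole-word occurrences of each ID in one code file.
    public static void CountUses(Dictionary<string, int> counts, string sourceText)
    {
        // Copy the keys first so the dictionary is not modified while enumerating it.
        foreach (string id in new List<string>(counts.Keys))
            counts[id] += Regex.Matches(sourceText, @"\b" + Regex.Escape(id) + @"\b").Count;
    }

    // IDs that were never seen in any scanned code file.
    public static List<string> Unused(Dictionary<string, int> counts)
    {
        var unused = new List<string>();
        foreach (KeyValuePair<string, int> pair in counts)
            if (pair.Value == 0) unused.Add(pair.Key);
        unused.Sort(StringComparer.Ordinal);
        return unused;
    }
}
```

A text search like this only finds IDs that appear literally in the code. IDs that are computed at runtime or reached through ranges will still show up as unused, so the resulting list is a set of candidates to test, not a list that is safe to delete blindly.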
You could write such a tool in perl and it would execute in about 0.3 seconds: But would take days to debug. :)
Or you could write this in .NET, and it would execute a little slower, but would take you an hour to debug. :)
You can use a third-party plug-in for Visual Studio such as ReSharper. This add-in will analyze your C# code and point out unused resources, but it only works with C#.
For C++ projects, check out The ResOrg from Riverblade.
"The Resource ID Organiser (ResOrg for short) is an Add-in for Visual C++ designed to help overcome one of the most annoying (and unnecessary) chores of developing/maintaining Windows applications - maintaining resource symbol ID values"
http://www.riverblade.co.uk/products/resorg/index.html
I've never had one that bad. My method in compiled programs is to use a REXX script which emulates GREP looking for references to source that I suspect is not being used, remove them from the program and see what breaks. I use the REXX script because I can pre-filter the list of files I want to search. Which allows me to do a search across folders and computers.
If your code contains dynamic loading of resources (e.g. via strings) at runtime, then there is no way to automatically determine which resources can be safely removed from the source. A dynamic loading statement could load any resource.
Your best bet is to start with your trimmed down version of the app, run it, and identify which resources are missing when you test it. Then add them back in and retest.
You may want to take a look at the tool Reflector (free), not to be confused with ReSharper (expensive). It can show you which DLLs are dependent on another. Then if you want you may be able to remove the DLL that is not being referenced by anything else. Watch out if you are using dependency injection or reflection which then could break your code without your knowledge.
Reflector:
http://www.red-gate.com/products/reflector/.
This add-in draws assembly dependency graphs and IL graphs:
http://reflectoraddins.codeplex.com/Wiki/View.aspx?title=Graph.
In the "Resources View" of the Solution Explorer, right-click and select "Resource Symbols". Now you get a list where you can see which resource constants are used in the .RC file. This might help you get part of the way towards cleaning up your Resource.h (although it does not show you which resources are not used in the actual C++ code).
Maybe Find Unused Resources in a .NET Solution helps here? Basically, you'll have to check which resources are used (e.g. by comprehensive code coverage checks) and remove the unused ones.
And you probably should not be afraid of using the trial-and-error approach to cleaning up.
In the Solution Explorer, right-click on a Reference and click the menu item Find Dependent Code.
If it can't find any dependent code then you can remove this reference from the project. (The Remove operation is also under the right-click menu.)
EDIT: For a large project, the Find Dependent Code operation will take a long time. So since you have 2000 resources and most likely value your time this probably is not a viable option....
For C++ resources, did you try right-clicking the project in "Resource View" and then deleting the ones which do not have a tick mark next to them? It is unsafe to delete unused dialog resources since they are referenced as "enum"s in code (like the following).
enum { IDD = IDD_ABOUTBOX };
However, for all the others it should be safe.
