How can I patch .NET assemblies?

If I compile a C# project twice, I will get two assemblies. These assemblies aren't exactly the same (according to a binary diff). I can think of reasons why this is so, but the fact remains that the source for the two assemblies is identical.
I am concerned with creating a patch between these assemblies, and applying the patch on a customer machine.
Does anyone know of a library (preferably .NET) or tool with which I can create and apply patches?
Ideally it should also handle small changes, like changing dependencies on a project level or tweaking a few lines in the source. It does not have to be able to cope with larger changes, because I am happy to replace assemblies in full in this case.
Update:
I think a bit more background might help clarify what I'm getting at. I've got a continuous integration server building my application. I change the file and assembly version to reflect the build number and version I'm building. I could have done it differently, but this is the option I like.
This causes assembly references to change when I build my application, but I'm happy with that. I'm now concerned with distributing an update to a previously released version. I build updates using InstallShield. The assemblies available to InstallShield are reduced to small patches. These do not increase the size of the update by a lot.
My application has a few data files, which are basically encrypted archives containing assemblies. InstallShield does not have access to these assemblies or understand my archive. It does not know how to decrypt and extract it. I've written my own patch routine which finds changed files in the previous archive and replaces them with the new updated version. Basically a search and replace strategy.
The one archive contains almost a hundred (and growing) of these assemblies and some of these assemblies contain large data files as resources. These assemblies are dependent on assemblies which are part of the application. They are also compiled during the continuous integration build. Not compiling them during the build will leave room for dependency issues and I'm not willing to try and manage that. Every time I create a patch, all of these assemblies are included. I'm now looking at options to decrease the patch size by generating patches for these assemblies.
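The "find changed files" step of such a patch routine could be sketched as a content-hash comparison between the extracted old and new archive contents. This is only an illustration under assumptions: `ChangedFileFinder`, `FindChanged`, and the two-directory layout are hypothetical names, not the routine described above. Note that it reports every rebuilt assembly as changed whenever the build output differs byte for byte, which is exactly the situation the question describes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: not part of InstallShield or any library.
public static class ChangedFileFinder
{
    // Returns the SHA-256 hash of a file as a hex string.
    private static string HashFile(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }

    // Lists file names (top level only) that are new in newDir or whose
    // contents differ from the file of the same name in oldDir.
    public static List<string> FindChanged(string oldDir, string newDir)
    {
        var changed = new List<string>();
        foreach (string newPath in Directory.GetFiles(newDir))
        {
            string name = Path.GetFileName(newPath);
            string oldPath = Path.Combine(oldDir, name);
            if (!File.Exists(oldPath) || HashFile(oldPath) != HashFile(newPath))
            {
                changed.Add(name);
            }
        }
        changed.Sort(StringComparer.Ordinal);
        return changed;
    }
}
```

Only the files returned by `FindChanged` would need to go into the update; a binary delta tool could then be applied to each of those files individually.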

I'd recommend replacing your assemblies. Patching has always been a little problematic - at least in comparison to replacement. Assembly files tend to be fairly small, so replacing them is usually not a very large task.
When assemblies change even though the source is identical, it typically comes down to one of a few things. Your project may have changes, but more likely, one of the assembly's referenced assemblies has changed. In that case, you'll want to be distributing the full set of assemblies, including the referenced assembly, or you'll get versioning issues and breakage on the customer's machines.
Creating a new installable is usually much simpler and safer than trying to patch a machine in place.

This is not a .NET tool, but it should be able to generate a patch file for you:
http://www.tibed.net/vpatch/
I used to use it for games, maps, etc., but it should work with any two files.

Related

C# & VS2015: How to save time when compiling?

So, I have a BIG project with lots of stuff in it, but when for example I fix the code, VS2015 compiles the whole project again which takes lots and lots of time. Can I somehow only compile the file I edited?
EDIT: I have one solution, in that I have one project, in that project I have lots of files.

The build process is smart: it will skip projects that haven't changed and whose dependencies haven't changed either. If you change something inside a library that is used by more or less the entire solution, then there just is no alternative but to rebuild all that was (potentially) touched.
You could try tuning the Dependencies yourself, right-click on the Solution and select 'Project dependencies...'. But you can't remove Dependencies that are needed or inferred.

Get an SSD as build drive. A fast one.
All the other tips given here are a given - build is smart, so it will only recompile what needs to be recompiled. This is hardly a help, though, if you have a base library that triggers dozens of projects to update.
Compilation is I/O-bound, so an SSD helps.
Otherwise it may be time to break up that large solution and generate internal NuGet packages of the base libraries. This can decouple recompilation of the base libraries from the actual applications. This is particularly useful if you do maintenance on the base libraries.

Plugin Situation: What to do with dependent libraries?

I have a MEF-based application which uses adapters to process files. It uses configuration files to determine which directories to watch and which adapter to use to process each type of file. Plugins take the form of a .dll that implements a common interface.
Each .dll requires its own set of dependent libraries. For instance, plugin1.dll might need to use apilibrary.dll and xmllibrary.dll. It is also possible that at a later date I might want to add plugin2.dll, and plugin2.dll might use xmllibrary.dll as well. These dependent libraries are updated regularly, so I can't count on plugin2.dll using the exact same version of xmllibrary.dll used in plugin1.dll.
I'd like to compile each plugin to one .dll file that invisibly includes within itself all of its dependent libraries, which seems like one way to solve this problem. Alternatively, I'd like to figure out how each .dll file can look for its dependent libraries in a subfolder, which I believe would also reduce the possibility of versioning conflicts. Or maybe there's a dead simple solution to this problem that I haven't even considered (which is always very, very likely).
Any thoughts?

You should probably try to get this to work with standard .NET loading rules. However, if you do need to control exactly how assemblies are loaded and which versions are loaded, this blog post shows how: Using Loading contexts effectively
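One way to take that kind of control, offered here as a rough sketch rather than as the approach from the linked post, is to handle the `AppDomain.AssemblyResolve` event and probe a subfolder. The class name `SubfolderResolver` and the folder layout are assumptions. The handler only runs after normal probing has failed, it ignores the requested version, and it does not by itself let two versions of the same library coexist.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper; the folder layout and names are assumptions.
public static class SubfolderResolver
{
    // Maps a full assembly name such as "xmllibrary, Version=1.0.0.0, ..."
    // to a candidate file in the given folder, or null if no such file exists.
    public static string FindCandidate(string folder, string fullAssemblyName)
    {
        string simpleName = new AssemblyName(fullAssemblyName).Name;
        string candidate = Path.Combine(folder, simpleName + ".dll");
        return File.Exists(candidate) ? candidate : null;
    }

    // Hooks AssemblyResolve, which fires only after normal probing has failed.
    public static void Register(string folder)
    {
        AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
        {
            string path = FindCandidate(folder, args.Name);
            return path == null ? null : Assembly.LoadFrom(path);
        };
    }
}
```

A host would call `SubfolderResolver.Register(...)` once at startup, before any plugin is loaded, passing the folder that holds the plugins' dependencies.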

I guess you need to weigh up deployability vs. maintenance. The simple solution is to use a tool called ILMerge. ILMerge takes your project output and can take other assemblies and merge them together. This enables you to wrap up all of the assemblies that your plugin is dependent on, and merge them into a single assembly. Optionally you can do things like re-signing with your public key, etc. Here is a good read: Leveraging ILMerge to simplify deployment and your users experience by Daniel Cazzulino.
But while that is good, what happens if a new version of the referenced assembly is distributed that corrects bugs in the version you have embedded? By the rules of Fusion's assembly loader, when it loads the types from your referenced assembly, it will see that they have already been loaded, so there is no reason for it to load the updated version. This would then mean you need to recompile your plugin and merge the newer referenced assembly again.
My question would be, is it really that important to ensure a specific version is used? If a newer version provides an updated implementation (that doesn't break backwards compatibility) then surely this should benefit all plugins that need to reference it?
As for how assemblies are loaded in reference to each other, have a read of Understanding .Net Assemblies and References, which is an invaluable piece of information.

MEF uses standard .NET assembly loading, and everything's loaded in a single AppDomain. You have very little control over how dependencies are loaded - as they just get loaded automatically by the CLR when the assembly is injected via MEF. Normal CLR assembly loading rules apply when using MEF, so dependencies will be loaded as if they were a dependency of your application - no matter where they're located or referenced.
For the most part, if the plugins and their dependencies are properly written, you most likely will not need to worry about this. As long as the versioning in the dependencies is correct, it will likely just work.

Svn externals and c# assemblies - incompatible?

Something that should be so simple in .NET seems to be oh-so-hard.
I have a project called MyExtenders, containing a few simple extenders to basic types.
Many projects use MyExtenders - and so in traditional svn checkout and build approach I add MyExtenders as an svn:external with the revision locked to whichever it was last built and tested at.
Now if I have two projects both requiring MyExtenders added to the same solution it all falls in a heap. I cannot add both MyExtenders to the solution - so I have to use just one - which in the case of different revisions means re-testing the older project with it.
A diagram possibly best explains the dependencies:
SolutionA
->ProjectA
->->MyExtenders r350 (svn:externed by ProjectA)
->ProjectB
->->MyCryptography r800 (svn:externed by ProjectB)
->->->MyExtenders r800 (svn:externed by MyCryptography)
Delphi/C work with the above just fine - all references are from their own project folder.
VS insists on losing the directory structure and flattening the above to:
SolutionA
->ProjectA (refers MyExtenders)
->ProjectB (refers MyCryptography)
->MyCryptography r800 (refers MyExtenders)
->MyExtenders r350 || r800 - my choice
And me being forced to modify one of the projects to refer to a different MyExtenders, and a different revision at that.
Clearly I'm doing it all wrong.. but how do you do it right?

There really is no way around this: if you have two different projects depending on different versions of the same assembly, you are bound to have conflicts regardless of how you manage the inter-project dependencies. To see why this is, imagine that all your source conflicts could be solved somehow - now what will you do upon deployment? Which assembly version of the dependency gets loaded? Whichever it is, it will likely break the depending assembly which needs the other version.
If you have a design which requires a shared library among various subsystems, and those subsystems live in the same process (ok, technically, the same AppDomain), you need to have the same assembly version for both.
This problem goes away if you can get the depending assemblies separated by a boundary, such as a service interface or remoting channel. Then you can version the dependencies independently. Visual Studio will not like having two projects in one solution with the same name, however, so the only way around this is to copy one of the project files, rename it, and load it into the solution.

How do I work with shared assemblies and projects?

To preface, I've been working with C# for a few months, but I'm completely unfamiliar with concepts like deployment and assemblies, etc. My questions are many and varied, although I'm furiously Googling and reading about them to no avail (I currently have Pro C# 2008 and the .NET 3.5 Platform in front of me).
We have this process and it's composed of three components: an engine, a filter, and logic for the process. We love this process so much we want it reused in other projects. So now I'm starting to explore the space beyond one solution, one project.
Does this sound correct? One huge Solution:
Process A, exe
Process B, exe
Process C, exe
Filter, dll
Engine, dll
The engine is shared code for all of the processes, so I'm assuming that can be a shared assembly? If a shared assembly is in the same solution as a project that consumes it, how does it get consumed if it's supposed to be in the GAC? I've read something about a post build event. Does that mean the engine.dll has to be redeployed on every build?
Also, the principal reason we separated the filter from the process (only one process uses it) is so that we can deploy the filter independently from the process so that the process executable doesn't need to be updated. Regardless of if that's best practice, let's just roll with it. Is this possible? I've read that assemblies link to specific versions of other assemblies, so if I update the DLL only, it's actually considered tampering. How can I update the DLL without changing the EXE? Is that what a publisher policy is for?
By the way, is any of this stuff Google-able or Amazon-able? What should I look for? I see lots of books about C# and .NET, but none about deployment or building or testing or things not related to the language itself.

I agree with Aequitarum's analysis. Just a couple additional points:
The engine is shared code for all of the processes, so I'm assuming that can be a shared assembly?
That seems reasonable.
If a shared assembly is in the same solution as a project that consumes it, how does it get consumed if it's supposed to be in the GAC?
Magic.
OK, it's not magic. Let's suppose that in your solution your process project has a reference to the engine project. When you build the solution, you'll produce a process assembly that has a reference to the engine assembly. Visual Studio then copies the various files to the right directories. When you execute the process assembly, the runtime loader knows to look in the current directory for the engine assembly. If it cannot find it there, it looks in the global assembly cache. (This is a highly simplified view of loading policy; the real policy is considerably more complex than that.)
Stuff in the GAC should be truly global code; code that you reasonably expect large numbers of disparate projects to use.
Does that mean the engine.dll has to be redeployed on every build?
I'm not sure what you mean by "redeployed". Like I said, if you have a project-to-project reference, the build system will automatically copy the files around to the right places.
the principal reason we separated the filter from the process (only one process uses it) is so that we can deploy the filter independently from the process so that the process executable doesn't need to be updated
I question whether that's actually valuable. Scenario one: no filter assembly, all filter code is in project.exe. You wish to update the filter code; you update project.exe. Scenario two: filter.dll, project.exe. You wish to update the filter code; you update filter.dll. How is scenario two cheaper or easier than scenario one? In both scenarios you're updating a file; why does it matter what the name of the file is?
However, perhaps it really is cheaper and easier for your particular scenario. The key thing to understand about assemblies is assemblies are the smallest unit of independently versionable and redistributable code. If you have two things and it makes sense to version and ship them independently of each other, then they should be in different assemblies; if it does not make sense to do that, then they should be in the same assembly.
I've read that assemblies link to specific versions of other assemblies, so if I update the DLL only, it's actually considered tampering. How can I update the DLL without changing the EXE? Is that what a publisher policy is for?
An assembly may be given a "strong name". When you name your assembly Foo.DLL, and you write Bar.EXE to say "Bar.EXE depends on Foo.DLL", then the runtime will load anything that happens to be named Foo.DLL; file names are not strong. If an evil hacker gets their own version of Foo.DLL onto the client machine, the loader will load it. A strong name lets Bar.EXE say "Bar.exe version 1.2 written by Bar Corporation depends on Foo.DLL version 1.4 written by Foo Corporation", and all the verifications are done against the cryptographically strong keys associated with Foo Corp and Bar Corp.
So yes, an assembly may be configured to bind only against a specific version from a specific company, to prevent tampering. What you can do to update an assembly to use a newer version is create a little XML file that tells the loader "you know how I said I wanted Foo.DLL v1.4? Well, actually if 1.5 is available, it's OK to use that too."
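To make the parts of a strong name concrete, here is a small sketch that picks apart an assembly display name with `System.Reflection.AssemblyName`. The display name, version, and public key token passed in are made-up placeholders, and `StrongNameDemo` is a hypothetical helper name.

```csharp
using System;
using System.Reflection;

public static class StrongNameDemo
{
    // Formats the simple name, version, and public key token of a display name.
    public static string Describe(string displayName)
    {
        var name = new AssemblyName(displayName);
        byte[] token = name.GetPublicKeyToken();
        string tokenHex = (token == null || token.Length == 0)
            ? "none"
            : BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();
        return name.Name + " | " + name.Version + " | " + tokenHex;
    }
}
```

The simple name alone is what a weakly named reference records; the version and public key token are the extra parts that a strong-named reference is checked against.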
What should I look for? I see lots of books about C# and .NET, but none about deployment or building or testing or things not related to the language itself.
Deployment is frequently neglected in books, I agree.
I would start by searching for "ClickOnce" if you're interested in deployment of managed Windows applications.

Projects can reference assemblies or projects.
When you reference another assembly/project, you are allowed to use all the public classes/enums/structs etc in the referenced assembly.
You do not need to have all of them in one solution. You can have three solutions, one for each Process, and all three solutions can load Engine and Filter.
Also, you could have Process B and Process C reference the compiled assemblies (the .dll's) of the Engine and Filter and have similar effect.
As long as you don't set the property in the reference to an assembly to require a specific version, you can freely update DLLs without much concern, providing the only code changes were to the DLL.
Also, the principal reason we separated the filter from the process (only one process uses it) is so that we can deploy the filter independently from the process so that the process executable doesn't need to be updated. Regardless of if that's best practice, let's just roll with it. Is this possible?
I actually prefer this method of updating. There is less overhead in updating only the files that changed rather than everything every time.
As for using the GAC, that's a whole other level of complexity I won't get into.
Tamper proofing your assemblies can be done by signing them, which is required to use the GAC in the first place, but you should still be fine so long as a specific version is not required.
My recommendation is to read a book about the .NET framework. This will really help you understand the CLR and what you're doing.
Applied Microsoft .NET Framework Programming was a book I really enjoyed reading.

You mention the engine is shared code, which is why you put it in a separate project under your solution. There's nothing wrong with doing it this way, and it's not necessary to add this DLL to the GAC. During your development phase, you can just add a reference to your engine project, and you'll be able to call the code from that assembly. When you want to deploy this application, you can either deploy the engine DLL with it, or you can add the engine DLL to the GAC (which is another ball of wax in and of itself). I tend to lean against GAC deployments unless it's truly necessary. One of the best features of .NET is the ability to deploy everything you need to run your application in one folder without having to copy stuff to system folders (i.e. the GAC).
If you want to achieve something like dynamically loading DLLs and calling member methods from your processor without caring about the specific version, you can go a couple of routes. The easiest route is to just set the Specific Version property to False when you add the reference. This will give you the liberty of changing the DLL later, and as long as you don't mess with method signatures, it shouldn't be a problem. The second option is the MEF (which uses Reflection and will be part of the framework in .NET 4.0). The idea with the MEF is that you can scan a "plugins" style folder for DLLs that implement specific functionality and then call them dynamically. This gives you some additional flexibility in that you can add new assemblies later without the need to modify your references.
Another thing to note is that there are Setup and Deployment project templates built into Visual Studio that you can use to generate MSI packages for deploying your projects. MSDN has lots of documentation related to this subject that you can check out, here:
http://msdn.microsoft.com/en-us/library/ybshs20f%28VS.80%29.aspx

Do not use the GAC on your build machine; it is a deployment detail. Visual Studio automatically copies the DLL into the build directory of your application when you reference the DLL. That ensures that you'll run and debug with the expected version of the DLL.
When you deploy, you've got a choice. You can ship the DLL along with the application that uses it, stored in the EXE installation folder. Nothing special is needed, the CLR can always find the DLL and you don't have to worry about strong names or versions. A bug fix update is deployed simply by copying the new DLL into the EXE folder.
When you have several installed apps with a dependency on the DLL, deploying bug fix updates can start to get awkward, since you have to copy the DLL repeatedly, once for each app. You can also get into trouble when you update some apps but not others, especially when there's a breaking change in the DLL interface that requires the app to be recompiled. That's DLL Hell knocking; the GAC can solve that.

We found some guidance on this issue at MSDN. We started with two separate solutions with no shared code, and then abstracted the commonalities into a shared assembly. We struggled with ways to isolate changes in the shared code so that they affected only the projects that were ready for them. We were terrible at Open/Closed.
We tried
branching the shared code for each project that used it and including it in the solution
copying the shared assembly from the shared solution when we made changes
coding pre-build events to build the shared code solution and copy the assembly
Everything was a real pain. We ended up using one large solution with all the projects in it. We branch each project as we want to stage features closer to production. This branches the shared code as well. It's simplified things a lot and we get a better idea of what tests fail across all projects, as the common code changes.
As far as deployment goes, our build scripts are set up to build the code and copy only the files that have changed, including the assemblies, to our environments.

By default, you have a hardcoded version number in your project (1.0.0.0). As long as you don't change it, you can use all Filter builds with the Process assembly (it only knows it should use the 1.0.0.0 version). This is not the best solution, however, because how do you distinguish between various builds yourself?
Another option is to let the same Process use different versions of the Filter. You should add an app.config file to the Process project, and include a bindingRedirect element (see the docs). Whenever the runtime looks for a particular version of the Filter, it's "redirected" to the version indicated in the config. Unfortunately, this means that although you don't have to update the Process assembly, you'll have to update the config file with the new version.
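As a rough illustration, an app.config with a binding redirect might look like the fragment below. The assembly name `Filter` follows the question; the `publicKeyToken` value and both version numbers are placeholders I've made up. As far as I know, the .NET Framework only applies version redirects to strong-named assemblies, so this assumes the Filter assembly is signed.

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- publicKeyToken is a placeholder; use the token of the signed Filter assembly -->
        <assemblyIdentity name="Filter" publicKeyToken="0123456789abcdef" culture="neutral" />
        <!-- Requests for version 1.0.0.0 are satisfied with 1.1.0.0 instead -->
        <bindingRedirect oldVersion="1.0.0.0" newVersion="1.1.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```

The `oldVersion` attribute also accepts a range such as `1.0.0.0-1.0.9.9`, which can save editing the file for every intermediate build.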
Whenever you encounter versioning problems, you can use Fuslogvw.exe (fusion log viewer) to troubleshoot these.
Have fun!
ulu

Globalizing runtime generated assemblies

Background
A project installs some files that contain all the elements to define a UserControl - some user source, a CodeCompileUnit for designer code, and a resx file. At runtime, these files are compiled into an assembly and the classes are consumed by our main application (the assembly is only updated when necessary).
Question
The project has to be globalized and as part of that process, there is a need to provide localizations of these files. Two options are either to allow the inclusion of additional resx files for different locales (either within the same files or as additional side-by-side files) that can be compiled into a satellite assembly for the main assembly, or to provide a copy of each full file for each supported language, compiling the appropriate set for the language being supported.
Does anyone have any other options that might be worth considering?
What problems might be inherent in either of the solutions I've proposed?
Constraints/Disclaimer
I am aware that the scenario is less than ideal and that better choices could've been made in some areas (like globalizing from the start), but they cannot be changed at this point in the project. I appreciate any advice, solutions, or leads you can provide. Thanks.

Create a separate satellite assembly for each culture. This has three benefits:
You can build all of the assemblies in one go, and have a definitive file for each version number and filename combination, rather than it also depending on the culture.
You can have multiple assemblies in the same installation, and base the language to use on the system language, or a user preference etc. This will make development and testing significantly easier, as you won't need to keep rebuilding and copying files around just for the sake of changing languages.
It's how .NET i18n is designed to work. While I'm not an expert on .NET i18n ("read Guy Smith-Ferrier's book" is my best advice!) I generally find that frameworks work best when you follow their expected model.
Even if the final part of "building the satellite assembly" is done at runtime (can you do it at install time instead?) you still get the second and third bullet advantages at least. It also means that if you ever do go the more normal route of supplying the satellite assemblies to start with (instead of building them on the user's box) you'll have less to change.
Apologies if I've misunderstood the question though...

If you're not planning on adding additional languages after deployment (at least not without a software update), then I'd favor compiling all the additional RESX files into a satellite assembly that you include. That way, they're not user editable once they're deployed.
