How do you access the Profiler API from pure managed C# code? - c#

Background
I am developing a library called Harmony that currently uses detouring at the assembler level to monkey patch methods at runtime. This works fine and I got this to work on all combinations of hardware and .NET but it is sort of an ugly hack that does not work when methods are inlined.
Scope
I know roughly about the profiler API and that you can alter the IL before it reaches the JITer. My library already provides a high level way to get the modified IL body so all I need to do is to use the profiler API.
Asking around, I was advised to write small C/C++ modules that access it, but I would rather do this from my managed code. Reasons: my library should ship as a single final dll, I would rather not deal with C/C++ code, and different environments might make this hard.
Question
Is this approach possible? Is there an easier way? Is there a profiler library that hides this implementation and just gives me a high level replacement callback?
Note
Please do not question the motives of my library, I am fully aware that this falls outside of any “normal” C# programming. It’s mainly for patching and modding games and so far with 1500+ stars on GitHub a very successful project. I just want to lift it to an even more compatible level.

Related

Performance - How does linking C# and C++ affect performance?

I'm just about to start my new project. I've been working with C++ (with Qt) and C#, so I'm pretty familiar with both languages.
I've always used them separately, C# for windows and C++ for cross-platform applications. But this time I wanted to do something different. I wanted to link them and use them together.
I'll be using C# for the GUI development and C++ Back-End.
So now, all I want to know is: how will this affect the performance of my application?
Best Regards,
Samarth Saxena.
The performance of the interop layer is good enough that it won't change the overall performance -- that will depend on how well you write your code, whether you do useless copies, concatenate strings in a loop when you should be using StringBuilder, etc.
Still, the cost isn't zero for p/invoke and COM interop, so you want to avoid chatty interfaces (e.g. the interop call should fill a buffer with an entire array, rather than forcing you to make a p/invoke call for each item).
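To make the "chatty vs. filling a whole buffer" point concrete, here is a minimal sketch. No native library is involved: a block of unmanaged memory stands in for data owned by native code, and the `Marshal` calls stand in for interop calls. The class and method names are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ChunkyInteropSketch
{
    // Fills an unmanaged buffer with sample values (i * i); the caller must free it.
    public static IntPtr AllocateSample(int count)
    {
        IntPtr buffer = Marshal.AllocHGlobal(count * sizeof(int));
        for (int i = 0; i < count; i++)
        {
            Marshal.WriteInt32(buffer, i * sizeof(int), i * i);
        }
        return buffer;
    }

    // Chatty style: one marshalling call per element.
    public static int[] ReadPerItem(IntPtr buffer, int count)
    {
        var result = new int[count];
        for (int i = 0; i < count; i++)
        {
            result[i] = Marshal.ReadInt32(buffer, i * sizeof(int));
        }
        return result;
    }

    // Chunky style: a single call copies the entire block into a managed array.
    public static int[] ReadAllAtOnce(IntPtr buffer, int count)
    {
        var result = new int[count];
        Marshal.Copy(buffer, result, 0, count);
        return result;
    }

    public static void Main()
    {
        const int count = 1000;
        IntPtr buffer = AllocateSample(count);
        try
        {
            int[] perItem = ReadPerItem(buffer, count);
            int[] allAtOnce = ReadAllAtOnce(buffer, count);
            Console.WriteLine(perItem[10] == allAtOnce[10]);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Both methods return the same data; the difference is how many times the managed/unmanaged boundary is crossed. With a real p/invoke signature the same shape applies: prefer one call that fills an array over one call per element.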
The final interop method, C++/CLI "It Just Works", actually can have a negative cost compared to pure C#. That's because it's the method that the .NET runtime uses internally (whenever the metadata in mscorlib.dll has the internalcall flag), and if adding C++/CLI code to your project saves more managed/native transitions inside .NET itself than it adds, it will save time.
If you use Microsoft's "Dot Net" flavor of C++, you will be fine. Performance will generally be as good as it would be if you wrote everything in C# or everything in C++. That's because both languages will run in "managed" mode, and they will share the same runtime. (The Dot Net runtime.)
However, if you want to use managed C# with good old regular (unmanaged / native) C++, you are going to have a certain performance penalty, due to all the managed-to-native and native-to-managed transitions that need to be done when placing calls between the two, and all the marshalling of data that this implies.
Google for "pinvoke" to see what pains people have to go through in order to invoke C++ from C#. Still, it is quite cool that "pinvoke" exists, and it makes things relatively easy.

JSIL vs Script# vs SharpKit

I'm looking at Script#, JSIL and SharpKit as a tool to use to compile C# to Javascript, so I can program the client side functions of AJAX using C# in Visual Studio.
What are the pros and cons of each JSIL, Script# and SharpKit?
My project is a MVC4 project using razor engine and C#, if it matters.
If you're looking to integrate directly with an MVC project, something like Script# or SharpKit or something is probably your best bet - I know for a fact that Script# has stuff built in to make that sort of integration easier, so I would start there.
If you do want to try using JSIL, it probably has the core features you need, but things that you might want - like visual studio integration, automated deployment, etc - are not there. At present it is primarily targeted at cross-compilation of applications, so it does a good job of that but not as good a job of other use cases.
I'll try to give a summary of reasons why you might want to consider JSIL over those other alternatives - I can't really comment on the pros and cons of those alternatives in depth since I haven't used them:
JSIL has extremely wide support for the features available in C# 4. Notable ones (either because other tools don't support them, or they're complicated) include:
dynamic, yield, Structs, ref / out, Delegates, Generics, Nullables, Interfaces, and Enums.
Some of the above, of course, don't have complete support - to get an idea of things that absolutely will work, you can look at the test cases - each one is a small self-contained .cs file that is tested to ensure that JSIL and native C# produce the same output.
The reason for this extensive support is that my goal is for JSIL to enable you to translate a completely unmodified C# application to working JS. For all the demos up on the JSIL site, this is true, and I have a few nearly finished ports of larger real games in the wings for which this is also true.
Another reason is that JSIL makes it relatively straightforward for your C# and your JavaScript to talk.
All your C# types and methods are exposed via an interface that is as javascript-friendly as possible. The JS versions have basic overload resolution and dispatch so that native C# interfaces are callable from script code as if they were native JS in most cases. You don't have to take any steps to specifically tag methods you wish to expose to JS, or give them special names, or anything like that unless you want to.
When you want to call out from C# to JS, you can do it a few ways:
JSIL.Verbatim.Expression lets you insert raw javascript directly into the translated version of a function.
JSIL.Builtins.Global can be combined with dynamic and var to write JavaScript-like code directly in your C# function bodies.
The JSReplacement attribute can be used to replace invocations of a C# function with a parameterized JavaScript expression.
All of the above features can be combined with JSIL's mechanism for altering type information, called Proxies, to allow you to alter the type information of libraries you use, even if you don't have source code, in order to map their methods to JavaScript you've written.
And finally, C# methods that aren't translated to JS produce an empty method called an External that you can then replace with JavaScript at runtime to make it work again. Any External methods that you haven't replaced produce a clear warning message at runtime so you know what's missing.
JSIL makes aggressive use of type information, along with metadata you provide, to try and safely optimize the JavaScript it generates for you. In some cases this can produce better equivalent JavaScript than you would have written by hand - the main area where this is true at present is code that uses structs, but it also can apply in other cases.
For example, in this code snippet, JSIL is able to statically determine that despite the number of struct copies implied by the code, none of the copies are actually necessary for the code to behave correctly. The resulting JavaScript ends up not having any unnecessary copies, so it runs much faster than what you'd get if you naively translated the semantics of the original C#. This is a nice middle ground between writing the naive struct-based thing (Vector2s everywhere!) and going completely nuts with named return value optimization by hand, which, as I've described in the past, is pretty error-prone.
Okay, now for some downsides. Don't consider this list exhaustive:
Large portions of the .NET BCL don't have implementations provided for you by JSIL. In the future this may be addressed by translating the entire Mono mscorlib to JavaScript, but I don't have that working well enough to advocate it as an immediate solution. (This is fine for games so far, since they don't use much of the BCL.) This issue is primarily due to the IP problems related to translating Microsoft's mscorlib - if I could do that legally, I'd be doing it right now - it worked the last time I tested it.
As mentioned above, no visual studio integration. JSIL is pretty easy to use - you can feed it a .sln file to get a bunch of .js outputs automatically, and configure it automatically with a configuration file next to the project - but it's nowhere near as polished or integrated as say, Script#.
No vendor or support staff. If you want a bug fixed yesterday or you're having issues, I'm pretty much your only bet at present (though there are a few prolific contributors helping make things better, and more are always welcome!)
JavaScript performance is a goddamn labyrinth full of invisible land mines. If you just want apps to work, you probably won't have any issues here, but if like me you're trying to make real games run fast in browsers, JavaScript will make your life hell and in some cases JSIL will make it worse. The only good thing I can say here is that I'm working on it. :)
JavaScript minifiers and optimizers like Closure are explicitly not supported, because they require your code generator to jump through a bunch of hoops. I could see this being a real blocker depending on how you intend to use your code.
The static analyzer is still kind of fragile and there are still gaps in the language support. Each big application I port using JSIL usually reveals one or two bugs in JSIL - not huge game breakers, but ones that definitely break a feature or make things run slow.
Hope this information is helpful! Thanks for your interest.
Script# pros:
Free
Open source
Generates clean JavaScript
Script# cons:
Supports a subset of C# 2.0 language only
Can be compiled only in a separate project, cannot mix / re-use code between client and server
Low frequency of version updates
Does not offer support
Limited 3rd party library support, C# API is different than JavaScript API.
Not open source
Debugging in JavaScript only
SharpKit pros:
Commercial product
Supports full C# 4.0 language
High frequency of version updates
Support is available
Client / server code can be mixed and re-used within the same project
Extensive 3rd party library support, maintained as open-source - C# API matches exactly to JavaScript API
Supports basic C# debugging for Chrome browsers
Generates clean JavaScript
SharpKit cons:
Has a free version with no time limit, but limited to small / open-source projects
Not open source (only libraries are open-source)
JSIL pros:
Free
Open-source
JSIL cons:
Converts from IL (intermediate language), not from C#, which means a lower abstraction layer since code is already low-level.
Complex generated JavaScript code - almost like IL, hard to read and debug
Responses to feedback:
Kevin: JSIL output is not bad, it's simply generated to achieve full .NET behavior, much like SharpKit's CLR mode. On the other hand, SharpKit supports native code generation, in which any native JavaScript code can be generated from C#, exactly as it would have been written by hand.
Sample of SharpKit's clean generated JavaScript code:
http://sharpkit.net/Wiki/Using_SharpKit.wiki
Developer can choose to create more complex code generation and gain more features, like support for compile-time method overloads. When specified, SharpKit generates method suffixes to overloaded methods.
Script# requires .NET 4 in order to run, but it does not support full C# 4.0 syntax, like Generics, ref and out parameters, namespace aliases, etc...
Another alternative is WootzJs. Full Disclosure, I am its author.
WootzJs is open-source and strives to be a fairly lightweight cross-compiler that allows for all the major C# language features.
Notable Language Features Supported:
yield statements (generated as an efficient state machine)
async/await methods (generated as a state machine like the C# compiler)
ref and out parameters
expression trees
lambdas and delegates (with proper capturing of this)
generics support in both the compiler and the runtime (invalidly casting to T will throw a cast exception)
C# semantics (as opposed to Javascript semantics) for closed variables
It is implemented using Roslyn, which means it will be first in line to take advantage of future language improvements, since those will now be implemented via Roslyn itself. It provides a custom version of mscorlib so you know exactly what library functionality is actually available to you in your scripts.
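As an illustration of what "C# semantics for closed variables" in the feature list can mean, here is a small plain C# sketch (C# 5 or later). It does not show WootzJs output; the class and method names are made up. A translator that mapped the `foreach` variable onto a single function-scoped JavaScript `var` would give every callback the last value instead.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureSemanticsSketch
{
    // Each foreach iteration gets a fresh variable, so each lambda sees its own value.
    public static List<int> CaptureForeachVariable()
    {
        var callbacks = new List<Func<int>>();
        foreach (int value in new[] { 1, 2, 3 })
        {
            callbacks.Add(() => value);
        }
        return Invoke(callbacks);
    }

    // A for-loop variable is shared across iterations, so every lambda sees the final value.
    public static List<int> CaptureForVariable()
    {
        var callbacks = new List<Func<int>>();
        for (int i = 1; i <= 3; i++)
        {
            callbacks.Add(() => i);
        }
        return Invoke(callbacks);
    }

    private static List<int> Invoke(List<Func<int>> callbacks)
    {
        var results = new List<int>();
        foreach (Func<int> callback in callbacks)
        {
            results.Add(callback());
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CaptureForeachVariable())); // prints 1,2,3
        Console.WriteLine(string.Join(",", CaptureForVariable()));     // prints 4,4,4
    }
}
```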
What Are its Downsides?
The Javascript is not intended to look "pretty". It is clearly machine generated, though individual methods should be easy to reason about by looking at them.
Because of its extensive support for core libraries and reflection, the generated output is not the smallest on the block. Minification should produce an ~100k JS file, but minification is not yet supported.
WootzJs unabashedly pollutes native types with functions to encapsulate behavior for those types that would only be found in C#. For example, all the methods of System.String are added to the native Javascript String type.
Little support for binding to 3rd-party Javascript libraries presently exists. (Currently only jQuery)
Comparisons with Other Cross-Compilers:
Script# is very stable and has extensive integration with 3rd party Javascript libraries. Furthermore, it has excellent Visual Studio integration, and it provides a custom implementation of mscorlib. This means that you know precisely what functionality has actually been implemented at the tooling level. If, for example, Console.Write() is not implemented, that method will not be available in your editor.
However, due to its custom parser, it is still stuck in C# 2.0 (without even the generics found in that version of C#). This means that the modern C# developer is giving up an enormous set of language features that most of us depend on without reservation -- particularly the aforementioned generics in addition to lambdas and LINQ. This makes Script# essentially a non-starter for many developers.
JSIL is an extremely impressive work that cross-compiles IL into Javascript. It is so robust it can easily handle the cross-compilation of large 3d video games. The downside is that because of its completeness the resultant Javascript files are enormous. If you just want mscorlib.dll and System.dll, it's about a 50MB download. Furthermore, this project is really not designed to be used in the context of a web application, and the amount of effort required to get started is a bit daunting.
This toolkit too implements a custom mscorlib, again allowing you to know what capabilities are available to you. However, it has poor Visual Studio integration, forcing you to create all the custom build steps necessary to invoke the compiler and copy the output to the desired location.
SharpKit: this commercial product strives to provide support for most of the C# 4.0 language features. It generally succeeds, and there's a decent chance this product will meet your needs. It is lightweight (small .JS files), supports modern C# language features (generics, LINQ, etc.) and is usually reliable. It also has a large number of bindings for 3rd party Javascript libraries. However, there are a surprising number of edge cases that you will invariably encounter that are not supported.
For example, the type system is shallow and does not support representing generics or arrays (i.e. typeof(Foo[]) == typeof(Bar[]), typeof(List<string>) == typeof(List<int>)). The support for reflection is limited, with various member types incapable of supporting attributes. Expression tree support is non-existent, and the yield implementation is inefficient (no state machine). Also, a custom mscorlib is not available, and script C# files and normal C# files are intermingled in your projects, forcing you to decorate each and every script file with a [JsType] attribute to distinguish them from normally compiled classes.
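For reference, this is the behaviour on the regular .NET runtime that the type-identity complaint is measured against. It is a minimal sketch with made-up class names; it says nothing about what any particular cross-compiler emits.

```csharp
using System;
using System.Collections.Generic;

public static class RuntimeTypeIdentitySketch
{
    public sealed class Foo { }
    public sealed class Bar { }

    // On the CLR, array types with different element types are distinct Type objects.
    public static bool ArrayTypesDiffer()
    {
        return typeof(Foo[]) != typeof(Bar[]);
    }

    // Likewise, each closed generic type has its own identity.
    public static bool GenericTypesDiffer()
    {
        return typeof(List<string>) != typeof(List<int>);
    }

    // Reflection can recover the element type and the generic arguments.
    public static Type ElementTypeOfFooArray()
    {
        return typeof(Foo[]).GetElementType();
    }

    public static Type FirstGenericArgumentOfStringList()
    {
        return typeof(List<string>).GetGenericArguments()[0];
    }

    public static void Main()
    {
        Console.WriteLine(ArrayTypesDiffer());
        Console.WriteLine(GenericTypesDiffer());
        Console.WriteLine(ElementTypeOfFooArray().Name);
        Console.WriteLine(FirstGenericArgumentOfStringList().Name);
    }
}
```

Code that relies on these distinctions (for example a serializer that dispatches on `typeof(T)`) is the kind of code that a shallow type system would break.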
We have used SharpKit for two years, and I must say it has improved the way we write code.
The pros as I see them:
The code is much more structured - we can now develop infrastructure just like we did in C# without "banging our heads" with prototype.
It is very easy to refactor
We can use Code Snippets which results in better productivity and less development time
You can control the way the JS is rendered (you have several modes to choose from).
We can debug our C# code in the browser (Currently supported on Chrome only, but still :->)
Great support! If you send them a query you get a response very fast.
Support a large number of libraries & easily extensible
The cons:
The documentation is a bit poor, however once you get a hang of it you'll boost your development.
Glad if this could help!
For ScriptSharp, this stackoverflow link could be of help.
What advantages can ScriptSharp bring to my tool kit?
If you have any SVN tool, please download a sample from https://github.com/kevingadd/JSIL, this is a working source code and can help you go miles.

Managed C++ (C++/CLI) vs C#/VB.NET

I have worked extensively with C#, however, I am starting a project where our client wishes all code to be written in C++ rather than C#. This project will be a mix between managed (.NET 4.0) and native C++. Being that I have always preferred C# to C++ for my .NET needs, I am wondering if there are any important differences I may not be aware of between using C# and managed C++?
Any insight into this is greatly appreciated.
EDIT Looking at Wikipedia for managed C++ code shows that the new specification is C++/CLI, and that "managed C++" is deprecated. Updated the title to reflect this.
C++/CLI is a full-fledged .NET language, and just like other .NET languages it works very well in a managed context. Just as working with native calls in C# can be a pain, interleaving native C++ and managed C++ can lead to some issues. With that said, if you are working with a lot of native C++ code I would prefer to use C++/CLI over C#. There are quite a few gotchas, most of which can be covered by one rule: do not write C++/CLI as if you were writing C#, nor as if you were writing native C++. It is its own thing.
I have worked on several C++/CLI projects and the approach I would take really depends on the exposure of different levels of the application to native C++ code. If the majority of the core of the application is native and the integration point between the native and managed code is a little fuzzy, then I would use C++/CLI throughout. The benefit of the control in C++/CLI will outweigh its problems. If you do have clear interaction points that could be adapted or abstracted, then I would strongly suggest creating a C++/CLI bridging layer with C# above and C++ below. The main reason for this is that tools for C# are just more mature and more ubiquitous than the corresponding tools for C++/CLI. With that said, the project I have been working on has been successful and was not the nightmare that others have described.
I would also make sure you understand why the client is headed in this direction. If the idea is that they have a bunch of C++ developers and they want to make it simpler for them to move to writing managed code, I would posit to the client that learning C# may be less challenging than learning C++/CLI.
If the client believes that C++/CLI is faster, that is just incorrect, as they all compile down to IL. However, if the client has a lot of existing or ongoing native C++ development then the current path may in fact be best.
I've done a project with C++/CLI and I have to say it was an abomination. Basically it was a WinForms application to manage employees, hockey games, trades between teams, calendars etc, etc...
So you can imagine the number of managed controls I had on my forms: calendars / date time pickers, combo boxes, grids etc.
The worst part was using only C++ types for my back-end and the managed types for the front-end. First off, you can't assign a std::string to a managed string. You'll need to convert everything. Obviously you'll have to convert it back...
Every time I needed to fill a grid, I serialized my C++ collections to something like a vector<std::string>, retrieved that in my UI library, then looped through it and made new DataGridRows to add to the grid. Which obviously can be done in 3 minutes with C# and some Linq to SQL.
I ended up with an A+ for that application, but let's be honest, it absolutely sucked. I just can't imagine how pathetic the others' apps were for me to get that.
I think it would've been easier if I had used List<Customer>^ (a managed List of some object) in my C++ instead of always converting everything between vectors of strings. But I needed to keep the C++ clean of managed stuff.
/pissedof
From using all three areas (.NET, C++/CLI and C++) I can say that in every way I prefer using .NET (through C# or VB.NET). For applications you can use either WinForms or WPF (the latter of which I find far better - especially for applications that look far more user friendly).
A major issue with C++/CLI is that you don't have all the nice language features that you get in .NET. For example, the yield keyword in C# and the use of lambda (I don't think that's supported in C++/CLI - don't hold me to that).
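For readers who haven't used the two C# features named above, here is a minimal sketch of an iterator built with yield and a lambda passed to LINQ. The names are made up for illustration, and it makes no claim about what C++/CLI does or doesn't support.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IteratorAndLambdaSketch
{
    // The compiler turns this method into a lazy iterator; values are produced on demand.
    public static IEnumerable<int> EvenNumbers(int limit)
    {
        for (int n = 0; n <= limit; n += 2)
        {
            yield return n;
        }
    }

    // A lambda expression assigned to a delegate and handed to LINQ's Select.
    public static List<int> SquaresOfEvens(int limit)
    {
        Func<int, int> square = x => x * x;
        return EvenNumbers(limit).Select(square).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SquaresOfEvens(6))); // prints 0,4,16,36
    }
}
```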
There is, however, one big advantage of C++/CLI. That is that you can create a bridge to allow C# and C++ to communicate. I am currently working on a project whereby a lot of math calculations and algorithms have already been written (over many years) in C++, but the company is wanting to move to a .NET-based user interface. After researching into various solutions, I came to the conclusion that C++/CLI was far better for this. One benefit is that it allowed me to build an API that, for a .NET developer, looked and worked just like a .NET type.
For developing an application's front end, however, I would really not recommend C++/CLI. From a usability point of view (in terms of developer time when using it) it just isn't worth it. One big issue is that VS2010 dropped support for IntelliSense for C++/CLI in order to "improve general IntelliSense" (I think specifically for C++). If you haven't already tried it, I would definitely advise checking out WPF for applications.

2 basic but interesting questions about .NET

When I first saw C#, I thought this must be some joke. I was starting with programming in C. But in C# you could just drag and drop objects, and just write event code for them. It was so simple.
Now, I still like C the most, because I am very attracted to basic low-level operations, and C is just the next level up from assembler, with a few basic routines, so I like it very much. Even more so because I write little apps for micro-controllers.
But yesterday I wrote a very simple control program in asm for my micro-controller based LED cube, and I needed some way to simply create animation sequences for the cube. So, I remembered C#. I have practically NO C# skills, but I still created a simple program with a GUI to make animation sequences in about an hour, just with the help of Google and the embedded function descriptions in C#.
So, to get to the point, is there some other reason than top speed to use any other language than C#? I mean, it is so effective. I know that Java is somewhat similar, but I expect C# to be more effective on Windows since it's directly from Microsoft.
The second question is, what is the advantage of compiling into CIL and then running it on the CLR, rather than compiling directly into machine code? I know that portability is one, but since C# is mainly for Windows, wouldn't it be more powerful to just compile it directly? Thanks.
1 - Different languages have their pros and cons. There are families of languages (functional, dynamic, static, etc.) which are better for specific problem domains. You'd need to learn one in each family to know when to choose which one, e.g. to write a simple script, I'd pick Ruby over C#.
2 - Compiling it to CIL: Portability may not be a big deal.. but to be precise Mono has an implementation of the CLR on Linux. So there. Also CIL helps you to mix-and-match across languages that run on the CLR. e.g. IronRuby can access standard framework libraries written in C#. It also enables the CLR to leverage the actual hardware (e.g. turn on optimizations, use specific instructions) on which the program is run. The CLR on 2 machines would produce the best native code from the same IL for the respective machine.
Language and platform choice are a function of project goal. It sounds like you enjoy system level programming, which is one of the strong points of using C/C++. So, keep writing systems level code if that's what you enjoy.
Writing in C# is strong in rapid business application development where the goals are inherently different. Writing good working code faster is worth money in both man-hours and time to market. Microsoft does us a huge favor with providing an expressive language and a solid framework of functionality that prevents us from having to write low level code or tooling for 95% of business needs.
One important advantage of IL is language independence. You can define some modules of a project in C++, some in C# and some in VB.NET. All these projects, when compiled, give their respective assemblies (.dll/.exe). Thus you can use the assembly from the C++ project in the C# one and vice versa. This is possible because no matter which (.NET-supported) language you choose, all compile to the same IL code.
I'm not sure that C# is more efficient just because it is a Microsoft product. If you use Visual Studio, or another RAD tool, some of the code is auto-generated and is sometimes less efficient. Some years ago I was dogmatic, thinking only C could answer all our prayers :-P , but now I think virtual machines can help a lot by optimizing code before executing it (like an RDBMS), caching pieces of code to execute later, etc. There is also the possibility of creating "clusters" of virtual machines, as Terracotta does. At least, the benefits of having an extra abstraction layer are bigger than those of not having it.
I agree with spoulson. C# is really good at solving business problems. You can very effectively create a framework that models your business processes and solve many of those problems with object orientation and design patterns. In that respect it provides much of the nice object oriented capability that C++ has.
If you are concerned with speed, C is the route to go for the reasons that you stated.
Further on the second question: you can run NGEN to generate a native image of the assembly, which can improve performance. Not quite machine code, but since it bypasses the JIT (just-in-time compile) phase, the app will tend to run much faster.
http://msdn.microsoft.com/en-us/library/6t9t5wcf(VS.80).aspx
The Native Image Generator (Ngen.exe) is a tool that improves the performance of managed applications. Ngen.exe creates native images, which are files containing compiled processor-specific machine code, and installs them into the native image cache on the local computer. The runtime can use native images from the cache instead of using the just-in-time (JIT) compiler to compile the original assembly.
"is there some other reason than top speed to use any other language than C#?"
I can think of at least four, all somewhat related:
I have a large current investment in 'language X', and I don't have the time or money to switch to something else. (Port an existing code base, buy/acquire/port libraries, re-develop team skills in C#, learn different tools.)
An anticipated need to port the code to a platform where C# is not supported.
I need to use tools that are not available in C#, or are not as well supported. (IDE's, alternate compilers, code generators, libraries, the list goes on and on...)
I've found a language that's even more productive. ;-)
"what is the advantage of compiling into CIL and then running it on the CLR, rather than compiling directly into machine code?"
It's all about giving the runtime environment more control over the way the code executes. If you compile to machine code, a lot becomes 'set in stone' at that time. Deferring compilation to machine code until you know more about the runtime environment lets you optimize in ways you might not be able to otherwise. Just a few off the top of my head:
Deferring compilation lets you select instructions that more closely match your host CPU. (To use 64-bit native instructions when you have them, or the latest SSE extensions.)
Deferring compilation lets you optimize based on what is actually loaded at runtime. (If you have only one class at runtime that's derived from a specific interface, you can start to inline even virtual methods, etc.)
Garbage collectors sometimes need to insert checkpoints into user code. Deferring compilation lets the GC have more control and flexibility over how that's done.
First answer: C# should be used by default for new projects. There are a few cases where it hasn't caught up yet to C++ (in terms of multi-paradigm support), but it is heading in that direction.
Second answer: "portability" also includes x86 / x64 portability, which can be achieved by setting the platform to AnyCPU. Another (more theoretical at this point) advantage is that the JIT compiler can take advantage of the CPU-specific instruction set and thus optimize more effectively.
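To see what the runtime actually picked for an AnyCPU build on a given machine, something like the following sketch can be used. It assumes a runtime where `System.Numerics.Vector` is available (modern .NET, or .NET Framework with the System.Numerics.Vectors package); the class name is made up, and the output varies by machine, so none is shown.

```csharp
using System;
using System.Numerics;

public static class RuntimePlatformSketch
{
    // Reports properties that are only decided when the process starts and the JIT runs.
    public static string Describe()
    {
        return "64-bit process: " + Environment.Is64BitProcess
            + ", pointer size: " + IntPtr.Size + " bytes"
            + ", SIMD accelerated: " + Vector.IsHardwareAccelerated
            + ", Vector<float> lanes: " + Vector<float>.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

The same compiled assembly can report different values on different machines, which is the practical meaning of deferring these decisions to the JIT.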

Recommended migration strategy for C++ project in Visual Studio 6

For a large application written in C++ using Visual Studio 6, what is the best way to move into the modern era?
I'd like to take an incremental approach where we slowly move portions of the code and write new features into C# for example and compile that into a library or dll that can be referenced from the legacy application.
Is this possible and what is the best way to do it?
Edit: At this point we are limited to the Express editions which I believe don't allow use of the MFC libraries which are heavily used in our current app. It's also quite a large app with a lot of hardware dependencies so I don't think a wholesale migration is in the cards.
Edit2: We've looked into writing COM-wrapped components in C# but having no COM experience this is scary and complicated. Is it possible to generate a C# dll with a straight-C interface with all the managed goodness hidden inside? Or is COM a necessary evil?
"I'd like to take an incremental approach where we slowly move portions of the code"
That's the only realistic way to do it.
First, what kind of version control do you use? (A branching version control system lets you experiment and see what works while minimizing the risk of compromising your code; other systems are OK too, but you'll have to be really careful depending on what you are using.)
Edit: I just saw you are using SVN. It may be worthwhile to move to mercurial or git if you have the liberty to do that (the change provides a quantum leap in what you can do with the code-base).
"and write new features into C# for example and compile that into a library or dll that can be referenced from the legacy application."
That's ... not necessarily a good idea. C# code can expose COM interfaces that are accessible in C++. Writing client code in C++ for modules written in C# can be fun, but you may find it taxing (in terms of effort-to-benefit ratio); it is also slow and error-prone (compared to writing C# client code for modules written in C++).
Better consider creating an application framework in C# and using modules (already) written in C++ for the core functionality.
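For a sense of what the COM route mentioned above looks like on the C# side, here is a minimal sketch. The interface, class, and GUIDs are placeholders invented for illustration; registration (for example with regasm on .NET Framework) and the C++ client code are not shown.

```csharp
using System;
using System.Runtime.InteropServices;

// Placeholder GUIDs: generate your own for real code.
[ComVisible(true)]
[Guid("D1A7C0DE-0000-4000-8000-000000000001")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface ILegacyBridge
{
    int Add(int left, int right);
}

// ClassInterfaceType.None exposes only the explicitly declared interface to COM clients.
[ComVisible(true)]
[Guid("D1A7C0DE-0000-4000-8000-000000000002")]
[ClassInterface(ClassInterfaceType.None)]
public class LegacyBridge : ILegacyBridge
{
    public int Add(int left, int right)
    {
        return left + right;
    }
}

public static class LegacyBridgeDemo
{
    public static void Main()
    {
        // From managed code this is an ordinary class; the attributes only matter to COM clients.
        ILegacyBridge bridge = new LegacyBridge();
        Console.WriteLine(bridge.Add(2, 3)); // prints 5
    }
}
```

The attributes are the easy part; the taxing part described above is the C++ side, which has to deal with registration, reference counting, and HRESULT-based error handling.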
"Is this possible and what is the best way to do it?"
Yes, it's possible.
How many people are involved in the project?
If there are many, the best way would be to have a few (two? four?) work on the new application framework and have the rest continue as usual.
If there are few, you can consider having either a person in charge of this, or more people working part-time on it.
The percentage of people/effort assigned to each (old code maintenance and new code development) should depend on the size of the team and your priorities (Is the transition a low-priority issue? Does it have to be finished by a given date?)
The best way to do this would be to start adapting modules of the code to be usable in multiple scenarios (with both the old code and the new one) and continue development in parallel (again, this would be greatly eased by using a branching distributed version control system).
Here's how I would go about it (iterative development, with small steps and lots of validity checks in between):
1. Pick a functional module (something that is not GUI-related) in the old code-base.
2. Remove references to MFC code (and other libraries not available in VS2010 Express, like ATL) from the module picked in step 1.
   Do not attempt to rewrite MFC/ATL functionality with custom code, except for small changes (that is, it is not feasible to create your own GUI framework, but it is OK to write your own COM interface pointer wrapper similar to ATL's CComPtr).
   If the code is heavily dependent on a library, better separate it as much as possible, then mark it down to be rewritten at a future point using new technologies. Either way, for a library heavily dependent on MFC you're better off rewriting the code using something else (C#?).
3. Reduce coupling with the chosen module as much as possible (make sure the code is in a separate library, decide clearly what functionality the module exposes to client code) and access the delimited functionality only through the decided exposed interface (in the old code).
4. Make sure the old code base still works with the modified module (test - eventually automate the testing for this module) - this is critical if you need to stay in the market until you can ship the new version.
5. While maintaining the current application, start a new project (C# based?) that implements the GUI and other parts you need to modernize (like the parts heavily dependent on MFC). This should be a thin-layer application, preferably agnostic of the business logic (which should remain in the legacy code as much as possible).
   Depending on what the old code does and the interfaces you define, it may make sense to use C++/CLI instead of C# for parts of the code (it can work with native C++ pointers and managed code, allowing you to make an easy transition when communicating between managed .NET code and native C++ code).
6. Make the new application use the module picked in step 1.
7. Pick a new module, go back to step 2.
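To make the "reduce coupling" step above more concrete, here is a minimal sketch of a module that is reachable only through one small interface. Everything in it (ISensorLog, makeSensorLog, LegacySensorLog, averageOfReadings) is a hypothetical name invented for illustration, and it assumes a C++11 or later compiler rather than VC6. The point is only that client code includes one header and never sees the implementation, so the implementation can later be moved or rewritten.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// ---- sensor_log.h: the only header client code is allowed to include ----
// ISensorLog and makeSensorLog are hypothetical names for illustration.
class ISensorLog {
public:
    virtual ~ISensorLog() = default;
    virtual void record(double value) = 0;
    virtual std::size_t count() const = 0;
    virtual double average() const = 0;  // 0.0 when nothing was recorded
};

// Factory: callers never see the concrete type behind the interface.
std::unique_ptr<ISensorLog> makeSensorLog();

// ---- sensor_log.cpp: implementation details stay inside the module ----
namespace {

class LegacySensorLog : public ISensorLog {
public:
    void record(double value) override { samples_.push_back(value); }
    std::size_t count() const override { return samples_.size(); }
    double average() const override {
        if (samples_.empty()) return 0.0;
        double sum = 0.0;
        for (double s : samples_) sum += s;
        return sum / static_cast<double>(samples_.size());
    }

private:
    std::vector<double> samples_;
};

}  // namespace

std::unique_ptr<ISensorLog> makeSensorLog() {
    return std::unique_ptr<ISensorLog>(new LegacySensorLog());
}

// ---- client code: depends on the interface only ----
double averageOfReadings(const std::vector<double>& readings) {
    std::unique_ptr<ISensorLog> log = makeSensorLog();
    for (double r : readings) log->record(r);
    assert(log->count() == readings.size());
    return log->average();
}
```

Once the old code only talks to the module this way, the same boundary is where a C++/CLI wrapper or a flat C API for the new application would attach.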
Advantages:
refactoring will be performed (necessary for the separation of modules)
at the end you should have a battery of tests for your functional modules (if you do not already).
you still have something to ship in between.
A few notes:
If you do not use a distributed branching version control system, you're better off working on one module at a time. If you use branching/distributed source control, you can distribute different modules to different team members, and centralize the changes every time something new has been ported.
It is very important that each step is clearly delimited (so that you can roll back your changes to the last stable version, try new things and so on). This is another issue that is difficult with SVN and easy with Mercurial / Git.
Before starting, change the names of all your project files to have a .2005.vcproj extension, and do the same for the solution file. When creating the new project file, do the same with .2010.vcxproj for the project files and solution (you should still do this if you convert the solutions/projects). The idea is that you should have both in parallel and open whichever you want at any point. You shouldn't have to make a source-tree update to a different label/tag/date in source control just to switch IDEs.
"Edit2: We've looked into writing COM-wrapped components in C# but having no COM experience this is scary and complicated."
You can still do it, by writing wrapper code (a small templated smart pointer class for COM interfaces wouldn't go amiss for example - similar to CComPtr in ATL). If you isolated the COM code behind some wrappers you could write client code (agnostic of COM) with (almost) no problems.
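A rough sketch of such a smart pointer, to show how little code is involved. To keep it compilable without the Windows SDK it uses a stand-in interface, IRefCounted, that only mimics the AddRef/Release part of IUnknown; IRefCounted, ComPtr and Widget are names made up for this example, and a real wrapper would also have to deal with QueryInterface. Like CComPtr, the constructor from a raw pointer calls AddRef.

```cpp
#include <cassert>
#include <utility>

// Stand-in for IUnknown (assumption: only the reference-counting part).
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;

protected:
    virtual ~IRefCounted() = default;
};

// Minimal owning pointer: AddRef on acquire and copy, Release on destruction.
template <typename T>
class ComPtr {
public:
    ComPtr() noexcept : ptr_(nullptr) {}
    explicit ComPtr(T* p) noexcept : ptr_(p) { if (ptr_) ptr_->AddRef(); }
    ComPtr(const ComPtr& other) noexcept : ptr_(other.ptr_) { if (ptr_) ptr_->AddRef(); }
    ComPtr(ComPtr&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    ~ComPtr() { if (ptr_) ptr_->Release(); }

    // Copy-and-swap covers both copy and move assignment.
    ComPtr& operator=(ComPtr other) noexcept { swap(other); return *this; }
    void swap(ComPtr& other) noexcept { std::swap(ptr_, other.ptr_); }

    T* get() const noexcept { return ptr_; }
    T* operator->() const noexcept { assert(ptr_ != nullptr); return ptr_; }
    explicit operator bool() const noexcept { return ptr_ != nullptr; }

private:
    T* ptr_;
};

// Hypothetical reference-counted object used only to exercise the wrapper.
class Widget : public IRefCounted {
public:
    static int alive;  // number of Widget instances currently alive
    Widget() : refs_(0) { ++alive; }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long left = --refs_;
        if (left == 0) delete this;  // last reference gone
        return left;
    }

private:
    ~Widget() { --alive; }  // private: destroyed only through Release()
    unsigned long refs_;
};

int Widget::alive = 0;
```

With something like this in place, client code just holds ComPtr<T> values and never calls AddRef or Release by hand, which is where most COM lifetime bugs come from.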
"Is it possible to generate a C# dll with a straight-C interface with all the managed goodness hidden inside? Or is COM a necessary evil?"
Not that I know of. I think COM will be a necessary evil if you plan to use server code written in C# and client code in C++.
It is possible the other way around.
Faced with the same task, my strategy would be something like:
Identify what we hope to gain by moving to 2010 development - it could be
improved quality assurance: unit testing, mocking are part of modern development tools
slicker UI: WPF provides a modern look and feel.
productivity: in some areas, .NET development is more productive than C++ development
support: new tools are supported with improvements and bugfixes.
Identify which parts of the system will not gain from being moved to C#:
hardware access, low-level algorithmic code
pretty much most bespoke non-UI working code - no point throwing it out if it already works
Identify which parts of the system need to be migrated to C#. For these parts, ensure that the current implementation in C++ is decoupled and modular so that those parts can be swapped out. If the app is a monolith, then considerable work will be needed refactoring the app so that it can be broken up and select pieces reimplemented in C#. (It is possible to refactor nothing, and instead just focus on implementing new application functionality in C#.)
Now that you've identified which parts will remain in C++ and which parts will be implemented in C# (or just stipulated that new features are in C#), the focus turns to how to integrate C# and C++ into a single solution:
use COM wrappers - if your existing C++ project makes good use of OO, this is often not as difficult as it may seem. With MSVC 6 you can use the ATL classes to expose your classes as COM components.
Integrate the native and C# code directly. Integrating "legacy" compiled code requires an intermediate DLL - see here for details.
Mixing the MFC UI and C# UI is probably not achievable, and not advisable either, as it would produce a UI mix of two distinct styles (1990s grey and 2010 vibe). It is simpler to focus on achieving incremental migration, such as implementing new application code in C# and calling that from the native C++ code. This keeps the amount of migrated C# code small to begin with. As you get more into the 2010 development, you can then take the larger chunks that cannot be migrated incrementally, such as the UI.
First, your definition of modern era is controversial. There's no reason to assume C# is better in any sense than C++. A lot has been said on whether C# helps you better avoid memory management errors, but this is hardly so with modern facilities in C++, and it's very easy to make a mess in C# in terms of resource acquisition timing, which may depend on what other programs are doing.
If you move straight from 6 to 2010 you may end up with some messed up project settings. If this isn't a fairly large project, and it's one of few that you need to convert, then that should be fine. Just open it in 2010, and follow the conversion wizard. Make sure to back up your project first, and verify your project settings when you're done.
In my opinion though the best way is to convert it step by step through each iteration of Visual Studio. I had to modernize 1400 projects from 2003 to 2010, and the best way that I found was to convert everything to 2005, then to 2008, and then finally to 2010. This caused the least amount of issues to arise for me.
If you only have 6 and the newest Visual Studio you may end up just having to try and go straight to the new one using the wizard. Expect some manual cleanup before everything builds correctly for you again.
Also, one more time, BACK IT UP FIRST! :)
High-level C++ code calling low-level C# code doesn't look like a good idea. The areas where .NET languages are better are user interface, database access, networking and XML file handling. Low-level stuff like calculations, hardware access etc. is better kept as native C++ code.
When moving to .NET, in most cases it is better to rewrite the UI completely, using WPF or Windows Forms. Low-level stuff remains native, and different interoperability technologies are used to connect C# and native code: P/Invoke, C++/CLI wrappers or COM interoperability. After some time, you may decide to rewrite low-level native components in C#, but only if it is really necessary.
About compiling native C++ code in VS2010 - I don't see any problems. Just fix all compilation errors - new compilers have stricter type checking and syntax restrictions, and catch many more bugs at compilation time.
Not sure why so many folks are advocating for COM. If you haven't already got a lot of COM in there, learning how to do it on the C++ side is going to hurt, and then you're using the slowest possible interop from the managed side. Not my first choice.
Ideally you have refactored your UI from your business logic. You can then build a new UI (WPF, WinForms, ASP.NET, web services that support some other client, whatever) and call into your business logic through P/Invoke or by writing a C++/CLI wrapper. @mdma has good advice for you assuming that the refactoring is possible.
However if you were paying me to come in and help you my very first question would be why do you want to do this? Some clients say they don't want to pay C++ devs any more, so they want all the C++ code gone. This is a scary objective because we all hate to touch code that works. Some clients want to expose their logic to ASP.NET or Reporting Services or something, so for them we concentrate on the refactoring. And some say "it looks so 1999" and for them I show them what MFC looks like now. Colours, skinning/theming including office and win7 looks, ribbon, floating/docking panes and windows, Windows 7 taskbar integration ... if you just want to look different, take a look at MFC in VS 2010 and you might not have to adjust any code at all.
Finally to make non-Express versions of VS 2010 affordable look into the Microsoft Partner Program. If you have sold your software to at least 3 customers who still speak to you, and can get through the Windows 7 logo self test (I have got VB 6 apps through that in a day or two) then you can have 5-10 copies of everything (Windows, Office, VS) for $1900 or so a year, depending on where you live.
To start I'd try and keep as much code as possible to avoid a rewrite. I'd also remove all unused code before starting the conversion.
Since VC++ 6.0 Microsoft changed the MFC libraries and the C++ Standard Library.
I recommend starting by building the DLLs that have no dependencies, then looking at your third-party libraries, and then rebuilding one dependent DLL/EXE at a time.
Introduce unit tests to make sure the behaviour of code does not change.
If you have a mixed build, using different versions of VC++, you need to guard against passing resources (file handles) between DLLs that use different versions of the VC runtime.
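A small sketch of what guarding that boundary can look like for heap memory; the same idea applies to file handles and other runtime-owned objects, since each VC runtime keeps its own heap and its own file table. All function names here (legacy_get_device_name, legacy_free_string, legacy_copy_device_name, readDeviceName) and the device name are made up for the example. Either the module that allocated a resource is also the one that releases it, or the caller supplies the storage so nothing runtime-owned crosses the boundary at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

namespace {
const char kDeviceName[] = "spectrometer-01";  // placeholder data
}

// Hypothetical API of a DLL built against one version of the VC runtime.
extern "C" {

// Option 1: the module allocates, and the SAME module frees.
char* legacy_get_device_name() {
    char* copy = static_cast<char*>(std::malloc(sizeof(kDeviceName)));
    if (copy) std::memcpy(copy, kDeviceName, sizeof(kDeviceName));
    return copy;
}

void legacy_free_string(char* s) {
    std::free(s);  // runs inside the allocating module, so the heaps match
}

// Option 2: caller-owned buffer. Returns the size needed, including the
// terminating NUL; the buffer is written only if it is large enough.
std::size_t legacy_copy_device_name(char* buffer, std::size_t capacity) {
    const std::size_t needed = sizeof(kDeviceName);
    if (buffer && capacity >= needed) std::memcpy(buffer, kDeviceName, needed);
    return needed;
}

}  // extern "C"

// Caller side (possibly built with a different runtime): copy the data into
// its own storage, then hand the block back to the module that created it.
std::string readDeviceName() {
    char* raw = legacy_get_device_name();
    if (!raw) return std::string();
    std::string name(raw);
    legacy_free_string(raw);  // not free(raw): the caller's runtime may differ
    return name;
}
```

The caller-owned buffer variant is usually the safer default for a mixed build, because no pointer that one runtime allocated ever has to be released by another.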
If at all financially possible I would strongly consider just paying for the version of Visual Studio that you need, because you could very well lose more money on the time you spend otherwise. I do not know enough about the Express editions to give a good answer on them, but when integrating some code from a subcontractor that was written in C++ I used C++/CLI. You will probably be able to reuse most of your codebase and will be familiar with the language, but you will also have access to managed code and libraries. Also, if you want to start writing new code in C# you can do that. The biggest problem I had with it was that in VS 2010 there is no IntelliSense in C++/CLI.
Visual Studio 6 is legendary for being buggy and slow. Moving into the modern era would best be done by getting a new compiler. Probably the easiest thing to do is to build the legacy app as a DLL, then write your exe in C# and use P/Invoke. Then you never have to touch the old code again - you can just write more and more in C# and use less and less of the old DLL.
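As a sketch of what the DLL side of that could look like: P/Invoke can only call plain C-style exports, so a C++ class has to be flattened into functions that take an opaque handle. The Accumulator class, the accumulator_* functions and the LEGACY_API macro are all hypothetical names standing in for whatever the legacy code really contains.

```cpp
#include <cassert>
#include <new>

// Existing C++ class from the legacy code base (placeholder logic).
namespace legacy {
class Accumulator {
public:
    Accumulator() : total_(0.0), count_(0) {}
    void add(double value) { total_ += value; ++count_; }
    double mean() const { return count_ == 0 ? 0.0 : total_ / count_; }

private:
    double total_;
    int count_;
};
}  // namespace legacy

// Export with C linkage so the names are not mangled.
#if defined(_WIN32)
#define LEGACY_API extern "C" __declspec(dllexport)
#else
#define LEGACY_API extern "C"
#endif

// Opaque handle: the managed side only ever stores and passes it back.
typedef void* AccumulatorHandle;

LEGACY_API AccumulatorHandle accumulator_create() {
    // nothrow: C++ exceptions must not escape across the C boundary.
    return new (std::nothrow) legacy::Accumulator();
}

LEGACY_API void accumulator_destroy(AccumulatorHandle handle) {
    delete static_cast<legacy::Accumulator*>(handle);
}

// Returns 1 on success, 0 if the handle is null.
LEGACY_API int accumulator_add(AccumulatorHandle handle, double value) {
    if (!handle) return 0;
    static_cast<legacy::Accumulator*>(handle)->add(value);
    return 1;
}

LEGACY_API double accumulator_mean(AccumulatorHandle handle) {
    if (!handle) return 0.0;
    return static_cast<legacy::Accumulator*>(handle)->mean();
}
```

On the C# side each export would get a DllImport declaration with the handle typed as IntPtr. One thing to watch on 32-bit x86: exports like these default to cdecl in MSVC, while DllImport defaults to the platform calling convention (stdcall there), so one of the two has to be set explicitly.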
If your old code is very heavily OO, you can use C++/CLI to write wrapper classes that allow .NET to call methods on C++ objects, and collect them too if you use a reference counted smart pointer.
You can use C# to write your new components with a COM or COM+ (System.EnterpriseServices) wrapper, which will be callable from your existing C++ code.
