How can I translate a small C++/CLI project to C#?
One roundabout, manual way would be to compile your C++/CLI project and open the output assembly in Reflector. Disassemble each class, have it convert the disassembled IL to C#, and save that code off.
As for an automatic way to do it, I can't think of any off the top of my head.
Those things being said, are you sure you really want to convert your project to C#? If your C++/CLI project uses any unmanaged code, you'll have a difficult time coming up with a purely managed equivalent. If the project is more or less composed of pure CLR code, and it was written in C++/CLI for the sake of being written in C++/CLI, I can understand wanting to convert it to C#. But if there was a reason for writing it in C++/CLI, you may want to keep it that way.
IMHO, line by line is the best way. I've ported several C++-style projects to a managed language, and I've tried various approaches: translators, line by line, scripting, and so on. Over time I've found the most effective way is to do it line by line, even though it seems like the slowest way at first.
Too much is lost in a translator. No translator is perfect and you end up spending a lot of time fixing up the translated code. Also, translated code as a rule is ugly and tends to be less readable than hand crafted code. So the result is a fixed up, not very pretty code base.
A couple of tips I have on going line by line:
Start by defining all of the leaf types
For every type that has a non-trivial destructor (one that frees memory), implement IDisposable (see the sketch after these tips)
Turn on the FxCop rule that checks for missing Dispose calls to catch all of the places where you used stack-based RAII and missed it
Pay very close attention to the uses of byref in C++.
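Here is a minimal sketch of the IDisposable tip above. The class name and the native buffer are hypothetical; they just show the shape a ported C++/CLI destructor usually takes in C#:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical port of a C++/CLI type whose destructor freed a native resource.
public sealed class ImageBuffer : IDisposable
{
    private IntPtr _nativeBuffer;   // was a raw pointer in the C++/CLI version
    private bool _disposed;

    public ImageBuffer(int size)
    {
        _nativeBuffer = Marshal.AllocHGlobal(size);
    }

    public void Dispose()
    {
        if (_disposed) return;
        Marshal.FreeHGlobal(_nativeBuffer);  // what the C++/CLI destructor used to do
        _nativeBuffer = IntPtr.Zero;
        _disposed = true;
        GC.SuppressFinalize(this);
    }

    ~ImageBuffer()   // safety net, mirrors the C++/CLI finalizer (!ClassName)
    {
        if (!_disposed) Marshal.FreeHGlobal(_nativeBuffer);
    }
}
```

Callers then wrap instances in using blocks, which is exactly what the FxCop rule above helps you verify.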
I haven't tried it, but I just googled it and found this: http://code2code.net/
According to it, you shouldn't fully rely on the code it produces:
You accept that this page does only half the work.
Further work on your part is required. In most cases, the translated code will not even compile.
Also, read this: Translate C++/CLI to C#
I'm writing a console tool to generate some C# code for objects in a class library. The best/easiest way I can actually generate the code is to use reflection after the library has been built. It works great, but this seems like a haphazard approach at best. Since the generated code will be compiled with the library, after making a change I'll need to build the solution twice to get the final result, etc. Some of these issues could be mitigated with a build script, but it still feels like a bit too much of a hack to me.
My question is, are there any high-level best practices for this sort of thing?
It's pretty unclear what you are doing, but what does seem clear is that you have some baseline code and, based on some of its properties, you want to generate more code.
So the key issues here are: given the baseline code, how do you extract interesting properties, and how do you generate code from those properties?
Reflection is a way to extract properties of code running in (well, at least loaded into) the same execution environment as the code that uses reflection. The problem with reflection is that it only provides a very limited set of properties, typically lists of classes, methods, or perhaps names of arguments. If all the code generation you want to do can be done with just that, well, then reflection seems just fine. But if you want more detailed properties about the code, reflection won't cut it.
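For instance, here is a minimal sketch of what reflection does give you (type and member names, little else); the assembly path is hypothetical:

```csharp
using System;
using System.Reflection;

// Dump the classes and method signatures of an already-built assembly.
class ReflectionDump
{
    static void Main()
    {
        Assembly asm = Assembly.LoadFrom("BaseLibrary.dll");  // point at your own baseline DLL
        foreach (Type type in asm.GetTypes())
        {
            Console.WriteLine(type.FullName);
            foreach (MethodInfo method in type.GetMethods(
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
            {
                // Names and parameter types are available; method bodies are not.
                Console.WriteLine("  {0}({1})", method.Name,
                    string.Join(", ", Array.ConvertAll(
                        method.GetParameters(), p => p.ParameterType.Name)));
            }
        }
    }
}
```

Anything about statement-level structure or control flow simply isn't there, which is the limitation the rest of this answer is about.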
In fact, the only artifact from which truly arbitrary code properties can be extracted is the source code as a character string (how else could you answer a question like: is the number of characters between the add operator and the 'T' in the middle of the variable name a prime number?). As a practical matter, properties you can get from character strings are generally not very helpful (see the example I just gave :).
The compiler guys have spent the last half century figuring out how to extract interesting program properties, and you'd be a complete idiot to ignore what they've learned in that time.
They have settled on a number of relatively standard "compiler data structures": abstract syntax trees (ASTs), symbol tables (STs), control flow graphs (CFGs), data flow facts (DFFs), program triples, pointer analyses, etc.
If you want to analyze or generate code, your best bet is to process it first into such standard compiler data structures and then do the job. If you have ASTs, you can answer all kinds of questions about what operators and operands are used. If you have STs, you can answer questions about where-defined, where-visible, and what-type. If you have CFGs, you can answer questions about "this-before-that" and "what conditions does statement X depend upon". If you have DFFs, you can determine which assignments affect the actions at a point in the code. Reflection will never provide this IMHO, because it will always be limited to what the runtime system developers are willing to keep around when running a program. (Maybe someday they'll keep all the compiler data structures around, but then it won't be reflection; it will just finally be compiler support.)
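As a concrete illustration of the AST point, here is a small sketch using the Roslyn compiler API (not something this answer proposed, just one readily available way to get an AST in C#), answering a question reflection over a compiled assembly never could:

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

class AstDemo
{
    static void Main()
    {
        // Parse source text into a syntax tree and list the binary operators used.
        var tree = CSharpSyntaxTree.ParseText(
            "class C { int F(int a, int b) { return a + b * 2; } }");
        var operators = tree.GetRoot()
            .DescendantNodes()
            .OfType<BinaryExpressionSyntax>()
            .Select(e => e.OperatorToken.Text);

        Console.WriteLine(string.Join(" ", operators)); // "+ *"
    }
}
```

By the time the code is compiled to an assembly, that expression structure is gone, so reflection cannot recover it.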
Now, after you have determined the properties of interest, what do you do for code generation? Here the compiler guys have been so focused on generation of machine code that they don't offer standard answers. The guys that do are the program transformation community (http://en.wikipedia.org/wiki/Program_transformation). Here the idea is to keep at least one representation of your program as ASTs, and to provide special support for matching source code syntax (by constructing pattern-match ASTs from the code fragments of interest), and provide "rewrite" rules that say in effect, "when you see this pattern, then replace it by that pattern under this condition".
By connecting the condition to various property-extracting mechanisms from the compiler guys, you get a relatively easy way to say what you want, backed up by that half century of experience. Such program transformation systems have the ability to read in source code, carry out analysis and transformations, and generally to regenerate code after transformation.
For your code generation task, you'd read the baseline code into ASTs, apply analyses to determine properties of interest, use transformations to generate new ASTs, and then spit out the answer.
For such a system to be useful, it also has to be able to parse and prettyprint a wide variety of source code languages, so that folks other than C# lovers can also have the benefits of code analysis and generation.
These ideas are all reified in the DMS Software Reengineering Toolkit. DMS handles C, C++, C#, Java, COBOL, JavaScript, PHP, Verilog, ... and a lot of other languages.
(I'm the architect of DMS, so I have a rather biased view. YMMV).
Have you considered using T4 templates for performing the code generation? It looks like it's getting much more publicity and attention now and more support in VS2010.
This tutorial seems database-centric, but it may give you some pointers: http://www.olegsych.com/2008/09/t4-tutorial-creatating-your-first-code-generator/. In addition, there was a recent Hanselminutes episode on T4 here: http://www.hanselminutes.com/default.aspx?showID=170.
Edit: Another great place is the T4 tag here on StackOverflow: https://stackoverflow.com/questions/tagged/t4
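For a flavor of what a T4 template looks like, here is a minimal sketch (not taken from the tutorials above; the namespace and type names are made up for illustration, and the logic between the <# #> markers is ordinary C#):

```
<#@ template language="C#" #>
<#@ output extension=".cs" #>
// <auto-generated>Generated by a T4 template; do not edit by hand.</auto-generated>
namespace MyLibrary.Generated
{
<#  // Plain C# runs between the <# #> markers; here we loop over some made-up names.
    foreach (var name in new[] { "Customer", "Order" })
    { #>
    public partial class <#= name #>Dto
    {
        public int Id { get; set; }
    }
<#  } #>
}
```

Saving the .tt file in Visual Studio runs the transformation and drops the generated .cs file alongside it in the project.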
EDIT: (By asker, new developments)
As of VS2012, T4 now supports reflection over an active project in a single step. This means you can make a change to your code, and the compiled output of the T4 template will reflect the newest version, without requiring you to perform a second reflect/build step. With this capability, I'm marking this as the accepted answer.
You may wish to use CodeDom, so that you only have to build once.
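For example, a minimal CodeDom sketch that emits a class as C# source (the namespace and class names are hypothetical):

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

class CodeDomDemo
{
    static void Main()
    {
        // Build an in-memory code graph: namespace Generated { public class Widget { } }
        var unit = new CodeCompileUnit();
        var ns = new CodeNamespace("Generated");
        var type = new CodeTypeDeclaration("Widget")
        {
            IsClass = true,
            TypeAttributes = System.Reflection.TypeAttributes.Public
        };
        ns.Types.Add(type);
        unit.Namespaces.Add(ns);

        // Emit the graph as C# source text.
        using (var provider = new CSharpCodeProvider())
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            Console.WriteLine(writer.ToString());
        }
    }
}
```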
First, I would read this CodeProject article to make sure there are no language-specific features you'd be unable to support without using Reflection.
From what I understand, you could use something like the Common Compiler Infrastructure (http://ccimetadata.codeplex.com/) to programmatically analyze your existing C# source.
This looks pretty involved to me though, and CCI apparently only has full support for C# language spec 2. A better strategy may be to streamline your existing method instead.
I'm not sure of the best way to do this, but you could try the following:
As a post-build step on your base dll, run the code generator
As another post-build step, run csc or msbuild to build the generated dll
Other things which depend on the generated dll will also need to depend on the base dll, so the build order remains correct
I need to import some ANSI C code into a project I'm working on. For reasons I prefer not to go into, I want to refactor the code to C# rather than trying to wrap the original code. It will take me perhaps a couple of days to do the work by hand, but before I start, is there an automated tool that can get me most of the way there? I'm a cheapskate and the work I'm doing is pro bono, so free tools only please.
If manual refactoring is "only" going to take a few days, that would get my vote. Depending on what the C code is doing and how it is written (pointers, custom libraries, etc.) an automated converter may just make a mess. And untangling that mess could be a larger task than just converting by hand.
Setting up a converter (if one is even available), then cleaning up and refactoring the converted code, would probably take on the order of weeks or months, so you'll be better served just going ahead with the manual rewrite.
There is an experimental CLI Back-End and Front-End for GCC. It is already capable of compiling a subset of C programs into CIL, the byte-code that the CLR runs.
(The webpage makes it seem like the code was only developed over a few months and then ignored since then, but it's out of date; ST Microelectronics is continuing maintenance and development.)
You don't specify why you want a C to C# translator, but if you just want to get C and C# to play together without P/Invoke or COM, it might be good enough.
It may make sense to start by getting the existing code to compile as managed C++, a.k.a. C++/CLI. Assuming that goes smoothly enough, you then have a working, testable foundation on which to build. Move key features into their own classes and, as needed, rewrite them as C# along the way.
I am writing a fairly simple code gen tool, and I need the ability to convert MSIL (or MethodInfo) objects to their C# source. I realize Reflector does a great job of this, but it has the obnoxious "feature" of being UI only.
I know I could just generate the C# strings directly, using string.Format to insert the variable portions, but I'd really prefer to be able to generate methods programmatically (e.g. a delegate or MethodInfo object), then pass those methods to a writer which would convert them to C#.
It seems a little silly that the System libraries make it so easy to go from a C# source code string to a compiled (and executable) method at runtime, but impossible to go from an object to source code--even for simple things.
Any ideas?
This add-in for Reflector lets you output to a file, and it is possible to run Reflector from the command line. It's probably simpler to get that to do what you want than to roll your own decompiler.
Anakrino is another decompiler with a command line option, but it hasn't been updated since .NET 1.1. It's open source, though, so you might be able to base your solution on it.
The code generators which I wrote as part of our application use String.Format, and although I am not entirely happy with String.Format (Lisp macros beat it any time of the day) it does the job all right. The C# compiler is going to re-check all your generated methods anyway, so you gain nothing by round-tripping through Reflection.Emit and the (still unwritten) MSIL-to-C# decompiler.
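As a minimal sketch of that String.Format style of generation (the class and member names here are made up for illustration):

```csharp
using System;

class StringFormatCodeGen
{
    // Emit the C# source for a simple auto-property from a type name and a property name.
    static string GenerateProperty(string typeName, string propertyName)
    {
        return string.Format("    public {0} {1} {{ get; set; }}", typeName, propertyName);
    }

    static void Main()
    {
        Console.WriteLine("public partial class Customer");
        Console.WriteLine("{");
        Console.WriteLine(GenerateProperty("int", "Id"));
        Console.WriteLine(GenerateProperty("string", "Name"));
        Console.WriteLine("}");
    }
}
```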
PS: why not use CodeDOM if you want to use something heavyweight?
I was wondering: what makes the primary Java compiler (javac by Sun) so fast at compilation?
...as well as the C# .NET compiler from Microsoft.
I am comparing them with C++ compilers (such as G++), so maybe my question should have been, what makes C++ compilers so slow :)
That question was nicely answered in this one: Why does C++ compilation take so long? (as jalf pointed out in the comments section)
Basically it's C++'s lack of a modules concept, plus the aggressive optimization done by the compiler.
I think the most difficult part is not the need to compile the header files (unless they are really big, but you can use precompiled headers in that case). The worst part is always the fact that C++'s grammar is too wildly context-sensitive. Despite the fact I like C++, I feel sorry for anybody who has to write a C++ parser.
There are a couple of things that make the C++ compiler slower than those of Java/C#. The grammar is much more complex, generic programming support is much more powerful in C++ but also more expensive to compile, and inclusion of files works differently than importing modules.
Inclusion of header files
First, whenever you include a file in C++, the contents of the file (usually a .h) are injected into the current compilation unit (include guards avoid re-injecting the same header twice), and this is transitive. That is, if you include header a.h, which in turn includes b.h, your compilation unit will include all code in a.h and all code in b.h.
Java (or C#; I will talk about Java, but they are similar in this respect) doesn't have include files; it depends on the binaries from the compilation of the used classes. This means that whenever you compile a.java, which uses an object B defined in b.java, it just checks the binary b.class; it does not need to go deeper to check the dependencies of B, so it can cut the process off earlier (with just one level of checking).
At the same time, included files contain only source-level definitions, and processing them takes time. When the Java/C# compiler reads a binary, it has the same information, but already processed by the compilation step that generated it.
So in the end, in C/C++ more files are included, and at the same time processing those includes is more expensive than processing binary modules.
Templates
Templates are special in their own way. They can be precompiled, but they usually are not (for a good set of reasons). This means that in every compilation unit that uses std::vector, the whole set of vector methods that are used (unused template methods don't get compiled) is processed and binary code is generated by the compiler. At a later step, during linking, redundant definitions of the same method get dropped, but during compilation they must all be processed.
Support in Java for generics is more limited in many ways. In the end, for example, there is only one Vector class binary, and whenever the compiler sees Vector in Java, it generates type-checking code before delegating to the real Vector implementation (which stores plain Object) and which is not generic. The compiler does provide the type guarantees, but it does not compile Vector for each type.
In C# it is, once again, different. C#'s support for generics is more complex than Java's: generic classes are different from plain classes, but they still get compiled only once, because the binary format carries all the required information.
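A small C# illustration of that last point (reified generics: one compiled definition, distinct constructed types that keep their type arguments at runtime):

```csharp
using System;
using System.Collections.Generic;

class ReifiedGenericsDemo
{
    static void Main()
    {
        // List<T> is compiled once, but each constructed type is distinct and
        // retains its type argument at runtime (unlike Java's erased generics).
        Console.WriteLine(typeof(List<int>) == typeof(List<string>));   // False
        Console.WriteLine(typeof(List<int>).GetGenericArguments()[0]);  // System.Int32
    }
}
```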
Because they do something quite different: the C++ compiler produces optimized native code, whereas the C#, VB.NET, and Java compilers produce an intermediate language that is turned into native code the first time you execute the application, which is why Java applications and the like load slowly on first run.
The C++ compiler has to do the full optimization, whereas the JITed languages optimize when you execute the application.
Some would argue that, to be correct, you have to measure C++ compile time against Java compile time plus the time for JITing the first time you load the application, but I don't think that would be right and fair, because you'd be comparing native languages to JITed ones, or else oranges to apples.
The C++ compiler must repeatedly compile all the header files and there are lots of them, so this is one thing that slows it down.
One of the more time consuming tasks when compiling is code optimization.
Javac does very little optimization on the code when doing the compilation. Optimization is instead done by the JVM when running the application.
A C/C++ compiler needs to optimize when compiling, since optimizing already-compiled machine code is hard.
You got it right in your last sentence: it's not Java or C# that's fast to compile, it's C++ that is exceptionally slow to compile, due to its complex grammar and features, most importantly templates.
If you think javac is fast, try Jikes (see http://jikes.sourceforge.net/).
It is a Java compiler written in C++. Unfortunately, they haven't kept up with the latest Java compiler specs, but if you want to see fast, this is it.
I think part of it is the complexity of the languages. C++ is incredibly malleable, with the ability to overload pretty much any operator or piece of syntax (like overloading the () operator). This means the compiler has to do a lot more work just to determine what operations to actually run, even for simple things. Java and C# don't have this issue, as the syntax is fixed and they're generally much simpler to parse.
It's a bit difficult comparing bytecode languages like Java with natively compiled languages like C++. A better comparison is Delphi vs. C++, where Delphi is much faster to compile. Since this has nothing to do with optimization or bytecode, it must be due to differences in language syntax and the relative performance of includes vs. modules/units.
Is the Java compiler fast?
The .java-to-.class translation should be blindingly fast, since it is just a glorified zip with some syntax checking; so, to be fair, compared to a real compiler that does optimization and object code generation, the "translation" from .java to .class is trivial.
I did a comparison with a fairly small "hello world" program against GCC (C/C++/Ada) and found that javac was 30 times slower, and it got even worse at runtime.
Alright, so I have this C++ image capturing class. I was wondering if I could get some help. I know basic C++ (I have done one intro to C and one intro to C++ class) and I have no idea how to do this:
I need to use this class (i.e., create a new C++ project in my solution) and use C# to reference it and use it to save a screenshot of the screen.
When I try to add a new project, I don't know which one to choose (Win32, MFC, ATL, CLR, and so on).
The image capturing class is here:
http://pastebin.com/m72414ab
I don't know if I need to create .h files or whether just a .cpp file would do; honestly, I have no clue where to start. I am looking for a solution, but I want to learn in the process so I don't have to ask next time (not to mention that I like C++, so I am going to continue coding with it).
You cannot easily use C++ classes from C# without some fairly specialized knowledge of C++/CLI; either rewrite your C++ class in C and use P/Invoke, or find a fully C# solution.
But I'd like to use this C++ class for speed and memory.
I question this: unless you are capturing images thousands of times a second, there's no way choosing C++ would be of any benefit, and it makes your program much more complicated. Stick with C# until you know you need the slight performance boost.
.NET version 2.0 and above include a CopyFromScreen method on the Graphics object. You can create an image, obtain a drawing surface (Graphics) for that, and then copy from the screen into the image.
A quick bit of Googling produced a simple tutorial which demonstrates this using Visual Basic .NET, although it's trivial to port to C#.
This solution results in the same GDI+ calls that your C++ version uses, with the benefit of not having to link in external C++ code (which, as mentioned above, is not straightforward).
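For reference, a minimal C# sketch of the CopyFromScreen approach (assuming a single primary monitor; the output path is made up):

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;

class ScreenCapture
{
    static void Main()
    {
        // Capture the primary screen into a Bitmap and save it as a PNG.
        Rectangle bounds = Screen.PrimaryScreen.Bounds;
        using (var bitmap = new Bitmap(bounds.Width, bounds.Height))
        using (var graphics = Graphics.FromImage(bitmap))
        {
            graphics.CopyFromScreen(bounds.Location, Point.Empty, bounds.Size);
            bitmap.Save("screenshot.png", ImageFormat.Png);
        }
    }
}
```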
It is definitely possible to use C++ constructs from C#, but it does require some dirty tricks. If I remember correctly, last time I tried to do this I used a Managed C++ layer between C++ and C#, and I think this is the most common way to do it. This method worked well, but indeed was unnecessarily complicated.