Easily porting Lua code to C# - c#

is there any easy way to port Lua code to C#?
The biggest problem would probably be to port tables neatly in some dictionaries.
And to prevent any misunderstanding: no I cannot use embedded Lua in my program.

Code designed in such a highly dynamic language like Lua would need substantial refactoring to make sense in a static language like C#- the two serve fundamentally different purposes. You would have to re-write such code again from scratch, realistically, unless it used only the most basic features of any language, like basic numerical/string ops.

There's no easy way to do that.

Universal-transpiler can translate a small subset of Lua into several other languages, including C#. This is an example, written for SWI-Prolog:
:- use_module(library(transpiler)).
:- set_prolog_flag(double_quotes,chars).
:- initialization(main).
main :-
translate("function add(a,b) return a + b end function squared(a) return a*a end function add_exclamation_point(parameter) return parameter .. \"!\" end",'lua','c#',X),
atom_chars(Y,X),
writeln(Y).
This is the C# source code that it generates:
public static int add(int a,int b){
return a+b;
}
public static int squared(int a){
return a*a;
}
public static string add_exclamation_point(string parameter){
return parameter+"!";
}

What do you want to achieve? Convert the lua files to C# code, where you want to work with them extensively, or you just want some code that does similar things than the original code.
For the first type of conversion the answer is that it is quite hard, but not impossible. You have to parse the code and re-create the same (dynamic) functionality in C#. Frameworks, like LinFu.Reflection can help here, because they will add some dynamic functionality to CLR.
For the second type, my idea is to convert the lua bytecode to C# instead of the original code. This shouldn't be too hard, mainly because lua doesn't have much opcodes (around 30 if I remember it correctly). From these opcodes the hardest to convert are the logic and jump operators (because you don't have goto in C#), but if you keep the flow operators intact (and convert them to C# - that is more or less accomplishable), and only compile the code between, and convert the result bytecode to C# should do the job. Of course this way you'll lose a lot from the readibility of the original code, and maintaining it will be much harder.
You might also try to find a solution between these two edge cases I've written here. Some constructs can be easily ported (mainly the loops, and simple arithmetic operators), but fall back to the opcode representation for table handling.

Related

Is more performant to search a small array or use an if statement in c#?

Suppose I'm writing a parser and I need to check if the current token returned by Scanner::NextToken (for example) is one of a small set of values (say 5-10 items; few less or few more).
In this small open source project (https://github.com/gsscoder/exprengine), inside the Parser class I've declared various static arrays that I query with Array::Contains() (see Parser::Ensure() method).
I'm guessing if I can gain in performance using the same technique used in the scanner for check tokens, that is an helper method that uses an if statement (like the following):
private static bool IsLineTerminator(int c)
{
return c == 0x0A || c == 0x0D || c == 0x2028 || c == 0x2029;
}
Or maybe that also in the Scanner, I should use technique used in the Parser?
Any opinion (well motivated) will be appreciated; just don't suggest to generate parser/scanner using tools like ANTLR - I want to keep an hand-written implementation.
Regards, Giacomo
Essentially that's exactly what Array.Contains is doing. You'll have a slightly more involved call stack using Contains as it's not going to be inlined to that degree, but the basic idea of what's happening is the same. It's unlikely you'll see a dramatic performance difference, but by all means profile the two methods and see for yourself. The best way of knowing which method is faster is just to try it, not to ask random strangers.
Another option to consider for an actual algorithm change which would potentially be faster is to use a HashSet as opposed to an array. For only 4 values the speed difference is likely to be small, but a hash-based data structure is specifically designed for much faster searching. (It's at least worth testing that as well). A switch statement will also be implemented as a hash-based solution, so you could consider using that as well.

From .Net to Java

Which would be the best way to write the same code in Java?:
Array.Resize(ref DCamposGraficoOperaciones[index].DatosMeses, 12);
I have this code in C# and I have to put it on in Java. Thanks so much. I have a method called resizeDatosMeses in Java to resize the array but when I try to do it in this way:
DCamposGraficoOperaciones[index].getDatosMeses()=resizeDatosMeses(DCamposGraficoOperaciones[index].getDatosMeses(), 12)
I have a mistake which is: The left-hand side of an assignment must be a variable, please could you advice me?
Thanks so much
I suspect you want:
DCamposGraficoOperaciones[index].setDatosMeses(
resizeDatosMeses(DCamposGraficoOperaciones[index].getDatosMeses(), 12));
You can't assign to the result of a method call, which is what you were trying to do before.
I would strongly encourage you to break this line into two separate statements - and also start following Java naming conventions, which would prohibit DCamposGraficoOperaciones as a variable name.
Also, it's not clear why you're resizing an array to start with, but in both .NET and Java you may well be better off using a higher-level abstraction, e.g. List<T> in .NET or ArrayList<T> in Java.
In java you cant re-size arrays. You can use ArrayList and you can add any number of values to it.

Go To Statement Considered Harmful?

If the statement above is correct, then why when I use reflector on .Net BCL I see it is used a lot?
EDIT: let me rephrase: are all the GO-TO's I see in reflector written by humans or compilers?
I think the following excerpt from the Wikipedia Article on Goto is particularly relevant here:
Probably the most famous criticism of
GOTO is a 1968 letter by Edsger
Dijkstra called Go To Statement
Considered Harmful. In that letter
Dijkstra argued that unrestricted GOTO
statements should be abolished from
higher-level languages because they
complicated the task of analyzing and
verifying the correctness of programs
(particularly those involving loops).
An alternative viewpoint is presented
in Donald Knuth's Structured
Programming with go to Statements
which analyzes many common programming
tasks and finds that in some of them
GOTO is the optimal language construct
to use.
So, on the one hand we have Edsger Dijkstra (a incredibly talented computer scientist) arguing against the use of the GOTO statement, and specifically arguing against the excessive use of the GOTO statement on the grounds that it is a much less structured way of writing code.
On the other hand, we have Donald Knuth (another incredibly talented computer scientist) arguing that using GOTO, especially using it judiciously can actually be the "best" and most optimal construct for a given piece of program code.
Ultimately, IMHO, I believe both men are correct. Dijkstra is correct in that overuse of the GOTO statement certainly makes a piece of code less readable and less structured, and this is certainly true when viewing computer programming from a purely theoretical perspective.
However, Knuth is also correct as, in the "real world", where one must take a pragmatic approach, the GOTO statement when used wisely can indeed be the best choice of language construct to use.
The above isn't really correct - it was a polemical device used by Dijkstra at a time when gotos were about the only flow control structure in use. In fact, several people have produced rebuttals, including Knuth's classic "Structured Programming Using Goto" paper (title from memory). And there are some situations (error handling, state machines) where gotos can produce clearer code (IMHO), than the "structured" alternatives.
These goto's are very often generated by the compiler, especially inside enumerators.
The compiler always knows what she's doing.
If you find yourself in the need to use goto, you should make sure it is the only option. Most often you'll find there's a better solution.
Other than that, there are very few instances the use of goto can be justified, such as when using nested loops. Again, there are other options in this case still. You could break out the inner loop in a function and use a return statement instead. You need to look closely if the additional method call is really too costly.
In response to your edit:
No, not all gotos are compiler generated, but a lot of them result from compiler generated state machines (enumerators), switch case statements or optimized if else structures. There are only a few instances you'll be able to judge whether it was the compiler or the original developer. You can get a good hint by looking at the function/class name, a compiler will generate "forbidden" names to avoid name clashes with your code. If everything looks normal and the code has not been optimized or obfuscated the use of goto is probably intended.
Keep in mind that the code you are seeing in Reflector is a disassembly -- Reflector is looking at the compiled byte codes and trying to piece together the original source code.
With that, you must remember that rules against gotos apply to high-level code. All the constructs that are used to replace gotos (for, while, break, switch etc) all compile down to code using JMPs.
So, Reflector looks at code much like this:
A:
if !(a > b)
goto B;
DoStuff();
goto A;
B: ...
And must realize that it was actually coded as:
while (a > b)
DoStuff();
Sometimes the code being read to too complicated for it to recognize the pattern.
Go To statement itself is not harmful, it is even pretty useful sometimes. Harmful are users who tend to put it in inappropriate places in their code.
When compiled down to assembly code, all control structured and converted to (un)conditional jumps. However, the optimizer may be too powerful, and when the disassembler cannot identify what control structure a jump pattern corresponds to, the always-correct statement, i.e. goto label; will be emitted.
This has nothing to do with the harm(ful|less)ness of goto.
what about a double loop or many nested loops, of which you have break out, for ex.
foreach (KeyValuePair<DateTime, ChangedValues> changedValForDate in _changedValForDates)
{
foreach (KeyValuePair<string, int> TypVal in changedValForDate.Value.TypeVales)
{
RefreshProgress("Daten werden geändert...", count++, false);
if (IsProgressCanceled)
{
goto TheEnd; //I like goto :)
}
}
}
TheEnd:
in this case you should consider that here the following should be done with break:
foreach(KeyValuePair<DateTime, ChangedValues> changedValForDate in _changedValForDates)
{
foreach (KeyValuePair<string, int> TypVal in changedValForDate.Value.TypeVales)
{
RefreshProgress("Daten werden geändert...", count++, false);
if (IsProgressCanceled)
{
break; //I miss goto :|
}
}
if (IsProgressCanceled)
{
break; //I really miss goto now :|
}//waaaAA !! so many brakets, I hate'm
}
The general rule is that you don't need to use goto. As with any rule there are of course exceptions, but as with any exceptions they are few.
The goto command is like a drug. If it's used in limited amounts only in special situations, it's good. If you use too much all the time, it will ruin your life.
When you are looing at the code using Reflector, you are not seeing the actual code. You are seeing code that is recreated from what the compiler produced from the original code. When you see a goto in the recreated code, it's not certain that there was a goto in the original code. There might be a more structured command to control the flow, like a break or a continue which has been implemented by the compiler in the same way as a goto, so that Reflector can't tell the difference.
goto considered harmful (for human to use but
for computers its okay).
because no matter how madly we(human) use goto, compiler always knows how to read the code.
Believe me...
Reading others code with gotos in it is HARD. Reading your own code with gotos in it is HARDER.
That is why you see it used in low level (machine languages) and not in high level (human languages e.g. C#,Python...) ;)
"C provides the infinitely-abusable goto statement, and labels to branch to. Formally, the goto is never necessary, and in practice it is almost always easy to write code without it. We have not used goto in this book."
-- K&R (2nd Ed.) : Page 65
I sometimes use goto when I want to perform a termination action:
static void DoAction(params int[] args)
{
foreach (int arg in args)
{
Console.WriteLine(arg);
if (arg == 93) goto exit;
}
//edit:
if (args.Length > 3) goto exit;
//Do another gazillion actions you might wanna skip.
//etc.
//etc.
exit:
Console.Write("Delete resource or whatever");
}
So instead of hitting return, I send it to the last line that performs another final action I can refer to from various places in the snippet instead of just terminating.
In decompiled code, virtually all gotos that you see will be synthetic. Don't worry about them; they're an artifact of how the code is represented at the low level.
As to valid reasons for putting them in your own code? The main one I can think of is where the language you are using does not provide a control construct suitable for the problem you are tackling; languages which make it easy to make custom control flow systems typically don't have goto at all. It's also always possible to avoid using them at all, but rearranging arbitrarily complex code into a while loop and lots of conditionals with a whole battery of control variables... that can actually make the code even more obscure (and slower too; compilers usually aren't smart enough to pick apart such complexity). The main goal of programming should be to produce a description of a program that is both clear to the computer and to the people reading it.
If it's harmful or not, it's a matter of likes and dislikes of each one. I personally don't like them, and find them very unproductive as they attempt maintainability of the code.
Now, one thing is how gotos affect our reading of the code, while another is how the jitter performs when found one. From Eric Lippert's Blog, I'd like to quote:
We first run a pass to transform loops into gotos and labels.
So, in fact the compiler transforms pretty each flow control structure into goto/label pattern while emitting IL. When reflector reads the IL of the assembly, it recognizes the pattern, and transforms it back to the appropriate flow control structure.
In some cases, when the emitted code is too complicated for reflector to understand, it just shows you the C# code that uses labels and gotos, equivalent to the IL it's reading. This is the case for example when implementing IEnumerable<T> methods with yield return and yield break statements. Those kind of methods get transformed into their own classes implementing the IEnumerable<T> interface using an underlying state machine. I believe in BCL you'll find lots of this cases.
GOTO can be useful, if it's not overused as stated above. Microsoft even uses it in several instances within the .NET Framework itself.
These goto's are very often generated by the compiler, especially inside enumerators. The compiler always knows what she's doing.
If you find yourself in the need to use goto, you should make sure it is the only option. Most often you'll find there's a better solution.
Other than that, there are very few instances the use of goto can be justified, such as when using nested loops. Again, there are other options in this case still. You could break out the inner loop in a function and use a return statement instead. You need to look closely if the additional method call is really too costly.

Where do Label_ markers come from in Reflector and how to decipher them?

I'm trying to understand a method using the disassembly feature of Reflector. As anyone that's used this tool will know, certain code is displayed with C# labels that were (presumably) not used in the original source.
In the 110 line method I'm looking at there are 11 label statements. Random snippet examples:
Label_0076:
if (enumerator.MoveNext())
{
goto Label_008F;
}
if (!base.IsValid)
{
return;
}
goto Label_0219;
Label_0087:
num = 0;
goto Label_01CB;
Label_01CB:
if (num < entityArray.Length)
{
goto Label_0194;
}
goto Label_01AE;
Label_01F3:
num++;
goto Label_01CB;
What sort of code makes Reflector display these labels everywhere and why can't it disassemble them?
Is there a good technique for deciphering them?
Actually, the C# compiler doesn't do much of any optimization - it leaves that to the JIT compiler (or ngen). As such, the IL it generates is pretty consistent and predictable, which is why tools like Reflector are able to decompile IL so effectively. One situation where the compiler does transform your code is in an iterator method. The method you're looking at probably contained something along the lines of:
foreach(var x in something)
if(x.IsValid)
yield return x;
Since the iterator transformation can be pretty complex, Reflector can't really deal with it. To get familiar with what to look for, write your own iterator methods and run them through Reflector to see what kind of IL gets generated based on your C# code. Then you'll know what to look for.
You're looking at code generated by the compiler. The compiler doesn't respect you. No, really. It doesn't respect me or anybody else, either. It looks at our code, scoffs at us, and rewrites it to run as efficiently as possible.
Nested if statements, recursion, "yield"s, case statements, and other code shortcuts will result in weird looking code. And if you're using lambdas with lots of enclosures, well, don't expect it to be pretty.
Any where, any chance the compiler can rewrite your code to make it run faster it will. So there isn't any one "sort of code" that will cause this. Reflector does its best to disassemble, but it can't divine the author's original code from its rewritten version. It does its best (which sometimes is even incorrect!) to translate IL into some form of acceptable code.
If you're having a hard time deciphering it, you could manually edit the code to inline goto's that only get called once and refactor goto's that are called more than once into method calls. Another alternative is to disassemble into another language. The code that translates IL into higher level languages isn't the same. The C++/CLI decompiler may do a better job for you and still be similar enough (find/replace -> with .) to be understandable.
There really isn't a silver bullet for this; at least not until somebody writes a better disassembler plugin.

How will you use the C# 4 dynamic type?

C# 4 will contain a new dynamic keyword that will bring dynamic language features into C#.
How do you plan to use it in your own code, what pattern would you propose ? In which part of your current project will it make your code cleaner or simpler, or enable things you could simply not do (outside of the obvious interop with dynamic languages like IronRuby or IronPython)?
PS : Please if you don't like this C# 4 addition, avoid to bloat comments negatively.
Edit : refocussing the question.
The classic usages of dynamic are well known by most of stackoverflow C# users. What I want to know is if you think of specific new C# patterns where dynamic can be usefully leveraged without losing too much of C# spirit.
Wherever old-fashioned reflection is used now and code readability has been impaired. And, as you say, some Interop scenarios (I occasionally work with COM).
That's pretty much it. If dynamic usage can be avoided, it should be avoided. Compile time checking, performance, etc.
A few weeks ago, I remembered this article. When I first read it, I was frankly appalled. But what I hadn't realised is that I didn't know how to even use an operator on some unknown type. I started wondering what the generated code would be for something like this:
dynamic c = 10;
int b = c * c;
Using regular reflection, you can't use defined operators. It generated quite a bit of code, using some stuff from a Microsoft namespace. Let's just say the above code is a lot easier to read :) It's nice that it works, but it was also very slow: about 10,000 times slower than a regular multiplication (doh), and about 100 times slower than an ICalculator interface with a Multiply method.
Edit - generated code, for those interested:
if (<Test>o__SiteContainer0.<>p__Sitea == null)
<Test>o__SiteContainer0.<>p__Sitea =
CallSite<Func<CallSite, object, object, object>>.Create(
new CSharpBinaryOperationBinder(ExpressionType.Multiply,
false, false, new CSharpArgumentInfo[] {
new CSharpArgumentInfo(CSharpArgumentInfoFlags.None, null),
new CSharpArgumentInfo(CSharpArgumentInfoFlags.None, null) }));
b = <Test>o__SiteContainer0.<>p__Site9.Target(
<Test>o__SiteContainer0.<>p__Site9,
<Test>o__SiteContainer0.<>p__Sitea.Target(
<Test>o__SiteContainer0.<>p__Sitea, c, c));
The dynamic keyword is all about simplifying the code required for two scenarios:
C# to COM interop
C# to dynamic language (JavaScript, etc.) interop
While it could be used outside of those scenarios, it probably shouldn't be.
Recently I have blogged about dynamic types in C# 4.0 and among others I mentioned some of its potential uses as well as some of its pitfalls. The article itself is a bit too big to fit in here, but you can read it in full at this address.
As a summary, here are a few useful use cases (except the obvious one of interoping with COM libraries and dynamic languages like IronPython):
reading a random XML or JSON into a dynamic C# object. The .Net framework contains classes and attributes for easily deserializing XML and JSON documents into C# objects, but only if their structure is static. If they are dynamic and you need to discover their fields at runtime, they can could only be deserialized into dynamic objects. .Net does not offer this functionality by default, but it can be done by 3rd party tools like jsonfx or DynamicJson
return anonymous types from methods. Anonymous types have their scope constrained to the method where they are defined, but that can be overcome with the help of dynamic. Of course, this is a dangerous thing to do, since you will be exposing objects with a dynamic structure (with no compile time checking), but it might be useful in some cases. For example the following method reads only two columns from a DB table using Linq to SQL and returns the result:
public static List<dynamic> GetEmployees()
{
List<Employee> source = GenerateEmployeeCollection();
var queyResult = from employee in source
where employee.Age > 20
select new { employee.FirstName, employee.Age };
return queyResult.ToList<dynamic>();
}
create REST WCF services that returns dynamic data. That might be useful in the following scenario. Consider that you have a web method that returns user related data. However, your service exposes quite a lot of info about users and it will not be efficient to just return all of them all of the time. It would be better if you would be able to allow consumers to specify the fields that they actually need, like with the following URL
http://api.example.com/users?userId=xxxx&fields=firstName,lastName,age
The problem then comes from the fact that WCF will only return to clients responses made out of serialized objects. If the objects are static then there would be no way to return dynamic responses so dynamic types need to be used. There is however one last problem in here and that is that by default dynamic types are not serializable. In the article there is a code sample that shows how to overcome this (again, I am not posting it here because of its size).
In the end, you might notice that two of the use cases I mentioned require some workarounds or 3rd party tools. This makes me think that while the .Net team has added a very cool feature to the framework, they might have only added it with COM and dynamic languages interop in mind. That would be a shame because dynamic languages have some strong advantages and providing them on a platform that combines them with the strengths of strong typed languages would probably put .Net and C# ahead of the other development platforms.
Miguel de Icaza presented a very cool use case on his blog, here (source included):
dynamic d = new PInvoke ("libc");
d.printf ("I have been clicked %d times", times);
If it is possible to do this in a safe and reliable way, that would be awesome for native code interop.
This will also allow us to avoid having to use the visitor pattern in certain cases as multi-dispatch will now be possible
public class MySpecialFunctions
{
public void Execute(int x) {...}
public void Execute(string x) {...}
public void Execute(long x) {...}
}
dynamic x = getx();
var myFunc = new MySpecialFunctions();
myFunc.Execute(x);
...will call the best method match at runtime, instead of being worked out at compile time
I will use it to simplify my code which deals with COM/Interop where before I had to specify the member to invoke, its parameters etc. (basically where the compiler didn't know about the existence of a function and I needed to describe it at compile time). With dynamic this gets less cumbersome and the code gets leaner.

Categories

Resources