Dynamically evaluating statement with multiple equality operators - c#

I've seen quite a few expression parsers / tokenizers that can take a certain string and evaluate the result. For example, you could pump the string:
4+4
into the following code:
MSScriptControl.ScriptControl sc = new MSScriptControl.ScriptControl();
// You always need to initialize a language engine
sc.Language = "VBScript";
// This is the expression - in a real program it will probably be
// read from a textbox control
string expr = "4+4";
// Eval returns an untyped result, so convert it explicitly
double res = Convert.ToDouble(sc.Eval(expr));
and get 8. But, is there a parsing tool out there that can evaluate the string:
4 = 4 = 4
? So far, every example I have tried fails with an error about not being able to compare a double and a boolean (which makes sense from a compiler's perspective, but from a human perspective we can see that this is true). Has anyone come across something that can achieve this?

From a human perspective, this is only true if we think of x = y = z as a special operator (with three operands), where it implies x = y, y = z, x = z. That is a specific syntactical interpretation of the expression. A human (particularly a programmer) could also interpret it the same way most compilers do, which is to choose the left-most grouping ( x = y ) and then compare the result of that comparison (a boolean value) to z. Even to a human, this doesn't make sense under this syntax. It only seems obvious from a human perspective because humans are notoriously fuzzy when it comes to choosing a syntax that 'makes sense' for a given context.
If you really want that level of 'fuzziness', you'll need to look into something like Wolfram Alpha, which performs contextual analysis to try to find a best guess for the meaning of the expression. If you enter '4 = 4 = 4' there, it will reply True.
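The "special operator" reading described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class and method names are made up, and each operand is assumed to be a plain numeric literal. In a real program each operand would be handed to whatever expression evaluator you already use.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class ChainedEquality
{
    // Treats "a = b = c" as one n-ary operator:
    // true only if every operand equals its left neighbour.
    public static bool Evaluate(string expression)
    {
        // Assumption: operands are numeric literals separated by '='.
        double[] operands = expression
            .Split('=')
            .Select(part => double.Parse(part.Trim(), CultureInfo.InvariantCulture))
            .ToArray();

        for (int i = 1; i < operands.Length; i++)
        {
            if (operands[i] != operands[i - 1])
                return false;
        }
        return true;
    }
}
```

With this definition, ChainedEquality.Evaluate("4 = 4 = 4") returns true and ChainedEquality.Evaluate("4 = 4 = 5") returns false, because the string is never reduced to a "boolean = number" comparison.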

You need to define a syntax for your "language" and build a parser, as your expected behavior is not covered by normal expression syntax (and normal languages should evaluate it to "false", since every language I have heard of implements = as a binary operation and hence will end up with "4 = true" at some point). There are tools for building parsers in C#...
Side note: matching "a human perspective" is an insanely hard problem that is still not solved even for human-to-human communication :).

try with
var result = (int) HtmlPage.Window.Eval("4 + 4");

Related

Math.net Powers of Trigonometric Functions in C#

How do I get powers of a trig function in Math.net?
Expr x = Expr.Variable("x");
Expr g = (2 * x).Sinh().Pow(2);
g.ToString() gives the output: (sinh(2*x))^2
What I want is sinh^2(2*x)
How do I do that?
Edit:
As per Christoph's comment below this can be done in v0.21.0 which implements Expr.ToCustomString(true)
It does not currently support this notation. However, I see three options:
1. We define powers of trigonometric functions as new functions, or make them parametric. This is something I do not wish to do.
2. We introduce un-applied functions as a first-class concept, which can be manipulated in such ways. This is something we likely want to explore at some point, but it is a larger topic and likely overkill for what you need.
3. Extend our visual expressions to support positive integer powers of functions. This is something which we could implement.
Option 3 would save one set of parentheses in such expressions, which would lead to a more compact rendering, especially also for the LaTeX formatter where the result would be more readable. When building a visual expression from an expression, it would automatically pull positive integer powers to the applied function.
To my understanding your concern is only about how it is printed, so it seems to me this would solve your problem as well?

string(";P") is bigger or string("-_-") is bigger?

I found something very confusing when sorting a text file: different algorithms and applications produce different results, for example when comparing the two strings str1=";P" and str2="-_-".
For reference, here are the ASCII codes for each character in those strings:
char(';') = 59; char('P') = 80;
char('-') = 45; char('_') = 95;
So I've tried different methods to determine which string is bigger. Here are my results:
1. Microsoft Office Excel Sort command:
";P" < "-_-"
2. C++ std::string::compare(string &str2), i.e. str1.compare(str2):
";P" > "-_-"
3. C# string.CompareTo(), i.e. str1.CompareTo(str2):
";P" < "-_-"
4. C# string.CompareOrdinal(), i.e. CompareOrdinal(w1, w2):
";P" > "-_-"
As shown, the results vary! My intuition agrees with methods 2 and 4, since ASCII(';') = 59 is larger than ASCII('-') = 45.
So I have no idea why Excel and C# string.CompareTo() give the opposite answer. I noticed that in C# the second comparison function is named string.CompareOrdinal(). Does this imply that the default C# string.CompareTo() function is not "ordinal"?
Could anyone explain this inconsistency?
And could anyone explain why, with CultureInfo = {en-US}, it reports ";P" < "-_-"? What is the underlying motivation or principle? I have also heard about double multiplication behaving differently under different CultureInfo settings. It's rather a culture shock!
std::string::compare: "the result of a character comparison depends only on its character code". It's simply ordinal.
String.CompareTo: "performs a word (case-sensitive and culture-sensitive) comparison using the current culture". So this is not ordinal, since typical users don't expect things to be sorted by raw character code.
String.CompareOrdinal: Per the name, "performs a case-sensitive comparison using ordinal sort rules".
EDIT: CompareOptions has a hint: "For example, the hyphen ("-") might have a very small weight assigned to it so that "coop" and "co-op" appear next to each other in a sorted list."
Excel 2003 (and earlier) does a sort ignoring hyphens and apostrophes, so your sort really compares ; to _, which gives the result that you have. Here's a Microsoft Support link about it. Pretty sparse, but enough to get the point across.
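To see the two behaviours side by side, here is a minimal sketch that makes the comparison rule explicit instead of relying on the default. The class and method names are made up for illustration; string.Compare, StringComparison.Ordinal, CultureInfo.GetCultureInfo and CompareOptions are the standard .NET APIs.

```csharp
using System;
using System.Globalization;

// Hypothetical wrapper class, for illustration only.
public static class ComparisonDemo
{
    // Ordinal: compares raw character codes, so ';' (59) sorts after '-' (45).
    public static int Ordinal(string a, string b) =>
        string.Compare(a, b, StringComparison.Ordinal);

    // Culture-sensitive: uses the linguistic sort weights of the named culture.
    // The sign of the result can depend on the platform's globalization data.
    public static int Linguistic(string a, string b, string cultureName) =>
        string.Compare(a, b, CultureInfo.GetCultureInfo(cultureName), CompareOptions.None);
}
```

ComparisonDemo.Ordinal(";P", "-_-") is positive, matching the std::string::compare and CompareOrdinal results. ComparisonDemo.Linguistic(";P", "-_-", "en-US") is the call to experiment with if you want to reproduce the CompareTo result on your own machine.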

Clean and natural scripting functionality without parsing

I'm experimenting with creating a semi-natural scripting language, mostly for my own learning purposes, and for fun. The catch is that it needs to be in native C#, no parsing or lexical analysis on my part, so whatever I do needs to be able to be done through normal syntactical sugar.
I want it to read somewhat like a sentence would, so that it is easy to read and learn, especially for those that aren't especially fluent with programming, but I also want the full functionality of native code available to the user.
For example, in the perfect world it would look like a natural language (English in this case):
When an enemy is within 10 units of player, the enemy attacks the player
In C#, allowing a sentence like this to actually do what the scripter intends would almost certainly require that it be a string that is run through a parser and lexical analyzer. My goal isn't to have something this natural, and I don't want the scripter to be using strings to script. I want the scripter to have full access to C#, and to have things like syntax highlighting, IntelliSense, debugging in the IDE, etc. So what I'm trying to get is something that reads easily but is native C#. A couple of the major hurdles that I don't see a way to overcome are getting rid of periods (.), commas (,), and the parentheses for empty method calls (()). For example, something like this is feasible but doesn't read very cleanly:
// C#
When(Enemy.Condition(Conditions.isWithinDistance(Enemy, Player, 10))), Event(Attack(Enemy, Player))
Using a language like Scala you can actually get much closer, because periods and parentheses can be replaced by a single whitespace in many cases. For example, you could take the above statement and make it look something like this in Scala:
// Scala
When(Enemy is WithinDistance(Player, 10)) => Then(Attack From(Enemy, Player))
The above code would actually compile, assuming you set up your engine to handle it; in fact, you might be able to coax further parentheses and commas out of it. Without the syntactical sugar, the above example would look more like this in Scala:
// Scala (without syntactical sugar)
When(Enemy.is(WithinDistance(Player, 10))) => Then(Attack().From(Enemy, Player))
The bottom line is I want to get as close as possible to something like the first scala example using native C#. It may be that there is really nothing I can do, but I'm willing to try any tricks that may be possible to make it read more natural, and get the periods, parentheses, and commas out of there (except when they make sense even in natural language).
I'm not as experienced with C# as with other languages, so I might not know about some syntax tricks that are available, like macros in C++. Not that macros would actually be a good solution: they would probably cause more problems than they would solve, and would be a debugging nightmare. But you get where I'm going with this; at least in C++ it would be feasible. Is what I'm wanting even possible in C#?
Here's an example: using LINQ and lambda expressions you can sometimes get the same amount of work done with fewer lines, fewer symbols, and code that reads closer to English. For example, here are three collisions that happen between pairs of objects with IDs. We want to gather all collisions involving the object that has ID 5, sort those collisions by the "first" ID in the pair, and then output the pairs. Here is how you would do this without LINQ and/or lambda expressions:
using System;
using System.Collections;
using System.Collections.Generic;
struct CollisionPair : IComparable, IComparer
{
    public int first;
    public int second;
    // Since we're sorting we'll need to write our own comparer
    int IComparer.Compare( object one, object two )
    {
        CollisionPair pairOne = (CollisionPair)one;
        CollisionPair pairTwo = (CollisionPair)two;
        if (pairOne.first < pairTwo.first)
            return -1;
        else if (pairTwo.first < pairOne.first)
            return 1;
        else
            return 0;
    }
    // ...and our own comparable
    int IComparable.CompareTo( object two )
    {
        CollisionPair pairTwo = (CollisionPair)two;
        if (this.first < pairTwo.first)
            return -1;
        else if (pairTwo.first < this.first)
            return 1;
        else
            return 0;
    }
}
class Program
{
    static void Main( string[] args )
    {
        List<CollisionPair> collisions = new List<CollisionPair>
        {
            new CollisionPair { first = 1, second = 5 },
            new CollisionPair { first = 2, second = 3 },
            new CollisionPair { first = 5, second = 4 }
        };
        // In a script this would be all the code you needed, everything above
        // would be part of the game engine
        List<CollisionPair> sortedCollisionsWithFive = new List<CollisionPair>();
        foreach (CollisionPair c in collisions)
        {
            if (c.first == 5 || c.second == 5)
            {
                sortedCollisionsWithFive.Add(c);
            }
        }
        sortedCollisionsWithFive.Sort();
        foreach (CollisionPair c in sortedCollisionsWithFive)
        {
            Console.WriteLine("Collision between " + c.first +
                              " and " + c.second);
        }
    }
}
And now the same example with LINQ and lambdas. Notice that in this example we don't have to bother with making CollisionPair both IComparable and IComparer, and don't have to implement the Compare and CompareTo methods:
using System;
using System.Collections.Generic;
using System.Linq;
struct CollisionPair
{
    public int first;
    public int second;
}
class Program
{
    static void Main( string[] args )
    {
        List<CollisionPair> collisions = new List<CollisionPair>
        {
            new CollisionPair { first = 1, second = 5 },
            new CollisionPair { first = 2, second = 3 },
            new CollisionPair { first = 5, second = 4 }
        };
        // In a script this would be all the code you needed, everything above
        // would be part of the game engine
        // (ToList() is needed because ForEach is defined on List<T>, not on IEnumerable<T>)
        (from c in collisions
         where ( c.first == 5 || c.second == 5 )
         orderby c.first select c).ToList().ForEach(c =>
            Console.WriteLine("Collision between " + c.first +
                              " and " + c.second));
    }
}
In the end we're left with a LINQ and Lambda expression that read closer to natural language, and are much less code for both a game engine and for the script. These kinds of changes are really what I'm looking for, but obviously LINQ and Lambda are both limited to specific syntax, not something as generic as I would like in the end.
Another approach would be to use the fluent interface pattern and implement something like:
When(enemy).IsWithin(10.units()).Of(player).Then(enemy).Attacks(player);
If you make functions like When, IsWithin, Of and Then return interfaces, you will be able to easily add new extension methods to expand your rules language.
For example let's take a look at function Then:
// Extension methods must be static and live in a static class
public static IActiveActor Then(this ICondition condition, Actor actor) {
    /* keep the actor, etc */
}
public static void Attacks(this IActiveActor who, Actor whom) {
    /* your business logic */
}
In the future it would be easy to implement another function, say RunAway(), without changing anything in your existing code:
public static void RunAway(this IActiveActor who) {
    /* perform runaway logic */
}
With this little addition you will be able to write:
When(player).IsWithin(10.units()).Of(enemy).Then(player).RunAway();
The same goes for conditions: assuming When returns something like ICheckActor, you can introduce new conditions by simply defining new functions:
public static ICondition IsStrongerThan(this ICheckActor me, Actor anotherGuy) {
    if (CompareStrength(me, anotherGuy) > 0)
        return TrueActorCondition(me);
    else
        return FalseActorCondition(me);
}
so now you can do:
When(player)
    .IsWithin(10.units()).Of(enemy)
    .And(player).IsStrongerThan(enemy)
    .Then(player)
    .Attacks(enemy);
or
When(player)
    .IsWithin(10.units()).Of(enemy)
    .And(enemy).IsStrongerThan(player)
    .Then(player)
    .RunAway();
The point is that you can improve your language without experiencing heavy impact on the code you already have.
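To make the idea concrete, here is a minimal compilable sketch of such a chain. Everything in it is hypothetical and simplified: the type names (Actor, Check, Condition, ActiveActor, Rules) are made up, positions are plain integers on a line, and the distance is passed as an int rather than through a units() helper. It is one possible shape for the pattern, not a definitive design.

```csharp
using System;

// Hypothetical types for illustration only.
public sealed class Actor
{
    public string Name { get; }
    public int Position { get; }
    public Actor(string name, int position) { Name = name; Position = position; }
}

// Carries the actor being tested, plus the distance once IsWithin has been called.
public sealed class Check
{
    public Actor Subject { get; }
    public int Distance { get; }
    public Check(Actor subject, int distance = 0) { Subject = subject; Distance = distance; }
}

// Result of a completed condition.
public sealed class Condition
{
    public bool Holds { get; }
    public Condition(bool holds) { Holds = holds; }
}

// An actor that acts only if the preceding condition held.
public sealed class ActiveActor
{
    public Actor Who { get; }
    public bool Enabled { get; }
    public ActiveActor(Actor who, bool enabled) { Who = who; Enabled = enabled; }
}

public static class Rules
{
    public static Check When(Actor subject) => new Check(subject);

    public static Check IsWithin(this Check check, int distance) =>
        new Check(check.Subject, distance);

    public static Condition Of(this Check check, Actor other) =>
        new Condition(Math.Abs(check.Subject.Position - other.Position) <= check.Distance);

    public static ActiveActor Then(this Condition condition, Actor actor) =>
        new ActiveActor(actor, condition.Holds);

    // Returns a description instead of printing, so the outcome is easy to inspect.
    public static string Attacks(this ActiveActor who, Actor whom) =>
        who.Enabled ? who.Who.Name + " attacks " + whom.Name : "nothing happens";
}
```

With an enemy at position 3 and a player at position 8, Rules.When(enemy).IsWithin(10).Of(player).Then(enemy).Attacks(player) returns "enemy attacks player". Adding `using static Rules;` (C# 6 and later) lets the chain start with a bare When(enemy). Each step returns a different type, which is what lets the compiler and IntelliSense restrict what may follow.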
Honestly I don't think this is a good direction for a language. Take a look at AppleScript sometime. They went to great pains to mimic natural language, and in trivial examples you can write AppleScript that reads like English. In real usage, it's a nightmare. It's awkward and cumbersome to use. And it's hard to learn, because people have a very hard time with "just write this incredibly limited subset of English with no deviations from the set pattern." It's easier to learn real C# syntax, which is regular and predictable.
I don't quite understand your requirement of "written in native C#". Why? Perhaps you want it to be written in native .NET? I can understand this, as you can compile rules written in "plain English" into .NET with no parsing etc. Then your engine (probably written in C#) will be able to use these rules, evaluate them, etc. Because it is all .NET, it doesn't really matter which language the developer used.
Now, if C# is not really a requirement, then we can stop figuring out how to make "ugly-ugly" syntax look "just ugly" :)
We can look at, for example, F#. It compiles into .NET in the same way C# or VB.NET do, but it is more suitable for solving problems like yours.
You gave us 3 (ugly looking) examples in C# and Scala; here is one in F# that I managed to write off the top of my head in 5 minutes:
When enemy (within 10<unit> player) (Then enemy attacks player)
I only spent 5 minutes, so probably it can be even prettier.
No parsing is involved, When, within, Then, attacks are just normal .NET functions (written in F#).
Here is all the code I had to write to make it possible:
[<Measure>] type unit
type Position = int<unit>
type Actor =
    | Enemy of Position
    | Player of Position
let getPosition actor =
    match actor with
    | Enemy x -> x
    | Player x -> x
let When actor condition positiveAction =
    if condition actor
    then positiveAction
    else ()
let Then actor action = action actor
let within distance actor1 actor2 =
    let pos1 = getPosition actor1
    let pos2 = getPosition actor2
    abs (pos1 - pos2) <= distance
let attacks victim agressor =
    printfn "%s attacks %s" (agressor.GetType().Name) (victim.GetType().Name)
This is really it, not hundreds and hundreds of lines of code you would probably write in C# :)
This is the beauty of .NET: you can use appropriate languages for appropriate tasks. And F# is a good language for DSLs (just what you need here).
P.S. You can even define functions like "an", "the", "in", etc to make it look more like English (these functions will do nothing but return their first argument):
let an something = something
let the = an
let is = an
Good luck!

Is there any plugin for VS or program to show type and value etc... of a C# code selection?

What I want to do is be told the type, value (if there is one at compile-time) and other information (I do not know what I need now) of a selection of an expression.
For example, if I have an expression like
int i = unchecked((short)0xFF);
selecting 0xFF will give me (Int32, 255), while selecting ((short)0xFF) will give me (Int16, 255), and selecting i will give me (Int32, 255).
The reason I want such a feature is to be able to verify my assumptions. It's pretty easy to assume that 0xFF is a byte, but it is actually an int. I could of course refer to the C# Language Specification all the time, but I think it's inefficient to have to refer to it every time I want to check something. I could also use something like ANTLR, but the learning curve is high.
I do intend to read the entire specs and learn ANTLR and about compilers, but that's for later. Right now I wish to have tools to help me get the job done quickly and accurately.
Another case in point:
int? i = 0x10;
int? j = null;
int x;
x = (i >> 4) ?? -1;//x=1
x = (j >> 4) ?? -1;//x=-1
The bottom two lines in the code above may seem easy or even natural to you. (Maybe one should avoid code like this, but that's another story.) However, what MSDN says about the null-coalescing operator lacks the information to tell me that the above code ((i>>4)??) is legal (yet it is). I had to dig into the grammar in the specs to know what's happening:
null-coalescing-expression
conditional-or-expression
conditional-and-expression
exclusive-or-expression
and-expression
equality-expression
relational-expression
shift-expression
shift-expression right-shift additive-expression
... (and more)
Only after reading so much can I get a satisfactory confirmation that it is valid code and does what I think it does. There should be a much simpler way for the average programmer to verify such code (not its validity, but whether it behaves as expected, and also to satisfy my curiosity) without having to dive into that canonical manual. It doesn't necessarily have to be a VS plugin. Any alternative that is intuitive to use will do just as well.
Well, I'm not aware of any add-ins that do what you describe. However, there is a trick you can use to figure out the type of an expression (but not the compile-time value):
Assign the expression to a var variable, and hover your mouse over the keyword var.
So for example, when you write:
var i = unchecked((short)0xFF);
and then hover your mouse over the keyword var, you get a tooltip that says something like:
Struct System.Int16
Represents a 16-bit signed integer.
This is definitely a bit awkward, since you potentially have to change code to make it work. But in a pinch, it lets you get the compiler to figure out the type of an expression for you.
Keep in mind, this approach doesn't really help you once you start throwing casts into the picture. For instance:
object a = 0xFF;
var z = (string)a; // compiles but fails at runtime!
In the example above, the IDE will dutifully report that the type of var z is System.String - but this is, of course, entirely wrong.
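A related trick that works without the IDE is to let generic type inference report the compile-time type. The sketch below is an assumption-laden illustration: TypeProbe, StaticTypeOf and TypeProbeDemo are made-up names, and the helper reports only the static type, never the compile-time value.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class TypeProbe
{
    // T is inferred from the compile-time type of the argument expression,
    // so typeof(T) reports the static type rather than the runtime type.
    public static Type StaticTypeOf<T>(T value) => typeof(T);
}

public static class TypeProbeDemo
{
    public static string[] Describe()
    {
        int? i = 0x10;
        return new[]
        {
            TypeProbe.StaticTypeOf(0xFF).Name,                    // Int32
            TypeProbe.StaticTypeOf(unchecked((short)0xFF)).Name,  // Int16
            TypeProbe.StaticTypeOf(i >> 4).Name,                  // Nullable`1
            TypeProbe.StaticTypeOf((i >> 4) ?? -1).Name           // Int32
        };
    }
}
```

Like the var trick, this needs a small amount of throwaway code, but the results can be printed or asserted on, which is handy when checking several assumptions at once.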
Your question is a little vague on what you are looking for, so I don't know if "improved" intellisense solves it, but I would try the Productivity Power Tools.

Semicolons in C#

Why are semicolons necessary at the end of each line in C#?
Why can't the compiler just know where each line ends?
Having a terminator character lets you break a statement across multiple lines.
On the other hand, languages like VB have a line continuation character (and may raise a compile error for a semicolon). I personally think it's much cleaner to terminate statements with a semicolon than to continue them with an underscore.
Finally, languages like JavaScript (JS) and Swift have optional semicolons, but at least JS has a convention of always writing them (even where not required, which prevents accidents).
No, the compiler doesn't know that a line break is meant to terminate a statement, nor should it. This allows you to carry a statement over multiple lines if you like.
See:
string sql = @"SELECT foo
FROM bar
WHERE baz=42";
Or how about large method overloads:
CallMyMethod(thisIsSomethingForArgument1,
thisIsSomethingForArgument2,
thisIsSomethingForArgument2,
thisIsSomethingForArgument3,
thisIsSomethingForArgument4,
thisIsSomethingForArgument5,
thisIsSomethingForArgument6);
And the reverse, the semi-colon also allows multi-statement lines:
string s = ""; int i = 0;
How many statements is this?
for (int i = 0; i < 100; i++) // <--- should there be a semi-colon here?
Console.WriteLine("foo")
Semicolons are needed to eliminate ambiguity.
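The for loop question above has two legal answers, and the semicolon is what selects between them. A minimal sketch (the class and method names are made up for illustration, and a counter stands in for Console.WriteLine so the difference is measurable):

```csharp
// Hypothetical demo class, for illustration only.
public static class ForLoopDemo
{
    // No semicolon after the header: the next statement is the loop body.
    public static int BodyRunsEachIteration()
    {
        int calls = 0;
        for (int i = 0; i < 100; i++)
            calls++;                     // runs 100 times
        return calls;
    }

    // Semicolon after the header: the loop body is an empty statement.
    public static int EmptyBodyThenOneStatement()
    {
        int calls = 0;
        for (int i = 0; i < 100; i++) ;  // empty body, loop does nothing
        calls++;                         // runs once, after the loop
        return calls;
    }
}
```

The first method returns 100 and the second returns 1. The C# compiler accepts both, although it warns about the second as a possibly mistaken empty statement.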
So that whitespace isn't significant except inside identifiers and keywords and such.
I personally agree with having a distinct character as a line terminator. It makes it much easier for the compiler to figure out what you are trying to do.
And contrary to popular belief, it is not possible 100% of the time for the compiler to figure out where one statement ends and another begins without assistance! There are edge cases where it is ambiguous whether code is a single statement or multiple statements spanning several lines.
Read this article from Paul Vick, the technical lead of Visual Basic, to see why it's not as easy as it sounds.
Strictly speaking, this is true: if a human could figure out where a statement ends, so can the compiler. This hasn't really caught on yet, and few languages implement anything of that kind. The next version of VB will probably be the first language to implement a proper handling of statements that require neither explicit termination nor line continuation [source]. This would allow code like this:
Dim a = OneVeryLongExpression +
AnotherLongExpression
Dim b = 2 * a
Let's keep our fingers crossed.
On the other hand, this does make parsing much harder and can potentially result in poor error messages (see Haskell).
That said, the reason for C# to use a C-like syntax was probably due to marketing reasons more than anything else: people are already familiar with languages like C, C++ and Java. No need to introduce yet another syntax. This makes sense for a variety of reasons but it obviously inherits a lot of weaknesses from these languages.
It can be done. What you refer to is called "semicolon insertion". JavaScript does it with much success; why it is not applied in C# is up to its designers. Maybe they did not know about it, or feared it might cause confusion among programmers.
For more details on semicolon insertion in JavaScript, please refer to the ECMAScript standard (ECMA-262), where JavaScript is specified.
I quote from page 22 (in the PDF, page 34):
When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (section 12.6.3).
[...]
The specification document even contains examples!
Another good reason for semicolons is to isolate syntax errors. When syntax errors occur the semicolons allow the compiler to get back on track so that something like
a = b + c = d
can be disambiguated between
a = b + c; = d
with the error in the second statement or
a = b + ; c = d
with the error in the first statement. Without the semicolons, it can be impossible to say where a statement ends in the presence of a syntax error. A missing parenthesis might mean that the entire latter half of your program may be considered one giant syntax error rather than being syntax checked line by line.
It also helps the other way - if you meant to write
a = b; c = d;
but typoed and left out the "c" then without semis it would look like
a = b = d
which is valid, and you'd have a running program with a bad and difficult-to-locate bug. So semicolons can often help catch errors that would otherwise look like valid syntax. Also, I agree with everybody on readability. I don't like working in languages without some sort of statement terminator for that reason.
I've been mulling this question a bit and if I may take a guess at the motivations of the language designers:
C# obviously has semicolons because of its heritage from C. I've been rereading the K&R book lately and it's pretty obvious that Dennis Ritchie really didn't want to force programmers to code the way he thought was best. The book is rife with comments like, "Although we are not dogmatic about the matter, it does seem that goto statements should be used rarely, if at all" and in the section on functions they mention that they chose one of many format styles, it doesn't matter which one you pick, just be consistent.
So the use of an explicit statement terminator allows the programmer to format their code however they like. For better or worse, it seems consistent with how C was originally designed: do it your way.
I would say that the biggest reason that semicolons are necessary after each statement is familiarity for programmers already familiar with C, C++, and/or Java. C# inherits many syntactical choices from those languages and is not simply named similarly to them. Semicolon-terminated statements is just one of the many syntax choices borrowed from those languages.
Semicolons are a remnant of the C language, when programmers often wanted to save space by combining statements on one line, e.g.
int i; for( i = 0; i < 10; i++ ) { printf("hello world.\n"); printf("%d instance.\n", i); }
It also helped the compiler, which was not smart enough to simply infer the end of a statement. Combining statements on one line is frowned upon by most C# developers for readability reasons. The above is typically written like so:
int i;
for( i = 0; i < 10; i++ )
{
    printf("hello world.\n");
    printf("%d instance.\n", i);
}
Very verbose! For modern languages, compilers can easily be developed to infer end of statements. C# could be altered into another language which uses no unnecessary delimiters other than a space and indenting tab, i.e.
int i
for i=0 i<10 i++
    printf "hello world.\n"
    printf "%d instance.\n" i
That would certainly save some typing and it looks neater. If indents are used rather than spaces, the code becomes much more readable. We can do one better if we allow types to be inferred and make a special case of for, to read (for [value]=[initial value] to [final value]):
for i=1 to 10 // i is inferred to be an integer
    printf "hello world.\n"
    printf "%d instance.\n" i
Now it's beginning to look like F#, and F#, in some ways, is almost like C# without the unnecessary punctuation. However, F# lacks so many extras (like special .NET language constructs, code completion and good IntelliSense). So, in the end, F# can be more work than C# or VB.NET to implement, sadly.
Personally, my work required VB.NET and I have been happier not having to deal with semi-colons. C# is a dated language. Linq has allowed me to cut down on the number of lines of code I have to write. Still, if I had the time, I would write a version of c# which had many of the features of f#.
You could accurately argue that requiring a semicolon to terminate a statement is superfluous. It is technically possible to remove the semicolon from the C# language and still have it work. The problem is that this leaves room for misinterpretation by humans. I would argue that semicolons are necessary for disambiguation for the sake of humans, not the compiler. Without some form of statement delimitation, it is much harder for humans to interpret concise statements such as this:
int i = someFlag ? 12 : 5 int j = i + 3
The compiler should be able to handle this just fine, but to a human the below looks much better
int i = someFlag ? 12 : 5; int j = i + 3;
