Strange parsing behavior of single closing braces using Roslyn-CTP - c#

We're writing a code generator with Roslyn. Among other things, the user should be able to specify the individual statements of a method body, or of the body of a property getter/setter. To do that, they pass a list of strings to a translation method. When single curly braces are passed as statements, the closing brace somehow gets swallowed.
The method:
internal static SyntaxList<StatementSyntax> GetSyntaxListOfStatementSyntaxs(IEnumerable<string> statements)
{
    if (statements.Any())
    {
        var statementSyntaxs = statements.Select(s => Syntax.ParseStatement(s));
        return Syntax.List(statementSyntaxs);
    }
    return Syntax.List<StatementSyntax>();
}
The input:
var list = new List<string>
{
    "if (this.field != null)",
    "{",
    "this.field = new MyType();",
    "}",
    "return this.field;"
};
The SyntaxList would be used in a new method declaration (last parameter):
var methodDeclarationSyntax = Syntax.MethodDeclaration(
    Syntax.List<AttributeDeclarationSyntax>(),
    Syntax.TokenList(),
    Syntax.IdentifierName("MyType"),
    null,
    Syntax.Identifier("MethodIdentifier"),
    null,
    Syntax.ParameterList(),
    Syntax.List<TypeParameterConstraintClauseSyntax>(),
    Syntax.Block(statementSyntaxList));
I also tried to process the single closing brace separately but I didn't manage to create a statement with only one closing brace.
The weird thing is that the single opening brace gets parsed as a syntax block (correctly or not), but it seems impossible to create that syntax block manually, either for the opening or for the closing brace.
I don't want to add custom parsing of these statements, because we chose Roslyn precisely so that we wouldn't have to do any parsing ourselves. Does someone know how to deal with these special statements? Or maybe somebody can come up with another way to treat this issue. Any help appreciated. Thanks in advance.

The problem is that neither an opening brace nor a closing brace is a statement, so you can't parse them as such.
Roslyn tries to parse even invalid code, which is why you're getting a BlockSyntax when you parse {. But it's an incomplete block, with the closing brace missing.
I think you should parse the whole method body at once. You could do that by joining the lines together into one string and adding an opening and closing brace.
So, the string that you would actually parse as a statement would look like:
{
    if (this.field != null)
    {
        this.field = new MyType();
    }
    return this.field;
}
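Building that string could look roughly like the sketch below. The class and method names (BodyTextBuilder, WrapAsBlock) are made up for illustration, and the sketch deliberately stops before any Roslyn call so that it has no dependency on the CTP assemblies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Roslyn.
public static class BodyTextBuilder
{
    // Joins the individual lines and wraps them in one pair of braces,
    // so the result is the text of a single, complete block statement.
    public static string WrapAsBlock(IEnumerable<string> lines)
    {
        return "{" + Environment.NewLine
             + string.Join(Environment.NewLine, lines) + Environment.NewLine
             + "}";
    }
}
```

The resulting string would then be handed to Syntax.ParseStatement (the call already used in the question) once, instead of once per line, so the lone "{" and "}" entries are only ever seen as part of a complete block.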

Related

Do C# results vary based on curly brace placement?

We are currently using C# and want to know if C# bracket placements can change the results.
In JavaScript it matters, as results vary based on the curly brace placement:
Why do results vary based on curly brace placement?
In JS the opening brace should be kept on the same line, since browsers may otherwise interpret the code incorrectly.
if (x == a)
{
    ...
}

if (x == a) {
    ...
}
Does bracket placement matter for C#?
No, it doesn't.
In JavaScript, you can write code without ending your lines of code with a semicolon, and JavaScript will automatically fill in the missing semicolons when it interprets your code. That's what this answer to the question you linked is essentially stating. That is to say: the brace placement isn't the real issue in JS; it's the ability to write code with/without semicolons and have JS automatically fill these in for you. The brace placement issue is more of a side effect of this functionality.
In C#, a "line" doesn't end until the semicolon is reached (even if that "line" spans multiple physical lines), and writing code without semicolons isn't something that is automagically taken care of for you by the compiler; it will simply fail to compile. The brace placement in C# therefore is unimportant.
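To make that concrete, here is a small sketch (the class and method names are invented for the example). The second method puts a line break directly after return, which is the layout that bites in JavaScript because a semicolon gets inserted there; in C# both methods return the same array.

```csharp
// Hypothetical example type; only the brace and line-break placement differs.
public static class BracePlacement
{
    // Opening braces kept on the same line.
    public static int[] SameLine() {
        return new[] {
            1, 2
        };
    }

    // Opening braces on their own lines, and a line break right after "return".
    public static int[] NextLine()
    {
        return
            new[]
            {
                1, 2
            };
    }
}
```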

Curly brackets vs no curly brackets in lambda expression c#

I am pretty new to C#, around 1 year experience. Recently got introduced to lambda expressions. I want to have an Action<string> which would display an Error with custom Error text to a MessageBox. I am wondering, what is the difference between:
public static Action<string> Error = s => { MessageBox.Show(s, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error); };
and
public static Action<string> Error = s => MessageBox.Show(s, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
Thanks for any helpful advice :)
According to official C# language specification:
8.2 Blocks
A block permits multiple statements to be written in contexts where a single statement is allowed.
block:
{ statement-list_opt }
A block consists of an optional statement-list (§8.2.1), enclosed in braces.
If the statement list is omitted, the block is said to be empty.
A block may contain declaration statements (§8.5).
The scope of a local variable or constant declared in a block is the block.
Within a block, the meaning of a name used in an expression context must always be the same (§7.6.2.1).
A block is executed as follows:
• If the block is empty, control is transferred to the end point of the block.
• If the block is not empty, control is transferred to the statement list.
When and if control reaches the end point of the statement list, control is transferred to the end point of the block.
The statement list of a block is reachable if the block itself is reachable.
The end point of a block is reachable if the block is empty or if the end point of the statement list is reachable.
The difference is purely syntactic; it has no impact on the code that executes. Both notations compile to the same thing.
After an =>, you may write either a block, surrounded by { and }, or a single expression, in which case you can omit the boilerplate curly braces.
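A quick way to see the two forms side by side, as a sketch only: MessageBox is swapped for an in-memory list here so the snippet does not need Windows Forms, and all names (LambdaForms, Log, and so on) are made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example type; Log stands in for the MessageBox from the question.
public static class LambdaForms
{
    public static readonly List<string> Log = new List<string>();

    // Block body: a statement list in braces.
    public static readonly Action<string> ErrorBlock = s => { Log.Add("Error: " + s); };

    // Expression body: a single expression, no braces.
    public static readonly Action<string> ErrorExpr = s => Log.Add("Error: " + s);

    // For a lambda that returns a value, the block form needs an explicit return.
    public static readonly Func<int, int> SquareBlock = x => { return x * x; };
    public static readonly Func<int, int> SquareExpr = x => x * x;
}
```

Calling ErrorBlock and ErrorExpr with the same text adds the same entry to Log, and the two Square delegates return the same value for the same input.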

Curly Braces not recognised when using ASP.NET MVC Razor in VS2015

I've hit a strange problem in VS2015 in a cshtml Razor view. I think it must be a Razor bug, but would appreciate a sanity check and suggestions for a workaround if poss.
This issue is that curly braces within a @{} code block are not resolving properly. I have an if statement within a foreach loop, and if I use curly braces to surround the if action, I get a compilation error. Interestingly enough, curly braces after the else statement seem fine.
This would be easier to demonstrate with VS colour coding, but here goes.
This works:
@{
    var nestLevel = 0;
    foreach (var t in xmlHelper.GetProperties(asm, ViewBag.xmlType, "root"))
    {
        var type = @t.Item3.PropertyType;
        if (xmlHelper.builtInTypes.Contains(type.ToString()))
            <p>@t.Item1 @t.Item2 @t.Item3.PropertyType</p>
        else
        {
            nestLevel++;
        }
    }
} //VS shows the @{} code block ending here as expected
However, if I now add curly braces around the if action, it won't compile:
@{
    var nestLevel = 0;
    foreach (var t in xmlHelper.GetProperties(asm, ViewBag.xmlType, "root"))
    {
        var type = @t.Item3.PropertyType;
        if (xmlHelper.builtInTypes.Contains(type.ToString()))
        {
            <p>@t.Item1 @t.Item2 @t.Item3.PropertyType</p>
        }
        else
        {
            nestLevel++;
        }
    } //VS now incorrectly shows the @{} code block ending here
}
Any thoughts/suggestions?
Remove the @ from this line:
var type = @t.Item3.PropertyType;
You're already in a C# code area, so you don't need the @ to reference variables as you would if you were in an HTML area.
It's okay to do that on the line below, because when you start a line with recognised HTML it assumes a switch back to HTML, and breaks out of C#. So you're effectively in an HTML section there.
<p>@t.Item1 @t.Item2 @t.Item3.PropertyType</p>
Just as an aside, I often end up using this shortcut when I want to force HTML mode so I can output the value of a variable.
@if (t.isExample)
{
    @: @t.OutputThis
}
When you use the @ token in a Razor view, you are telling the engine to stop parsing HTML and start interpreting C#. Razor is clever enough to know when to stop parsing C# because the token you are writing is part of HTML, so you need to tell Razor where it should read your C# code again. That's why, inside your if statement, you need to turn the recognition back on with the @ token. If you want to escape the @ token because you are trying to write a literal @ as HTML, use @@ instead. Hope this helps.

About c# if-else syntax

In C# you can define an if statement without using braces, as in this example:
if (GamePad.GetState(PlayerIndex.One).Buttons.Back == ButtonState.Pressed)
this.Exit();
Here, this.Exit(); is the statement associated with the if. But it's not in braces, so my question is: how is it associated with the if?
I learned that the compiler ignores white space, which does not logically make sense in this case. Is the answer simply that the IDE finds the indent and automatically puts it in braces when it compiles?
The ; ends the statement.
Statements (C# Programming Guide)
A statement can consist of a single line of code that ends in a
semicolon, or a series of single-line statements in a block. A
statement block is enclosed in {} brackets and can contain nested
blocks.
When your code is parsed by the compiler, it breaks each section into a lexical block. The syntax of the 'if' statement is:
if ( Expression ) Statement else Statement
or
if ( Expression ) Statement
A statement can either be a statement block (i.e. enclosed in braces) or a single statement. In your code, the this.Exit() call is associated with the if block by virtue of the fact that the expression has been closed and that 'this.Exit()' conforms to the syntax of a statement.
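A small sketch of what that grammar rule means in practice (the names IfBinding and Run are invented for the example): only the first statement after the condition is the body of the if, and the indentation of the following line has no effect.

```csharp
using System.Collections.Generic;

// Hypothetical example type showing which statement an unbraced if controls.
public static class IfBinding
{
    public static List<string> Run(bool condition)
    {
        var trace = new List<string>();

        if (condition)
            trace.Add("guarded");   // the single statement that belongs to the if
            trace.Add("always");    // misleading indentation: runs regardless of the condition

        return trace;
    }
}
```

With condition set to false the returned list contains only "always"; with true it contains "guarded" followed by "always".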
http://ecma-international.org/ecma-262/5.1/
Braces in C# and Java basically turn multiple statements into a block, which is treated as a scope in that context.
You can put one or more statements inside curly braces, or leave the block empty, even if it holds nothing but comments.
The compiler reads your code character by character, so when it sees an opening { it expects a matching closing }. If it finds more opening braces, it keeps track of the nested blocks.
If there is no opening { after if/else/foreach/for/do/while, the compiler treats the immediately following statement, up to its terminating ;, as the body.
You can even have no statement at all after your if/else/foreach/for/do/while if you immediately put a ;
I have a finding of my own; many people may already know or use it, but since this question touches on it, I'll share it...
There are several uses for { } blocks. In all loops, if-else statements, and even in switch-case, you can use braces to put code in its own scope. I find it really helpful to put the case statements in blocks. If you define a variable in one case, you can't define one with the same name in another case of the same switch... So I use this syntax:
int abc = 1;
switch (abc)
{
    case 1:
        {
            var x = 11;
        }
        break;
    case 2:
        {
            var x = 11; // it's legal: each braced block is its own scope.
        }
        break;
    case 3:
        var x = 11; // it's illegal here, because we already have x in the nested scopes above.
        break;
    case 4:
        {
            var x = 11; // it's illegal here because we already have x in the parent/current scope.
        }
        break;
}
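For comparison, here is a sketch of a variant that should compile cleanly, with every case wrapped in its own braces and nothing declared directly in the switch scope. The names (CaseScopes, Describe, label) are made up for the example.

```csharp
// Hypothetical example type: the same variable name reused in sibling case blocks.
public static class CaseScopes
{
    public static string Describe(int code)
    {
        switch (code)
        {
            case 1:
            {
                var label = "one";   // scoped to this block only
                return label;
            }
            case 2:
            {
                var label = "two";   // a separate variable with the same name
                return label;
            }
            default:
                return "other";
        }
    }
}
```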
You can also declare variables using the same idea:
... some code above
{
    var xx = 10;
}
// xx - is not available as it was declared in the inner scope
{
    var xx = 11; // It's legal, because it's declared in an inner scope.
}
// xx - is again not available as it was declared in the inner scope
... some code below
Summary:
If there is no opening brace { after if/else/foreach/for/do/while, then the immediately following statement is considered to be the body of the if/else/foreach/for/do/while.
You can create as many scopes as you like within your sequential statements in order to reuse the same variable names.
It only works for a single statement, so if you want an if statement to control multiple statements you should use braces.
So the compiler knows that if there is an if statement without braces, it should use (given that the condition is true) the next statement.
No, braces are for grouping multiple statements, so the white space and indentation are irrelevant. You could have the this.Exit() call on the same line as the if statement, and that would still be fine. Some people still prefer braces for single-statement bodies for readability, and that is a matter of choice.
For a single-statement if body, there is no need to put the statement in braces.
If you need to execute more than one statement, braces are required.
In C#, an if statement with braces runs the commands inside the braces when the condition is true. If no braces are given, it runs only the next command when the condition is true and then carries on with the command after that. If the condition is false, it skips that one command and continues with the following one.
The ; marks the end of a statement (its termination point). So when the compiler finds the first ;, it includes that statement in the if body and does not include any later statements in the if clause.

C# method contents validation

I need to validate the contents of a C# method.
I do not care about syntax errors that do not affect the method's scope.
I do care about characters that will invalidate parsing of the rest of the code. For example:
method()
{
/* valid comment */
/* <-- bad
for (i..) {
}
for (i..) { <-- bad
}
I need to validate/fix any non-paired characters.
This includes /* */, { }, and maybe others.
How should I go about this?
My first thought was Regex, but that clearly isn't going to get the job done.
You'll need to scope your problem more carefully in order to get a sensible answer.
For example, what are you going to do about methods that contain preprocessor directives?
void M()
{
#if FOO
for(foo;bar;blah) {
#else
while(abc) {
#endif
Blah();
}
}
This is silly but legal, so you have to handle it. Are you going to count that as a mismatched brace or not?
Can you provide a detailed specification of exactly what you want to determine? As we've seen several times on this site, people cannot successfully build a routine that divides two numbers without a specification. You're talking about analysis that is far more complex than dividing two numbers; the code which does what you're describing in the actual compiler is tens of thousands of lines long.
A regex is certainly not the answer to this problem. Regexes are useful tools for certain types of data validation, but once you get into more complicated data, like matching braces or comment blocks, a regex no longer gets the job done.
Here is a blog article on the limitations encountered when using a regex to validate input.
http://blogs.msdn.com/ianhu/archive/2009/11/16/intellitrace-itrace-files.aspx
In order to do this you will have to write a parser of sorts which does the validation.
A regular expression isn't a very convenient tool for such a task. This is often implemented using a stack, with an algorithm like the following:
Create an empty stack S.
While there are characters left:
    Read a character ch.
    If ch is an opening paren (of any kind), push it onto S.
    Else, if ch is a closing paren (of any kind), look at the top of S:
        If S is empty at this point, report failure.
        If the top of S is the opening paren that corresponds to ch,
        then pop S and continue with the next character; this paren matches OK.
        Else report failure.
If at the end of input the stack S is not empty, return failure.
Else return success.
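In C#, the stack approach above could be sketched roughly as follows. BracketChecker and IsBalanced are made-up names, and the sketch looks only at (), [] and {}: it knows nothing about comments, string literals or preprocessor directives, so it is not a substitute for a real parser.

```csharp
using System.Collections.Generic;

// Hypothetical helper: checks that (), [] and {} are properly nested.
public static class BracketChecker
{
    public static bool IsBalanced(string text)
    {
        // Maps each opening character to the closer it must be matched with.
        var closerFor = new Dictionary<char, char>
        {
            { '(', ')' }, { '[', ']' }, { '{', '}' }
        };
        var stack = new Stack<char>();

        foreach (char ch in text)
        {
            if (closerFor.ContainsKey(ch))
            {
                stack.Push(ch);                       // opening paren: remember it
            }
            else if (ch == ')' || ch == ']' || ch == '}')
            {
                if (stack.Count == 0) return false;   // closer with nothing open
                if (closerFor[stack.Pop()] != ch) return false; // wrong kind of closer
            }
        }

        return stack.Count == 0;                      // anything left open is a failure
    }
}
```

For instance, IsBalanced("{ for (i) { } }") is true, while both "{ ( }" and a lone "}" give false.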
for more information check http://www.ccs.neu.edu/home/sbratus/com1101/lab4.html and http://codeidol.com/csharp/csharpckbk2/Data-Structures-and-Algorithms/Determining-Where-Characters-or-Strings-Do-Not-Balance/
If you're trying to "validate" the contents of a string defining a method, then you may be better off just trying to use the CodeDom classes and compile the method on the fly into an in memory assembly.
Writing your own fully-functional parser to do validation will be very, very difficult, especially if you want to support C# 3 or later. Lambda expressions and other constructs like that will be very difficult to "validate" cleanly.
You're drawing a false dichotomy between "characters that will invalidate parsing of the rest of the code" and "syntax errors". Lacking a closing curly brace (one of the problems you mention) is a syntax error. It looks like you mean you're looking for syntax errors that potentially break scope boundaries? Unfortunately, there's no robust way to do this short of using a full parser.
As an example:
method()
{ <-- is missing closing brace
/* valid comment */
/* <-- bad
for (i..) {
}
for (i..) {
} <-- will be interpreted as the closing brace for the for loop
There's no general, practical way to infer that it's the for loop that's missing its closing brace, rather than the method.
If you're really interested in looking for these sort of things, you should consider running the compiler programmatically and parsing the results - that's the best approach with the lowest entry threshold.
