Irony: How to give KeyTerm precedence over variable? - c#

Relevant chunk of Irony grammar:
var VARIABLE = new RegexBasedTerminal("variable", @"(?-i)\$?\w+");
variable.Rule = VARIABLE;
tag_blk.Rule = html_tag_kw + attr_args_opt + block;
term_simple.Rule = NUMBER | STRING | variable | boolean | "null" | term_list;
term.Rule = term_simple | term_filter;
block.Rule = statement_list | statement | ";";
statement.Rule = tag_blk | directive_blk | term;
The problem is that both a "tag" and a "variable" can appear in the same place. I want my parser to prefer the tag over the variable, but it always prefers the variable. How can I change that?
I've tried changing tag_blk.Rule to PreferShiftHere() + html_tag_kw + attr_args_opt + block; and ImplyPrecedenceHere(-100) + html_tag_kw + attr_args_opt + block; but it doesn't help any. The parser doesn't even complain of an ambiguity.

Try changing the order of 'tag_blk.Rule' and 'variable.Rule' as tokenisers usually go after first match, and variable is first in your list.

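You can increase the Priority of the terminal that tag_blk starts with, or decrease the Priority of the VARIABLE terminal, whichever suits your purpose (Priority lives on terminals; tag_blk and variable themselves are non-terminals). The Terminal class has a Priority field that defaults to 0. According to the comment right above it: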
// Priority is used when more than one terminal may match the input char.
// It determines the order in which terminals will try to match input for a given char in the input.
// For a given input char the scanner uses the hash table to look up the collection of terminals that may match this input symbol.
// It is the order in this collection that is determined by Priority property - the higher the priority,
// the earlier the terminal gets a chance to check the input.
Unfortunately I can't test this at the moment, as the code fragment provided would need work and a lot of assumptions to compile. But from the description above, this should be the setting you are looking for. Hope it helps someone, even 10 years after the question was asked.


Forward-Only Seeking Lookup Algorithm

Given a string in the format {Length}.{Text} (such as 3.foo), I want to determine which string, from a finite list, the given string is.
The reader starts at the 0-index and can seek forward (skipping characters if desired).
As an example, consider the following list:
10.disconnect
7.dispose
7.distort
The shortest way to determine which of those strings has been presented might look like:
if (reader.Current == "1")
{
    // the word is "disconnect"
}
else
{
    reader.MoveForward(5);
    if (reader.Current == "p")
    {
        // the word is "dispose"
    }
    else
    {
        // the word is "distort"
    }
}
The question has 2 parts, though I hope someone can just point me at the right algorithm or facet of information theory that I need to read more about.
1) Given a finite list of strings, what is the best way to generate logic that requires the least number of seeks & comparisons, on average, to determine which word was presented?
2) As with the first, but allowing weighting such that hotpaths can be accounted for. i.e. if the word "distort" is 4 times more likely than the words "disconnect" and "dispose", the logic shown above would be more performant on average if structured as:
reader.MoveForward(5);
if (reader.Current == "t")
{
// the word is distort
}
else //...
Note: I'm aware that the 6th character in the example set is unique so all you need to do to solve the example set is switch on that character, but please assume there is a longer list of words.
Also, this isn't some homework assignment - I'm writing a parser/interception layer for the Guacamole protocol. I've looked at Binary Trees, Tries, Ulam's Game, and a few others, but none of those fit my requirements.
I don't know if this will be of any help, but I'll throw in my 5 cents anyway.
What about a tree that automatically gets more granular as more strings are added to the list, and whose existing leaves are checked in order of the "hot paths"?
For example, with your list I would have something like this:
10.disconnect
7.dispose
7.distort
root ---- 7 "check 4th letter" ------ if "t" return "distort"
 |         "in the order of "     |
 |         " hot paths "          --- if "p" return "dispose"
 |
 ---- 10 ---- return "disconnect"
You can build this up dynamically. For example, if you add 7.display it would become:
root ---- 7 "check 4th letter" ------ if "t" return "distort"
 |         "in the order of "     |
 |         " hot paths "          --- if "p" --- "check 5th letter" --- if "o" ...
 |                                                                   |
 ---- 10 ---- return "disconnect"                                    --- if "l" ...
So nodes in the tree would have a variable "which index to check", and leaves corresponding to the possible results (their order is determined statistically). Something like:
# python example
class Node:
    def __init__(self, which_index, letter):
        self.which_index = which_index  # which index this node checks to determine the next node
        self.letter = letter            # the letter that leads to this node
        self.leaves = []                # child nodes, kept ordered with the hottest path first

    def add_leaf(self, node, position=None):
        # insert at the position that reflects how hot this path is (append when unknown)
        self.leaves.insert(len(self.leaves) if position is None else position, node)

    def get_next(self, some_string):
        # leaves are tried in order, so hot paths are matched with the fewest comparisons
        for leaf in self.leaves:
            if some_string[self.which_index] == leaf.letter:
                return leaf
        raise Exception("not found")
Another alternative is, of course, hashing.
But if you are micro-optimizing, it is hard to say as there are other factors that come into play (eg. probably time you save from memory caching would be very significant).
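One way to turn the tree idea above into generated logic is a greedy builder. The sketch below is only a heuristic under stated assumptions, not a proven optimum: the names build_tree, lookup and char_at are hypothetical, the weights are relative likelihoods supplied by the caller, and the score (weighted number of candidates still left after a check) is just one plausible choice. It respects the forward-only rule by only accepting a position if every resulting group can still be separated using later positions.

```python
from collections import defaultdict


def char_at(word, index):
    """Return the character at index, or "" when the word is too short."""
    return word[index] if index < len(word) else ""


def build_tree(weights, start=0):
    """Greedily build a forward-only decision tree.

    weights maps each candidate string to its relative likelihood.
    Returns a string (leaf) or a tuple (index, {char: subtree}).
    """
    words = list(weights)
    if len(words) == 1:
        return words[0]

    longest = max(len(w) for w in words)
    best = None
    for index in range(start, longest):
        groups = defaultdict(dict)
        for w in words:
            groups[char_at(w, index)][w] = weights[w]
        if len(groups) == 1:
            continue  # this position tells us nothing
        # Forward-only: each group must stay separable using later positions.
        if any(len({w[index + 1:] for w in g}) < len(g) for g in groups.values()):
            continue
        # Score: weighted number of candidates still left after this check.
        score = sum(weights[w] * len(g) for g in groups.values() for w in g)
        if best is None or score < best[0]:
            best = (score, index, groups)

    if best is None:
        raise ValueError("candidates cannot be told apart")
    _, index, groups = best
    return (index, {ch: build_tree(g, index + 1) for ch, g in groups.items()})


def lookup(tree, text):
    """Walk the tree; each step only looks at an index beyond the previous one."""
    while not isinstance(tree, str):
        index, branches = tree
        tree = branches[char_at(text, index)]
    return tree


weights = {"10.disconnect": 1, "7.dispose": 1, "7.distort": 4}
tree = build_tree(weights)
print(tree[0], lookup(tree, "7.distort"))
```

For the three-word example this picks the single position at which all candidates differ, which matches the note in the question. Ordering the branches of each node by weight, as suggested above, would be a further refinement when the branches are compiled to an if/else chain rather than a switch.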

If statement not working as expected on combined enum value

This is a quirky one.
I have the following code...
foreach (IScanTicket ticket in this) {
if (ticket.Status == TicketStatus.Created || ticket.Status == (TicketStatus.Transfered | TicketStatus.Created))
return ticket;
}
}
When I run this with a status of Created|Transferred, the if statement seems to fail (it does not do what it's supposed to).
The interesting thing is that if I debug, step through the code and watch the condition, it always evaluates to TRUE in my debugger, yet execution fails to enter the block.
Why would the debugger show that the condition is true, yet continue as if it were not? It's as if the debugger is fibbing to me.
Has anyone ever experienced this?
P.S. I'm using Xamarin studio 5.9.7
Too long for a comment:
Actually, the [Flags] attribute does not change an enum's semantics at all. It's most commonly used by the ToString method to emit a series of names rather than a number for a combined value.
Let's say your enum was declared like this (without the Flags attribute):
enum TicketStatus
{
    Created = 1,
    Transferred = 2,
    Sold = 4
}
You could still combine different members and do any arithmetic that applies to a Flags enum:
TicketStatus status = TicketStatus.Created | TicketStatus.Transferred;
However, the following will print 3:
Console.WriteLine(status);
But if you add the [Flags] attribute, it will print Created, Transferred.
Also, it's important to note that with TicketStatus.Created | TicketStatus.Transferred you're really doing a bitwise OR on the underlying integer values. Notice how, in our example, the assigned values are unambiguously combinable:
Created : 0001
Transferred: 0010
Sold: 0100
Therefore a value of 3 can be unambiguously determined as a combination of Created and Transferred. However if we had this:
enum TicketStatus
{
    Created = 1,     // 0001
    Transferred = 2, // 0010
    Sold = 3,        // 0011
}
As the binary representations make obvious, combining values and checking against members is problematic, because combined members can be ambiguous. For example, what is status here?
status = TicketStatus.Created | TicketStatus.Transferred;
Is it Created, Transferred or is it really Sold? However, the compiler won't complain if you try to do it, which could lead to hard to track down bugs like yours, where some check is not working as you expect it to, so it's on you to ensure the definition is sane for bitwise mixing and comparing.
On a related note, since your if statement is really only checking if the ticket has a Created status, regardless of being combined with other members, here's a better way to check for that (.NET >= 4):
status.HasFlag(TicketStatus.Created)
or (.NET <4):
(status & TicketStatus.Created) != 0
As to why your enum did not work as expected, it is almost certainly because you did not explicitly assign unambiguously bitwise-combinable values (typically powers of two) to its members.
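The bit arithmetic behind these points is language-independent, so here is a small runnable sketch of it. It uses Python's enum.IntFlag purely as a stand-in for the C# enum (an assumption made for illustration; the member names are upper-cased copies of the ones above), and shows that an equality check on a combined value fails while a mask check succeeds, and that Sold = 3 collides with Created | Transferred.

```python
from enum import IntFlag


class TicketStatus(IntFlag):
    # Powers of two, so every combination maps to a unique integer.
    CREATED = 1
    TRANSFERRED = 2
    SOLD = 4


status = TicketStatus.CREATED | TicketStatus.TRANSFERRED

# Equality compares the whole value, so a combined status is not "Created".
is_exactly_created = status == TicketStatus.CREATED

# Masking with bitwise AND tests a single bit, whatever else is set.
has_created = (status & TicketStatus.CREATED) != 0
has_sold = (status & TicketStatus.SOLD) != 0

# With Sold = 3 instead of 4, Created | Transferred collides with Sold.
AMBIGUOUS_SOLD = 3
collides = (1 | 2) == AMBIGUOUS_SOLD

print(int(status), is_exactly_created, has_created, has_sold, collides)
```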
Thanks to @MarcinJuraszek and @YeldarKurmangaliyev.
Seems the [Flags] attribute wasn't set on the enum as I originally thought. Adding this attribute now makes the enum work in either combination.
So it seems that not having this attribute affects the order of joined enum values.

How to stop Resharper from line breaking after return keyword for long lines?

When I auto format with Resharper CTRL + ALT + SHIFT + F for lines longer than max line length (in my case say it's 80 characters), I get the following:
return
    View(new ViewModel
    {
        Identifier = identifier,
        Files = service.AllFiles()
    });
But what I really want is it not to wrap after the "return" keyword (i.e. not have the return keyword on a line all on its own), like so:
return View(new ViewModel
{
    Identifier = identifier,
    Files = service.AllFiles()
});
Does anyone know how to "configure" Resharper to make this happen? :)
Here's another example, here's what I'm seeing now:
return
    repository.Session.CreateCriteria(typeof(SomeType))
        .Add(Expression.Eq("Identifier", identifier))
        .UniqueResult<SomeType>();
When I really want to see:
return repository.Session.CreateCriteria(typeof(SomeType))
    .Add(Expression.Eq("Identifier", identifier))
    .UniqueResult<SomeType>();
UPDATE:
Here is "chop always":
return View(new OrganisationFileLoadViewModel
{
    Identifier = identifier,
    AllExistingOrganisationFiles = nmdsOrganisationFileLoadService.AllNMDSOrganisationFiles()
});
Here is "chop if long":
return
    View(new OrganisationFileLoadViewModel
    {
        Identifier = identifier,
        AllExistingOrganisationFiles = nmdsOrganisationFileLoadService.AllNMDSOrganisationFiles()
    });
Resharper -> Options -> (Code Editing) C# -> Formatting Style -> Line Breaks and Wrapping
There are a lot of settings for line wrapping. The default for Wrap long lines is normally 120 characters. Your setting of 80 may be what triggers the break, or Resharper 8.0 may have a newer option for return. The path above is for 7.0, but I believe it is the same, or at least similar, in 8.0.
The nice thing is that they show you examples for the changes you make, so you don't have to test them right away.
There is no special option to turn "wrapping after return" off.
1) I was not able to reproduce formatting like that shown in the first code snippet. However, I recommend trying to change this setting to "Simple Wrap":
ReSharper | Options | Code Editing | C# | Formatting Style | Line Breaks and Wrapping | Line Wrapping | Wrap invocation arguments.
2) In my case, the following change helped: ReSharper | Options | Code Editing | C# | Formatting Style | Line Breaks and Wrapping | Line Wrapping | Wrap chained method calls | select "Chop always".

Preventing overlap of number ranges

I have been failing at this problem for several hours now and just can't get my head around it. It seems fairly simple from a "human" POV, but somehow I just can't seem to write it in code.
Situation: Given several number ranges that are defined by a starting number and the current "active" number, which are assigned to specific locations (or 0 for generic ones):
startno | actualno | location
100 | 159 | 0
200 | 203 | 1
300 | 341 | 2
400 | 402 | 0
Now, as you can see, there can also be two ranges for one location. In this case, only the range with the highest startno (in this case, 400) is regarded as active, the other one only exists for history purposes.
Every user is assigned to a specific location (the same IDs as in the location column), but never to a generic one (zero).
When a user wants a new number, he will get a number assigned from a range that is assigned to his location, or, if none is found, from the highest generic one (e.g. user.location = 0 would get 403, user.location = 2 would get 342).
Then, the user can choose to use either just this number or an amount X of numbers starting from the assigned number.
Here comes the question: How can I ensure that the ranges don't overlap each other? Say the user (location = 2) gets the next number 342 and decides he needs 100 numbers following that. This would put the end number at 441, which is inside the generic range, and that mustn't happen.
I tried around with several nested SELECTs, using both the starting and ending number, aggregating MAX(), JOINing the table on itself, but I just can't get it 100% right.
From my understanding, for something like this I would just create a trigger on the table in the database that does the validation and raises an error if an overlap is found while the application updates the table, so that the user simply gets an error saying it can't be done. Say the user wants to end at 441: let the application try to update actualno to 441, then have a simple SELECT compare the new number to all existing startno values, and raise the error if it is greater than or equal to any of them. Something like the following in the update trigger:
IF EXISTS(SELECT 1 FROM
          Table1
          WHERE @newnumber >= startno AND id <> @currentID)
BEGIN
    -- raise the error here
END
Well, maybe I missed something and in certain cases this won't work; if so, please let me know.
Using a trigger for a data integrity check is totally OK and shouldn't be a problem at all. It is much easier than validating ahead of time, especially when you consider the problems that concurrent updates could create.
On the other hand, to make overlaps much less likely in the first place, I might just add a few more zeros to those numbers as initial values:
startno | actualno | location
100000 | 100059 | 0
200000 | 200003 | 1
300000 | 300041 | 2
400000 | 400002 | 0
As so often, I found an approach not long after posting the question. It seems describing a problem so that other people understand it is halfway to finding the solution. At least, I have a possible one which so far has proved to be quite robust.
I query the database with
SELECT nostart FROM numbers
WHERE nostart BETWEEN X AND Y
where X is the start number requested and Y is the end number requested by the user. (To match my introductory example, X = 342 and Y = 441.)
This will then give me a list of all ranges whose starting number is inside the range of the numbers the user requested, in this case the list would be
nostart
400
Now, if the query doesn't find a result, I'm golden and the numbers can be used. If the query finds a single result, and that result is equal to the starting number of the user, I'm also OK because this means it's the first time a user requested something from this range.
If that is not the case, the range cannot be used, because another range starts inside it. Also, if the query finds multiple results (e.g. for X = 100 and Y = 350, which would result in 100|200|300), I also deny the request, because several ranges would be overlapped.
If anyone has a better solution or notes on this one, I'll leave this here and use it as long as it works out.
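The decision rule described above can be sketched end to end as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 with an in-memory table as a stand-in for the real database, the function name can_reserve is hypothetical, and the table and column names follow the query in this answer (numbers, nostart) with the sample rows from the question.

```python
import sqlite3


def can_reserve(conn, first, last):
    """Apply the rule above: allow the request only if no range starts
    inside [first, last], or the only such range starts exactly at first."""
    rows = conn.execute(
        "SELECT nostart FROM numbers "
        "WHERE nostart BETWEEN ? AND ? ORDER BY nostart",
        (first, last),
    ).fetchall()
    starts = [row[0] for row in rows]
    return starts == [] or starts == [first]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE numbers (nostart INTEGER, actualno INTEGER, location INTEGER)"
)
conn.executemany(
    "INSERT INTO numbers VALUES (?, ?, ?)",
    [(100, 159, 0), (200, 203, 1), (300, 341, 2), (400, 402, 0)],
)

print(can_reserve(conn, 342, 441))  # range 400 starts inside the request
print(can_reserve(conn, 342, 399))  # nothing starts inside the request
print(can_reserve(conn, 100, 350))  # several ranges start inside the request
```

Note that the check and the later update of actualno are separate steps here; with concurrent users they would need to run inside one transaction (or be backed by a trigger, as suggested in the other answer) to stay reliable.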

String formatting does not behave as expected when using padding

I am trying to form a text representation of a table based off of the result coming in from a SqlDataReader.
while (unitsRdr.Read())
{
    note.AppendFormat("{0,-15}", String.Format("{0}({1})", unitsRdr.GetValue(0), unitsRdr.GetValue(1)));
}
Now what I expect to happen is I should get 4 sets of items, that will have padding on the right hand side equal to 15. Like this
BP(mm\Hg) HR(bpm) RR(rpm) Sa O2(%) |<-- this would be the end of the string.
However what I am getting is
BP(mm\Hg )HR(bpm )RR(rpm )Sa O2(% )|<-- this is the end of the string.
It appears to start counting after the ( and the ) is put after the number of spaces.
What is the correct way of doing formatting so the text is like my first example?
For anyone who wants it here is the source table
desc unit
------------ ----------
BP mm\Hg
HR bpm
RR rpm
Sa O2 %
I strongly suspect that the problem is that the value returned by unitsRdr.GetValue(1) is actually longer than you believe. I very much doubt that string.Format is doing that. Try using unitsRdr.GetValue(1).Trim() instead - or just test it with a hard-coded literal value.
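To see why trailing whitespace would produce exactly this symptom, here is a small sketch of the same formatting in Python, where "{:<15}" plays the role of the C# "{0,-15}" alignment. It assumes, purely for illustration, that the unit column is a fixed-width character column padded to 20 characters; the real column width is not given in the question.

```python
# Hypothetical: the database returns the unit padded to a fixed width of 20.
unit_from_db = "mm\\Hg".ljust(20)

# Same shape as the C# code: build "name(unit)" first, then left-align in 15 columns.
raw_cell = "{:<15}".format("{}({})".format("BP", unit_from_db))
trimmed_cell = "{:<15}".format("{}({})".format("BP", unit_from_db.strip()))

# The untrimmed cell is already wider than 15, so no padding is added after ")".
print(repr(raw_cell), len(raw_cell))
# After trimming, the padding lands after ")" as intended.
print(repr(trimmed_cell), len(trimmed_cell))
```

The spaces end up inside the parentheses because they are part of the value itself, not because the alignment is miscounting.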
