Fix indirect left recursive rule operator precedence via delegated operator precedence #4480
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Resolves #4460
Operator precedence is currently (incorrectly) reset to zero, when a left-recursive rule contains other (non-left)recursive elements.
The ATN for rule e is shown below:
![ATN of rule e](https://private-user-images.githubusercontent.com/1916723/297397479-65ca6bd2-a153-45b6-9577-68046d9165ef.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTgwODcyNzksIm5iZiI6MTcxODA4Njk3OSwicGF0aCI6Ii8xOTE2NzIzLzI5NzM5NzQ3OS02NWNhNmJkMi1hMTUzLTQ1YjYtOTU3Ny02ODA0NmQ5MTY1ZWYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYxMSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MTFUMDYyMjU5WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9Zjc0ZWRiMThhYTk4MWUzZGJlMWNhNjc5YjdlMzA5YjA5MjE0MzY2NWQxM2UyOWNiNGNkZmEwNWMxYmRiZDY5NiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.26ZPkv2YNDV2FbuXPCZe-ScXxr86kl7e7rkNxpsVrDA)
In case no indirect recursion was present (
e : '-' e; | e '*' e * ID;
), the ATN would look like the following:Notably, without indirectness,
<e[2]>
signals a precedence used.For completeness, the ATN of rule m is shown below too:
![ATN of rule m](https://private-user-images.githubusercontent.com/1916723/286957737-b68a5350-5981-42f3-a43f-939a8c01dccd.svg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTgwODcyNzksIm5iZiI6MTcxODA4Njk3OSwicGF0aCI6Ii8xOTE2NzIzLzI4Njk1NzczNy1iNjhhNTM1MC01OTgxLTQyZjMtYTQzZi05MzlhOGMwMWRjY2Quc3ZnP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYxMSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MTFUMDYyMjU5WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NDMyYzYwZWZlNThiOTU3YjBlZDBlNGRhYjA0ZWE2YzI1ZGIxMDNiYzJiMTA5NDQ4OTNmOGExMjlmMmFhNDZlZSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.4pl0XAoNzFoH17G0t9P-tCmEoEIvppesifSE1E_khko)
The
<e[0]>
(or in Java:e(0);
) in rulem
resets the operator precedence to zero, which is not correct (see issue #4460 or the tests added by this PR).Proposed solution:
In case we detect a rule reference resulting in a binary or prefix recursion, we introduce an optional parameter
int _dp
(similar to thep
parameter of left recursive rules) with a default of0
to the targeted rule.(This would replace the old
<e[0]>
by a<e[_dp]>
.)This PR does not touch the ATN behavior of ANTLR (well.. almost not. The possible precedence value
_dp
is now excluded during the ATN creation, but where this value is now set, any precedence was 0 before - yay).This MR also introduces tests for additional attributes for rules (but not for left recursive ones, because ANTLR outputs an error message for these:
rule e is left recursive but doesn't conform to a pattern ANTLR can handle
)The precedence option for indirect left-recursive rules is now set to
"_dp"
, which is a new parameter introduced (similar to the existing_p
).Breaking changes:
This PR would prevent optional parameters from being added to parsers for the Dart target via the grammar.
Normal parameters are not affected, thus I would argue that change to be not of importance.
The reason I found this bug at all was that with large grammars,
I wanted to use the same rules (such as formulas) in multiple other rules.
The expected token position of the TestSymbolIssues#testUndefinedLabel test had to be modified by 5 tokens (
(= dp 1)
).Thanks to Ken (@kaby76) for the initial directions over on SO
(By accident this also fixes #4477 :) )
P.S.: I've run a bunch of tests on internal grammars generated using the changes proposed, which so far has not shown additional errors.
P.P.S.: I am not sure if a RuleWalker similar to the
LeftRecursiveRuleWalker
instead of myLeftRecursiveRuleTransformer#isBinaryOrPrefix
method would have been the better option - (but my version is at least easier to step through while debugging)P.P.P.S.: The DOTGenerator is not aware of the presence of the
_dp
value for rererences - if desired, I can add this functionality (but wanted to refrain from doing so before the other changes were discussed.)