Bug 1952351
| Summary: | python-parso fails to build with Python 3.10: Changed enum repr causes test failure | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Tomáš Hrnčiar <thrnciar> |
| Component: | python-parso | Assignee: | Miro Hrončok <mhroncok> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rawhide | CC: | carl, mhroncok, python-sig, thrnciar |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-04-23 10:43:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1890881 | | |
Description
Tomáš Hrnčiar
2021-04-22 06:26:28 UTC
Note that the INTERNALERROR is tracked in https://github.com/pytest-dev/pytest/pull/8227 but IMHO it only happens for tests that fail. The error with a patched pytest:
=================================== FAILURES ===================================
_ test_ambiguities[foo: bar | baz\nbar: NAME\nbaz: NAME\n-foo is ambiguous.*given a PythonTokenTypes\\.NAME.*bar or baz] _
grammar = 'foo: bar | baz\nbar: NAME\nbaz: NAME\n'
error_match = 'foo is ambiguous.*given a PythonTokenTypes\\.NAME.*bar or baz'

    @pytest.mark.parametrize(
        'grammar, error_match', [
            ['foo: bar | baz\nbar: NAME\nbaz: NAME\n',
             r"foo is ambiguous.*given a PythonTokenTypes\.NAME.*bar or baz"],
            ['''foo: bar | baz\nbar: 'x'\nbaz: "x"\n''',
             r"foo is ambiguous.*given a ReservedString\(x\).*bar or baz"],
            ['''foo: bar | 'x'\nbar: 'x'\n''',
             r"foo is ambiguous.*given a ReservedString\(x\).*bar or foo"],
            # An ambiguity with the second (not the first) child of a production
            ['outer: "a" [inner] "b" "c"\ninner: "b" "c" [inner]\n',
             r"outer is ambiguous.*given a ReservedString\(b\).*inner or outer"],
            # An ambiguity hidden by a level of indirection (middle)
            ['outer: "a" [middle] "b" "c"\nmiddle: inner\ninner: "b" "c" [inner]\n',
             r"outer is ambiguous.*given a ReservedString\(b\).*middle or outer"],
        ]
    )
    def test_ambiguities(grammar, error_match):
        with pytest.raises(ValueError, match=error_match):
>           generate_grammar(grammar, tokenize.PythonTokenTypes)

test/test_pgen2.py:357:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
bnf_grammar = 'foo: bar | baz\nbar: NAME\nbaz: NAME\n'
token_namespace = <enum 'PythonTokenTypes'>

    def generate_grammar(bnf_grammar: str, token_namespace) -> Grammar:
        """
        ``bnf_text`` is a grammar in extended BNF (using * for repetition, + for
        at-least-once repetition, [] for optional parts, | for alternatives and ()
        for grouping).
        It's not EBNF according to ISO/IEC 14977. It's a dialect Python uses in its
        own parser.
        """
        rule_to_dfas = {}
        start_nonterminal = None
        for nfa_a, nfa_z in GrammarParser(bnf_grammar).parse():
            # _dump_nfa(nfa_a, nfa_z)
            dfas = _make_dfas(nfa_a, nfa_z)
            # _dump_dfas(dfas)
            # oldlen = len(dfas)
            _simplify_dfas(dfas)
            # newlen = len(dfas)
            rule_to_dfas[nfa_a.from_rule] = dfas
            # print(nfa_a.from_rule, oldlen, newlen)
            if start_nonterminal is None:
                start_nonterminal = nfa_a.from_rule
        reserved_strings: Mapping[str, ReservedString] = {}
        for nonterminal, dfas in rule_to_dfas.items():
            for dfa_state in dfas:
                for terminal_or_nonterminal, next_dfa in dfa_state.arcs.items():
                    if terminal_or_nonterminal in rule_to_dfas:
                        dfa_state.nonterminal_arcs[terminal_or_nonterminal] = next_dfa
                    else:
                        transition = _make_transition(
                            token_namespace,
                            reserved_strings,
                            terminal_or_nonterminal
                        )
                        dfa_state.transitions[transition] = DFAPlan(next_dfa)
>       _calculate_tree_traversal(rule_to_dfas)

parso/pgen2/generator.py:278:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
nonterminal_to_dfas = {'bar': [<DFAState: bar is_final=False>, <DFAState: bar is_final=True>], 'baz': [<DFAState: baz is_final=False>, <DFAState: baz is_final=True>], 'foo': [<DFAState: foo is_final=False>, <DFAState: foo is_final=True>]}

    def _calculate_tree_traversal(nonterminal_to_dfas):
        """
        By this point we know how dfas can move around within a stack node, but we
        don't know how we can add a new stack node (nonterminal transitions).
        """
        # Map from grammar rule (nonterminal) name to a set of tokens.
        first_plans = {}
        nonterminals = list(nonterminal_to_dfas.keys())
        nonterminals.sort()
        for nonterminal in nonterminals:
            if nonterminal not in first_plans:
                _calculate_first_plans(nonterminal_to_dfas, first_plans, nonterminal)
        # Now that we have calculated the first terminals, we are sure that
        # there is no left recursion.
        for dfas in nonterminal_to_dfas.values():
            for dfa_state in dfas:
                transitions = dfa_state.transitions
                for nonterminal, next_dfa in dfa_state.nonterminal_arcs.items():
                    for transition, pushes in first_plans[nonterminal].items():
                        if transition in transitions:
                            prev_plan = transitions[transition]
                            # Make sure these are sorted so that error messages are
                            # at least deterministic
                            choices = sorted([
                                (
                                    prev_plan.dfa_pushes[0].from_rule
                                    if prev_plan.dfa_pushes
                                    else prev_plan.next_dfa.from_rule
                                ),
                                (
                                    pushes[0].from_rule
                                    if pushes else next_dfa.from_rule
                                ),
                            ])
>                           raise ValueError(
                                "Rule %s is ambiguous; given a %s token, we "
                                "can't determine if we should evaluate %s or %s."
                                % (
                                    (
                                        dfa_state.from_rule,
                                        transition,
                                    ) + tuple(choices)
                                )
                            )
E                           ValueError: Rule foo is ambiguous; given a NAME token, we can't determine if we should evaluate bar or baz.

parso/pgen2/generator.py:339: ValueError
During handling of the above exception, another exception occurred:
grammar = 'foo: bar | baz\nbar: NAME\nbaz: NAME\n'
error_match = 'foo is ambiguous.*given a PythonTokenTypes\\.NAME.*bar or baz'
> ???
E AssertionError: Regex pattern 'foo is ambiguous.*given a PythonTokenTypes\\.NAME.*bar or baz' does not match "Rule foo is ambiguous; given a NAME token, we can't determine if we should evaluate bar or baz.".
test/test_pgen2.py:-1: AssertionError
=========================== short test summary info ============================
FAILED test/test_pgen2.py::test_ambiguities[foo: bar | baz\nbar: NAME\nbaz: NAME\n-foo is ambiguous.*given a PythonTokenTypes\\.NAME.*bar or baz]
There were some changes in repr() in the latest alpha of Python 3.10.
https://docs.python.org/3.10/whatsnew/changelog.html#python-3-10-0-alpha-7

bpo-40066: Enum: adjust repr() to show only enum and member name (not value, nor angle brackets) and str() to show only member name. Update and improve documentation to match.

bpo-40066: Enum’s repr() and str() have changed: repr() is now EnumClass.MemberName and str() is MemberName. Additionally, stdlib Enum’s whose contents are available as module attributes, such as RegexFlag.IGNORECASE, have their repr() as module.name, e.g. re.IGNORECASE.
https://bugs.python.org/issue40066

The error message is formatted with %s, so the str() change is what drops the PythonTokenTypes. prefix that the test expects.

Patched in copr with https://github.com/davidhalter/parso/pull/186
Will give upstream some time for feedback before backporting.
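
For illustration only, here is a minimal sketch of one way a test pattern could tolerate both spellings of the token name. It is not necessarily what the linked pull request does. TOLERANT_PATTERN and the matches() helper are hypothetical names, and prefixed_message is written by hand from the expected pattern rather than captured from a real run. The helper uses re.search, which is how pytest.raises(match=...) checks the exception text.

```python
import re

# Pattern copied from the failing test: it requires the enum class prefix.
STRICT_PATTERN = r"foo is ambiguous.*given a PythonTokenTypes\.NAME.*bar or baz"

# Hypothetical relaxed pattern: the "PythonTokenTypes." prefix is optional.
TOLERANT_PATTERN = (
    r"foo is ambiguous.*given a (?:PythonTokenTypes\.)?NAME.*bar or baz"
)

# Message as it would look when str() of the member includes the class name
# (assumption: written by hand to fit the pattern the test expects).
prefixed_message = (
    "Rule foo is ambiguous; given a PythonTokenTypes.NAME token, we "
    "can't determine if we should evaluate bar or baz."
)

# Message copied from the traceback in this report (Python 3.10.0a7).
bare_message = (
    "Rule foo is ambiguous; given a NAME token, we "
    "can't determine if we should evaluate bar or baz."
)


def matches(pattern: str, message: str) -> bool:
    """Return True if the pattern is found anywhere in the message."""
    return re.search(pattern, message) is not None


if __name__ == "__main__":
    for label, pattern in [("strict", STRICT_PATTERN),
                           ("tolerant", TOLERANT_PATTERN)]:
        print(label, matches(pattern, prefixed_message),
              matches(pattern, bare_message))
```

The strict pattern only accepts the prefixed message, while the relaxed one accepts both, so a test written this way would not depend on which str() form the running interpreter produces.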