Description of problem:
Since upgrading my laptop to Fedora 35 (from Fedora 34), I get a weird error when using the pytester module and "fnmatch_lines()" method.
This happens when I match 2 non-consecutive lines and the 2nd line has square brackets.

/usr/lib64/python3.10/fnmatch.py:42: in fnmatch
    return fnmatchcase(name, pat)
/usr/lib64/python3.10/fnmatch.py:76: in fnmatchcase
    match = _compile_pattern(pat)
/usr/lib64/python3.10/fnmatch.py:52: in _compile_pattern
    return re.compile(res).match
/usr/lib64/python3.10/re.py:251: in compile
    return _compile(pattern, flags)
/usr/lib64/python3.10/re.py:303: in _compile
    p = sre_compile.compile(pattern, flags)
/usr/lib64/python3.10/sre_compile.py:764: in compile
    p = sre_parse.parse(p, flags)
/usr/lib64/python3.10/sre_parse.py:948: in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
/usr/lib64/python3.10/sre_parse.py:443: in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
/usr/lib64/python3.10/sre_parse.py:834: in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
/usr/lib64/python3.10/sre_parse.py:443: in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = <sre_parse.Tokenizer object at 0x7f4777bb86a0>, state = <sre_parse.State object at 0x7f4777bbb4f0>
verbose = 0, nested = 3, first = False

[...]
                msg = "bad character range %s-%s" % (this, that)
>               raise source.error(msg, len(this) + 1 + len(that))
E               re.error: bad character range l-3 at position 26

/usr/lib64/python3.10/sre_parse.py:598: error

Version-Release number of selected component (if applicable):
python3-libs-3.10.0-1.fc35.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create "hello" script writing to stdout

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
#!/bin/sh
echo "LINE 1"
echo "DUMMY LINE"
echo "LINE 2 ['/lib64/libnl-3.so.200']"
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

$ chmod +x hello

2. Create pytester python script "test_hello.py"

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
import os
import pytest

pytest_plugins = 'pytester'

EXPECTED = [
    "LINE 1",
    "LINE 2 ['/lib64/libnl-3.so.200']",
]

def test_hello(pytester, request):
    testdir = os.path.dirname(request.fspath)
    prog = os.path.join(testdir, 'hello')
    args = [ prog ]
    result = pytester.run(*args)
    result.stdout.fnmatch_lines(EXPECTED)
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

3. Execute pytest

$ python3 -m pytest

Actual results:
error above

Expected results:
success, no error

Additional info:
This seems to happen only if the lines to match are not consecutive: if I remove "DUMMY LINE" from the "hello" script, this works fine.
fnmatch is supposed to match Unix-path globs, isn't it?

The "LINE 2 ['/lib64/libnl-3.so.200']" glob contains [ and ] characters. In glob language, that means any single character from the set between the brackets. However, the set contains "l-3", which is read as a character range from "l" to "3". Since "l" sorts after "3", the range is invalid once the glob is converted to a regex (Python 3.9 and 3.10 alike):

>>> import re, fnmatch
>>> fnmatch.translate("LINE 2 ['/lib64/libnl-3.so.200']")
"(?s:LINE\\ 2\\ ['/lib64/libnl-3.so.200'])\\Z"
>>> re.match(fnmatch.translate("LINE 2 ['/lib64/libnl-3.so.200']"), "DUMMY LINE")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.9/re.py", line 191, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib64/python3.9/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib64/python3.9/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib64/python3.9/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib64/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib64/python3.9/sre_parse.py", line 834, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
  File "/usr/lib64/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib64/python3.9/sre_parse.py", line 598, in _parse
    raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range l-3 at position 26

We see that position 26 is where the "l-3" range starts:

>>> fnmatch.translate("LINE 2 ['/lib64/libnl-3.so.200']")[:26]
"(?s:LINE\\ 2\\ ['/lib64/libn"
>>> fnmatch.translate("LINE 2 ['/lib64/libnl-3.so.200']")[26:]
"l-3.so.200'])\\Z"

This also happens when using fnmatch() directly:

>>> fnmatch.fnmatch("DUMMY LINE", "LINE 2 ['/lib64/libnl-3.so.200']")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.9/fnmatch.py", line 42, in fnmatch
    return fnmatchcase(name, pat)
  File "/usr/lib64/python3.9/fnmatch.py", line 76, in fnmatchcase
    match = _compile_pattern(pat)
  File "/usr/lib64/python3.9/fnmatch.py", line 52, in _compile_pattern
    return re.compile(res).match
  File "/usr/lib64/python3.9/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/usr/lib64/python3.9/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib64/python3.9/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib64/python3.9/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib64/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib64/python3.9/sre_parse.py", line 834, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
  File "/usr/lib64/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib64/python3.9/sre_parse.py", line 598, in _parse
    raise source.error(msg, len(this) + 1 + len(that))
re.error: bad character range l-3 at position 26

I don't know what changed here, but I suppose passing in an invalid glob like this was never supposed to work.
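For illustration, the "bad character range l-3" message can be reproduced with the re module alone, without going through fnmatch. This is only a minimal sketch; the helper name range_is_valid is made up for this example and is not part of re or fnmatch.

```python
import re


def range_is_valid(low: str, high: str) -> bool:
    """Return True if the regex character class [low-high] compiles.

    Hypothetical helper for illustration; not part of re or fnmatch.
    """
    try:
        re.compile("[" + re.escape(low) + "-" + re.escape(high) + "]")
    except re.error:
        return False
    return True


# "l" is code point 108 and "3" is code point 51, so "l-3" runs backwards
# and the regex compiler rejects it; "3-l" is accepted.
print(ord("l"), ord("3"))
print(range_is_valid("l", "3"))
print(range_is_valid("3", "l"))
```

The same reversed range is what ends up inside the bracket expression of the translated glob.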
As for Python 3.9 vs Python 3.10, I can reproduce the error with your example even with Python 3.9 on Fedora 33 (I don't have a Fedora 34 machine available now).
I can reproduce this in Fedora 34 with Python 3.9 as well. How sure are you that this is only happening "Since upgrading [your] laptop to Fedora 35 (from Fedora 34)"?
Thanks for looking into this. Actually I have a CI for an internal project with these kinds of strings running for months, and I'm 99% sure this wasn't happening before. The fact that when matched lines are consecutive this works as I'm expecting puzzles me as well. Anyway, thanks, I think I will go with escaping these lines using glob.escape().
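A minimal sketch of the glob.escape() approach mentioned above, using only the standard library and the line from the reproducer. Whether it fits the real test suite is an assumption; it only applies to expected lines that are meant to be matched literally.

```python
import fnmatch
import glob

line = "LINE 2 ['/lib64/libnl-3.so.200']"

# glob.escape() wraps the glob metacharacters "*", "?" and "[" in brackets,
# so the opening "[" is matched literally instead of starting a character set.
pattern = glob.escape(line)
print(pattern)

# fnmatchcase() is used here to avoid the platform-dependent normalisation
# that fnmatch() applies to both arguments.
assert fnmatch.fnmatchcase(line, pattern)
assert not fnmatch.fnmatchcase("DUMMY LINE", pattern)
```

In the reproducer, the same call could be applied to each literal entry of EXPECTED before passing the list to fnmatch_lines(). Entries that rely on wildcards would have to be left unescaped.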
> The fact that when matched lines are consecutive this works as I'm expecting puzzles me as well.

That is because of the way pytester implements the check. You can see it here:
https://github.com/pytest-dev/pytest/blob/6.2.5/src/_pytest/pytester.py#L1841

If the line matches exactly, it is considered a match; only if it doesn't match exactly does pytester go further and call match_func (which is fnmatch in this case).

> Actually I have a CI for an internal project with these kinds of strings running for months, and I'm 99% sure this wasn't happening before.

It might be because it always got the exact match up until now?
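The exact-match-first behaviour described above can be sketched roughly as follows. This is a simplified illustration, not pytest's actual code: the function count_pattern_calls and the stand-in never_matches are hypothetical names, and the stand-in replaces fnmatch only so the number of pattern-matching calls can be counted.

```python
def count_pattern_calls(output_lines, expected, match_func):
    """Simplified sketch of the check; returns how often match_func ran."""
    calls = 0
    remaining = list(output_lines)
    for pattern in expected:
        while remaining:
            candidate = remaining.pop(0)
            if candidate == pattern:
                break  # exact match: the pattern is never handed to match_func
            calls += 1
            if match_func(candidate, pattern):
                break
        else:
            raise AssertionError("no match for %r" % pattern)
    return calls


def never_matches(line, pattern):
    """Stand-in for fnmatch (assumption: only the call count matters here)."""
    return False


EXPECTED = ["LINE 1", "LINE 2 ['/lib64/libnl-3.so.200']"]
consecutive = ["LINE 1", "LINE 2 ['/lib64/libnl-3.so.200']"]
with_dummy = ["LINE 1", "DUMMY LINE", "LINE 2 ['/lib64/libnl-3.so.200']"]

print(count_pattern_calls(consecutive, EXPECTED, never_matches))
print(count_pattern_calls(with_dummy, EXPECTED, never_matches))
```

With consecutive lines every expected entry is found by plain equality, so the glob is never compiled. Once "DUMMY LINE" sits in between, it has to be compared against the second pattern, and that is the call in which the invalid range surfaces.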
Feel free to reopen if you still think this is actually a regression in Python.