When run under the C locale, Python 3 doesn't really work properly on systems where UTF-8 is the correct encoding for interacting with the rest of the system. This is described in detail by Armin Ronacher in the click documentation: http://click.pocoo.org/5/python3/#python-3-surrogate-handling

The attached patch is a proposed change to the system Python that assumes the current process is misconfigured when it detects that "LC_CTYPE" refers to the "C" locale, and in that case prints a warning to stderr and forces the use of the C.UTF-8 locale instead.

To avoid unintended side effects, it *solely* changes the actual python3.6 command line utility - nothing changes for cases where CPython is used as a dynamically linked library.

Behaviour with the patch:

```
$ LANG=C python -c 'import click; cli = click.command()(lambda:None); cli()'
Python detected LC_CTYPE=C. Setting LC_ALL & LANG to C.UTF-8.
```

Behaviour without the patch:

```
$ LANG=C /usr/bin/python3 -c 'import click; cli = click.command()(lambda:None); cli()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ncoghlan/.local/lib/python3.5/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/home/ncoghlan/.local/lib/python3.5/site-packages/click/core.py", line 675, in main
    _verify_python3_env()
  File "/home/ncoghlan/.local/lib/python3.5/site-packages/click/_unicodefun.py", line 119, in _verify_python3_env
    'mitigation steps.' + extra)
RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment. Either run this under Python 2 or consult http://click.pocoo.org/python3/ for mitigation steps.

This system supports the C.UTF-8 locale which is recommended.
You might be able to resolve your issue by exporting the
following environment variables:

    export LC_ALL=C.UTF-8
    export LANG=C.UTF-8
```
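To make the detection step concrete, here is a minimal sketch of how the locale selected for LC_CTYPE can be worked out from the environment. It assumes the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) and assumes the fallback with nothing set is the C locale. The helper names `effective_lc_ctype` and `is_legacy_c_locale` are hypothetical and are not taken from the patch, which performs its check in C.

```python
def effective_lc_ctype(env):
    """Return the locale name the environment selects for LC_CTYPE.

    Assumed precedence: LC_ALL, then LC_CTYPE, then LANG. Empty values
    are treated as unset, and the fallback is assumed to be "C".
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def is_legacy_c_locale(env):
    """True when the environment selects the plain C (or POSIX) locale."""
    return effective_lc_ctype(env) in ("C", "POSIX")


if __name__ == "__main__":
    import os

    print(effective_lc_ctype(os.environ), is_legacy_c_locale(os.environ))
```

Under these assumptions an environment with no locale variables at all is treated the same way as `LANG=C`.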
Created attachment 1231981 [details] Initially proposed patch to the Fedora Python 3.6 binary for F26
We plan to apply the patch after the current Python 3.6 side-tag is merged.
Note from the current Python SIG discussion: this needs an environment variable to turn off the second-guessing behaviour, e.g. PYTHONALLOWCLOCALE
...considering the discussion of course. Didn't have time now to read it all.
Consolidating the feedback from the mailing list thread [1] so far:

To better cover the runtime embedding cases, add a new warning message inside Py_Initialize that says:

    libpython3 detected LC_CTYPE=C. Some libraries and operating system interfaces may not work correctly. Use `PYTHONALLOWCLOCALE=1 LC_CTYPE=C /usr/bin/python3` if debugging this under /usr/bin/python3.

For the command line interpreter, provide the `PYTHONALLOWCLOCALE` off switch, and adjust the warning message as follows:

    Python detected LC_CTYPE=C, forcing LC_ALL & LANG to C.UTF-8 (set PYTHONALLOWCLOCALE to disable this behaviour)

And if the environment variable is already set:

    Python detected LC_CTYPE=C, but PYTHONALLOWCLOCALE is set. Some libraries, applications and operating system interfaces may not work correctly.

[1] https://lists.fedoraproject.org/archives/list/python-devel@lists.fedoraproject.org/thread/NBYPZLLAA7SNHOZ4TYMDTLJIKACLVTUM/
Why would one want to allow the C locale though? Is there any case where LANG=C works but C.utf-8 doesn't (for Python 3)?
(In reply to Jan Niklas Hasse from comment #6)
> Why would one want to allow the C locale though? Is there any case where
> LANG=C works but C.utf-8 doesn't (for Python 3)?

The use case is debugging. These are the three cases we've come up with in the mailing list thread where it would be desirable to use the C locale instead of C.utf-8 for debugging:

* I am a software developer and the user is running my software with python-3.6 on a distribution that doesn't patch their Python3.
* I am a software developer and the user is running my software on an older version of python-3.x that doesn't have this change.
* I am a software developer and I'm running in production under mod_wsgi but debugging/running unittests/etc using /usr/bin/python3.

In those cases, the production version of the software won't be coercing to a non-ascii-aware locale. Being able to turn off the coercion when running /usr/bin/python3 is the quickest way to replicate the problems that can be encountered in the production environment.
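As a rough illustration of that debugging workflow, the sketch below starts a child interpreter with a chosen locale environment and reports the filesystem encoding it ends up with. The helper name `filesystem_encoding_under` is hypothetical. PYTHONALLOWCLOCALE only has an effect on an interpreter carrying the patch discussed here and is assumed to be ignored elsewhere, so the reported encoding depends on the interpreter version and patches in use.

```python
import os
import subprocess
import sys


def filesystem_encoding_under(overrides):
    """Run a child interpreter with the given locale overrides."""
    env = dict(os.environ)
    # Drop inherited locale settings so that only the overrides apply.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        env.pop(name, None)
    env.update(overrides)
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; print(sys.getfilesystemencoding())"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Mimic a production process that does not coerce the C locale.
    print(filesystem_encoding_under({"LANG": "C", "PYTHONALLOWCLOCALE": "1"}))
```

Any warnings the child interpreter prints go to stderr and are passed through unchanged, so they stay visible while debugging.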
Created attachment 1233034 [details] Draft implementation with environment based off switch and test cases

The updated PYTHONALLOWCLOCALE patch covers everything discussed both here and in the SIG thread, and also adds a new test case for the behaviour.

The current patch refactors test.support.script_helper slightly, but we should probably just duplicate that code to make the patch easier to maintain and leave any refactoring for the upstream implementation.

Example behaviour:

==========================
$ ./python -c "import sys; print(sys.getfilesystemencoding())"
utf-8
$ LANG=C.UTF-8 ./python -c "import sys; print(sys.getfilesystemencoding())"
utf-8
$ LANG=C ./python -c "import sys; print(sys.getfilesystemencoding())"
Python detected LC_CTYPE=C, forcing LC_ALL & LANG to C.UTF-8 (set PYTHONALLOWCLOCALE to disable this behaviour).
utf-8
$ PYTHONALLOWCLOCALE=1 LANG=C ./python -c "import sys; print(sys.getfilesystemencoding())"
Python detected LC_CTYPE=C, but PYTHONALLOWCLOCALE is set. Some libraries, applications, and operating system interfaces may not work correctly.
Py_Initialize detected LC_CTYPE=C, which limits Unicode compatibility. Some libraries and operating system interfaces may not work correctly. Use `PYTHONALLOWCLOCALE=1 LC_CTYPE=C python3` to configure a similar environment when running Python directly.
ascii
==========================

The reason the library warning also shows up in the last example is that from the library's point of view, that CLI invocation looks exactly the same as any other embedding application with a problematic locale configuration. One possible option to make that case a bit more readable would be to omit the CLI warning for it, and rely solely on the warning from the library.
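The command line behaviour shown in that transcript can be summarised as a small decision function. This is only a sketch in Python of logic the patch implements in C. The function name `cli_locale_decision` is hypothetical, and treating any non-empty PYTHONALLOWCLOCALE value as "set" is an assumption. The two messages are copied from the example output above.

```python
COERCION_WARNING = (
    "Python detected LC_CTYPE=C, forcing LC_ALL & LANG to C.UTF-8 "
    "(set PYTHONALLOWCLOCALE to disable this behaviour)."
)
ALLOWED_WARNING = (
    "Python detected LC_CTYPE=C, but PYTHONALLOWCLOCALE is set. "
    "Some libraries, applications, and operating system interfaces "
    "may not work correctly."
)


def cli_locale_decision(env):
    """Return (new_env, warning) for the standalone interpreter."""
    # Assumed POSIX precedence for the locale that governs LC_CTYPE.
    selected = next(
        (env[name] for name in ("LC_ALL", "LC_CTYPE", "LANG") if env.get(name)),
        "C",
    )
    if selected != "C":
        return dict(env), None  # nothing to second-guess
    if env.get("PYTHONALLOWCLOCALE"):
        return dict(env), ALLOWED_WARNING  # off switch: warn, don't coerce
    coerced = dict(env)
    coerced["LC_ALL"] = "C.UTF-8"
    coerced["LANG"] = "C.UTF-8"
    return coerced, COERCION_WARNING
```

For example, `cli_locale_decision({"LANG": "C"})` returns an environment with both LC_ALL and LANG set to C.UTF-8 together with the coercion warning, while adding PYTHONALLOWCLOCALE leaves the environment untouched.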
I also made an interesting discovery while working on this patch: the Py_Initialize code already includes a call to `setlocale(LC_CTYPE, "")` that never gets reverted (the runtime doesn't even save a reference to the old setting for subsequent restoration). So that means embedding applications already have to set `LC_ALL`, `LC_CTYPE` or `LANG` in the environment if they want an embedded CPython 3 runtime to pay attention to it - they can't just call `setlocale()` before calling Py_Initialize.
Does this also improve the handling of encoding change from utf-8 to ascii if stdout is not a tty from bug #1397428?
Thomas, bug 1397428 is for Python 2. In py3 it should be fixed already.
(In reply to Petr Viktorin from comment #11)
> Thomas, bug 1397428 is for Python 2. In py3 it should be fixed already.

OK, thanks for the clarification.
I've now created an upstream PEP targeting Python 3.7 for this: https://www.python.org/dev/peps/pep-0538/

An updated patch (which tweaks the warning messages a bit and avoids emitting the double warning when PYTHONALLOWCLOCALE is set) is attached to the corresponding upstream issue: http://bugs.python.org/issue28180

Assuming that gets accepted some time in the next few weeks, would it make sense to file a Self-Contained Change Proposal for F26 to cover the backport to Python 3.6?
I suppose that a self contained change is not required for that kind of patch. Currently, if that gets accepted upstream we will backport it in rawhide.
I'm creating the self contained change page. However when I tried to apply the patch compilation fails with:

    /builddir/build/BUILD/Python-3.6.0/Python/pylifecycle.c: In function '_emit_stderr_warning_for_c_locale':
    /builddir/build/BUILD/Python-3.6.0/Python/pylifecycle.c:315:9: error: format not a string literal and no format arguments [-Werror=format-security]
             fprintf(stderr, _C_LOCALE_WARNING);
Fixed by changing these two fprintf statements from:

    fprintf(stderr, _CLI_C_LOCALE_COERCION_WARNING);

to:

    fprintf(stderr, "%s", _CLI_C_LOCALE_COERCION_WARNING);
self contained change proposal: https://fedoraproject.org/wiki/Changes/python3_c.utf-8_locale
What happens with the patch when LANG is unset?

> In those cases, the production version of the software won't be coercing to a non-ascii-aware locale. Being able to turn off the coercion when running /usr/bin/python3 is the quickest way to replicate the problems that can be encountered in the production environment.

Couldn't LANG=C.ascii be used in these cases?
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle. Changing version to '26'.
Proposed as a Freeze Exception for 26-alpha by Fedora user cstratak using the blocker tracking app because: As the scheduling didn't work out, I will have to request a freeze exception for https://fedoraproject.org/wiki/Changes/python3_c.utf-8_locale in order to be tested extensively.
python3-3.6.0-21.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-904a280f5f
Discussed in today's Blocker Review meeting. FESCo has declared this an FE: https://meetbot-raw.fedoraproject.org/teams/fesco/fesco.2017-03-10-16.02.html
python3-3.6.0-21.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-904a280f5f
autoconf-archive-2016.09.16-3.fc26 python3-3.6.0-21.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-904a280f5f
autoconf-archive-2016.09.16-3.fc26, python3-3.6.0-21.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-904a280f5f
autoconf-archive-2016.09.16-3.fc26, python3-3.6.0-21.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.
Progress update on the upstream PEP: it's getting pretty close to acceptance, but the latest round of reviews prompted a change to make the behaviour more consistent between the "locale coercion" case and the "explicit locale configuration" case.

Specifically, instead of calling Py_SetStandardStreamEncoding from the standalone CLI, the language runtime initialization now automatically uses "surrogateescape" on the standard streams whenever the configured locale is one of the potential coercion target locales (similar to the way the default error handler for the standard streams has been set to "surrogateescape" in the C locale since Python 3.5).

As a non-functional change, the upstream patch has also been refactored to move all of the implementation details into the shared library, with just a couple of private API functions accessed from the standalone CLI implementation.

The changes for both of those relative to the previous patch: https://github.com/ncoghlan/cpython/commit/188e7807b6d9e49377aacbb287c074e5cabf70c5

The bulk of the related specification changes are in https://github.com/python/peps/commit/2fb53e7c1bbb04e1321bca11cc0112aec69f6398 with an important clarification in https://github.com/python/peps/commit/4067701b851031b10a37300375735f8489afb4e6 to note that the handling of sys.stderr isn't changed (that continues to use "backslashreplace" as its error handler, regardless of locale).

With the Beta freeze only a week away, it probably makes sense to update the backport ASAP - the only remaining open question upstream is an idea we wouldn't backport to 3.6 anyway (specifically, I've suggested we consider making the private configuration API in the current patch a public API in Python 3.7).
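For readers unfamiliar with the two error handlers mentioned here, this small sketch shows what they do at the codec level. It uses only the standard `bytes.decode` and `str.encode` methods. The helper names are hypothetical, and the sample bytes are an arbitrary Latin-1 string chosen because it is not valid UTF-8.

```python
def decode_stream_bytes(raw):
    """Decode as UTF-8, smuggling undecodable bytes through as surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_stream_text(text):
    """Encode as UTF-8, turning escaped surrogates back into raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


def render_for_stderr(text):
    """Encode as UTF-8, showing unencodable characters as backslash escapes."""
    return text.encode("utf-8", errors="backslashreplace")


if __name__ == "__main__":
    raw = b"caf\xe9"  # Latin-1 encoded, so the last byte is invalid UTF-8
    text = decode_stream_bytes(raw)
    print(ascii(text))               # the bad byte becomes a lone surrogate
    print(encode_stream_text(text))  # round-trips to the original bytes
    print(render_for_stderr(text))   # surrogate rendered as an escape
```

The round trip is what lets misencoded data pass from stdin to stdout unchanged, while "backslashreplace" keeps diagnostic output printable instead of raising an encoding error.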
(In reply to Nick Coghlan from comment #27)
> Progress update on the upstream PEP: it's getting pretty close to
> acceptance, but the latest round of reviews prompted a change to make the
> behaviour more consistent between the "locale coercion" case and the
> "explicit locale configuration case".
>
> Specifically, instead of calling Py_SetStandardStreamEncoding from the
> standalone CLI, the language runtime initialization now automatically uses
> "surrogateescape" on the standard streams whenever the configured locale is
> one of the potential coercion target locales (similar to the way the default
> error handler for the standard streams has been set to "surrogateescape" in
> the C locale since Python 3.5).
>
> As a non-functional change, the upstream patch has also been refactored to
> move all of the implementation details into the shared library, with just a
> couple of private API functions accessed from the standalone CLI
> implementation.
>
> The changes for both of those relative to the previous patch:
> https://github.com/ncoghlan/cpython/commit/
> 188e7807b6d9e49377aacbb287c074e5cabf70c5
>
> The bulk of the related specification changes are in
> https://github.com/python/peps/commit/
> 2fb53e7c1bbb04e1321bca11cc0112aec69f6398 with an important clarification in
> https://github.com/python/peps/commit/
> 4067701b851031b10a37300375735f8489afb4e6 to note that the handling of
> sys.stderr isn't change (that continues to use "backslashreplace" as its
> error handler, regardless of locale)
>
> With the Beta freeze only a week away, it probably makes sense to update the
> backport ASAP - the only remaining open question upstream is an idea we
> wouldn't backport to 3.6 anyway (specifically, I've suggested we consider
> making the private configuration API in the current patch a public API in
> Python 3.7)

Thanks for the update Nick. I'll update the backport as soon as possible.
Commits have been pushed [0][1] and builds have been created [2][3][4] for F26 and rawhide

[0] http://pkgs.fedoraproject.org/cgit/rpms/python3.git/commit/?id=31fe33b583978fe0075269527bf0eba08d233db9
[1] http://pkgs.fedoraproject.org/cgit/rpms/python3.git/commit/?h=f26&id=8fbcd4d716c245670956a0d1a0f10f9a79d41888
[2] https://koji.fedoraproject.org/koji/buildinfo?buildID=887359
[3] https://koji.fedoraproject.org/koji/buildinfo?buildID=887360
[4] https://bodhi.fedoraproject.org/updates/FEDORA-2017-0a50d556e2
Thanks. Based on Inada-san's comments, there's likely to be at least one more notable change in the upstream version: changing the locale coercion to only set LANG and LC_CTYPE without setting the LC_ALL override. Context for that: https://mail.python.org/pipermail/python-dev/2017-May/147896.html

There are a couple of additional changes being considered upstream for 3.7 (removing the coercion warning in 3.8, exposing the legacy locale detection and coercion as a public CPython API), but neither of those would affect the Fedora 3.6 backport.
OK, I've pushed the code update to my sandbox branch that changes the locale coercion to always respect LC_ALL rather than attempting to override it: https://github.com/ncoghlan/cpython/commit/476a78133c94d82e19b89f50036cecd9b4214e7a

If you set LC_ALL=C, CPython won't attempt to change it, but will complain about it.

If you set PYTHONCOERCECLOCALE=0, CPython not only won't change the C locale, but won't complain about it either.

Since the locale coercion now only sets LC_CTYPE & LANG, that means it also respects other explicitly set locale categories (like LC_TIME and LC_MONETARY).

I'll be pushing the corresponding update to the PEP itself shortly.
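The rules in this comment can be sketched as a function over an environment mapping. This is an illustrative model only, not the C implementation. The name `coerce_c_locale` and the action labels are hypothetical, C.UTF-8 is assumed to be the only coercion target, and the POSIX lookup order for the LC_CTYPE setting is assumed.

```python
def coerce_c_locale(env):
    """Return (new_env, action) following the rules described above.

    action is one of:
      "disabled" - PYTHONCOERCECLOCALE=0, so no change and no complaint
      "none"     - the locale is not the C locale, nothing to do
      "warn"     - LC_ALL forces the C locale: respected, but complained about
      "coerce"   - LC_CTYPE and LANG are set to C.UTF-8
    """
    if env.get("PYTHONCOERCECLOCALE") == "0":
        return dict(env), "disabled"
    selected = next(
        (env[name] for name in ("LC_ALL", "LC_CTYPE", "LANG") if env.get(name)),
        "C",
    )
    if selected != "C":
        return dict(env), "none"
    if env.get("LC_ALL"):
        # An explicit LC_ALL=C override is never replaced.
        return dict(env), "warn"
    coerced = dict(env)
    coerced["LC_CTYPE"] = "C.UTF-8"
    coerced["LANG"] = "C.UTF-8"
    return coerced, "coerce"
```

Because only LC_CTYPE and LANG are written, an entry such as `LC_TIME=de_DE.UTF-8` in the input mapping is carried over unchanged.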
The published PEP has been updated and a new python-dev review thread started: https://mail.python.org/pipermail/python-dev/2017-May/147904.html
python3-3.6.1-6.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-1e3062e0d6
python3-3.6.1-6.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-1e3062e0d6
python3-3.6.1-6.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.
I'm happy to report that PEP 538 is now accepted upstream: https://mail.python.org/pipermail/python-dev/2017-May/148035.html

There's one change relative to the behaviour of the downstream patch in the Fedora 26 Beta: upstream CPython will *only* set LC_CTYPE, and leave LANG alone. The rationale for that final change is here: https://www.python.org/dev/peps/pep-0538/#avoiding-setting-lang-for-utf-8-locale-coercion

Rather than filing a new issue, I figure it makes sense to just set this one back to assigned, and then run it through the update cycle again.
The latest upstream implementation removes the warning when the locale has been coerced, thus proposing a freeze exception for the new build of python3.
Proposed as a Freeze Exception for 26-final by Fedora user cstratak using the blocker tracking app because: Update to the latest upstream implementation of PEP 538 (rhbz#1432866) which removed the warning about locale coercion to stderr.
+1 FE, for me.
python3-3.6.1-8.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-6b4e8ce90d
python3-3.6.1-8.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-6b4e8ce90d
python3-3.6.1-8.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.