Bug 1585201

Summary: The python version shipping with RHE7 is miscompiled
Product: Red Hat Enterprise Linux 7
Component: python
Version: 7.4
Hardware: All
OS: Linux
Status: CLOSED NOTABUG
Severity: medium
Priority: unspecified
Target Milestone: rc
Type: Bug
Reporter: Piyush Bhoot <pbhoot>
Assignee: Victor Stinner <vstinner>
QA Contact: BaseOS QE - Apps <qe-baseos-apps>
CC: hhorak, mplch, pbhoot, pviktori, torsava, vstinner
Flags: pbhoot: needinfo-
Last Closed: 2018-10-16 12:37:56 UTC

Description Piyush Bhoot 2018-06-01 13:43:19 UTC
Description of problem:

Red Hat's build of _struct.so is not linked against libpython2.7 even though it requires symbols defined there; that is, it has been linked incorrectly, leaving those symbols unresolved.


Version-Release number of selected component (if applicable):
python-2.7.5-68.el7.x86_64
python-libs-2.7.5-68.el7.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Save the following as pythontest.c:
#include <dlfcn.h>

int main(int argc, char *argv[])
{
    void *pylib = dlopen("libpython2.7.so.1.0", RTLD_LOCAL | RTLD_NOW);
    void (*Py_Initialize)(void) = dlsym(pylib, "Py_Initialize");
    Py_Initialize();
    int (*PyRun_SimpleStringFlags)(const char *, void *) = dlsym(pylib, "PyRun_SimpleStringFlags");
    PyRun_SimpleStringFlags("import json\n", 0);
    return 0;
}

2. Compile with "gcc -Wall -o pythontest pythontest.c -ldl -g"

3. Run ./pythontest

Actual results:

The program fails with:
ImportError: /usr/lib64/python2.7/lib-dynload/_struct.so: undefined symbol: PyFloat_Type

Expected results:
No error

Additional info:
(Optionally) change RTLD_LOCAL to RTLD_GLOBAL and see that it works. This indicates that Python fails to ensure that the symbols it needs are actually available in the appropriate symbol namespace.

- The reproducer runs correctly with the arm build of the same Python version: set
LD_LIBRARY_PATH=/arm/tools/python/python/2.7.5/rhe7-x86_64/lib/ and see that it works (no errors printed)
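
The RTLD_LOCAL / RTLD_GLOBAL comparison described above can be sketched as a single helper that takes the dlopen flags as a parameter and checks every dlopen/dlsym result, which the original reproducer does not. This is a minimal sketch, not part of the original report: the name run_python_statement and its return convention are hypothetical; only the library name and the two Python symbols come from the reproducer.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not from the report): load `libname` with the given
 * dlopen flags, initialize Python and run one statement.
 * Returns 0 on success and a nonzero value on any failure. */
int run_python_statement(const char *libname, int flags, const char *code)
{
    void *pylib = dlopen(libname, flags);
    if (pylib == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* Look up the second symbol only if the first lookup succeeded, so that
     * dlerror() always describes the lookup that actually failed. */
    void *init_sym = dlsym(pylib, "Py_Initialize");
    void *run_sym = init_sym ? dlsym(pylib, "PyRun_SimpleStringFlags") : NULL;
    if (run_sym == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(pylib);
        return 1;
    }

    void (*py_initialize)(void) = (void (*)(void))init_sym;
    int (*py_run)(const char *, void *) = (int (*)(const char *, void *))run_sym;

    py_initialize();
    /* PyRun_SimpleStringFlags returns 0 on success, -1 if an exception was raised. */
    return py_run(code, NULL);
}
```

Calling run_python_statement("libpython2.7.so.1.0", RTLD_LOCAL | RTLD_NOW, "import json\n") from main corresponds to the failing case in this report; per the description, passing RTLD_GLOBAL | RTLD_NOW instead is expected to import json without the ImportError. Build it with -ldl, as in step 2.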

Comment 2 Victor Stinner 2018-08-02 15:07:14 UTC
I compiled Python 2.7 manually on Fedora 28 using:

   ./configure --enable-shared --prefix=/opt/py27shared
   make install

Using that, _struct.so is correctly linked to libpython:

$ ldd /opt/py27shared/lib/python2.7/lib-dynload/_struct.so 
	libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007f399179f000)
	...

It's not the case for the _struct.so installed by python2-libs-2.7.15-2.fc28.x86_64:

$ ldd /usr/lib64/python2.7/lib-dynload/_struct.so
	linux-vdso.so.1 (0x00007fff414e1000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb8de363000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fb8ddfa4000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fb8de78e000)

I looked at x86_64 build logs of python2-2.7.15-7.el8+7. C compiler flags look fine...

gcc
 -DNDEBUG
 -D_GNU_SOURCE
 -I .
 -I /builddir/build/BUILD/Python-2.7.15/Include
 -I Include
 -O2
 -Wall
 -Werror=format-security
 -Wp,-D_FORTIFY_SOURCE=2
 -Wp,-D_GLIBCXX_ASSERTIONS
 -c /builddir/build/BUILD/Python-2.7.15/Modules/_struct.c
 -fPIC
 -fasynchronous-unwind-tables
 -fcf-protection
 -fexceptions
 -fno-strict-aliasing
 -fstack-clash-protection
 -fstack-protector-strong
 -fwrapv
 -g
 -grecord-gcc-switches
 -m64
 -mtune=generic
 -o Modules/_struct.o
 -pipe
 -pthread
 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1
 -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1

... but the linker uses "-Wl,-z,now -Wl,-z,relro" which can explain the missing libpython dependency on the .so file:

gcc
 -Wl,-z,now
 -Wl,-z,relro
 -pthread
 -shared
 -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
 Modules/_struct.o
 -o Modules/_struct.so

Comment 4 Petr Viktorin (pviktori) 2018-08-22 13:22:57 UTC
Python modules such as _struct should link to libpython2.7.so.1.0. That they don't is a bug.


However, I don't think the reproducer (loading libpython2.7.so.1.0 with RTLD_LOCAL) should be expected to work.
For reference, from `man dlopen`:

  RTLD_LOCAL
   This is the converse of RTLD_GLOBAL, and the default if neither flag is
   specified.  Symbols defined in this shared object are not made available to
   resolve references in subsequently loaded shared objects.

Python modules (including internal ones) can't work without symbols from libpython2.7.so, and generally expect that Python is loaded before they are imported.

What's the use case behind loading libpython2.7.so with RTLD_LOCAL?
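
The scope rule quoted from the man page can be observed directly. The sketch below is illustrative only (the helper name visible_in_global_scope is hypothetical): it loads a library with the given flags and then asks whether one of its symbols can be found through the main program's handle, which searches the main program, its startup dependencies, and libraries loaded with RTLD_GLOBAL. An extension module that has no explicit dependency on libpython has its undefined symbols resolved through this same global scope.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load `libname` with `flags`, then report whether
 * `symbol` can be resolved through the global symbol scope.
 * Returns 1 if visible, 0 if not visible, -1 if a dlopen call fails. */
int visible_in_global_scope(const char *libname, int flags, const char *symbol)
{
    /* The library is intentionally left loaded so the lookup below is meaningful. */
    void *lib = dlopen(libname, flags);
    if (lib == NULL)
        return -1;

    /* dlopen(NULL, ...) returns a handle for the main program. dlsym on that
     * handle searches the main program, the objects loaded at startup, and
     * objects loaded with RTLD_GLOBAL, but not objects loaded with RTLD_LOCAL. */
    void *global = dlopen(NULL, RTLD_NOW);
    if (global == NULL)
        return -1;

    int found = dlsym(global, symbol) != NULL;
    dlclose(global);
    return found;
}
```

Assuming a process that has not otherwise loaded libpython, visible_in_global_scope("libpython2.7.so.1.0", RTLD_LOCAL | RTLD_NOW, "PyFloat_Type") would be expected to report the symbol as not visible, which matches the ImportError in the description. Note that a library loaded with RTLD_GLOBAL stays global for the lifetime of the process, so the RTLD_LOCAL case has to be checked first.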

Comment 5 Victor Stinner 2018-08-31 15:26:03 UTC
If I understood correctly, the root issue is the -Wl,-z,now flag passed to LDFLAGS when the _struct.so dynamic module of Python is built (I'm not 100% sure).

Comment 7 Victor Stinner 2018-09-26 17:10:32 UTC
Oh, I think that I identified the root issue. The RPM package modifies the Python 2.7 build system so that some C extensions, like _struct, are built as shared libraries by the Makefile. Sadly, there is a bug in Python: with such a configuration, the generated shared library is not linked to libpython2.7.

I reported the bug upstream and I proposed a pull request to fix it:
https://bugs.python.org/issue34814

Comment 8 Victor Stinner 2018-10-16 12:37:56 UTC
According to the discussion at Python upstream ( https://bugs.python.org/issue34814 ), C extensions should not be linked to libpython, and the dynamic linker should find symbols needed by C extensions in the current process. It seems that you must use RTLD_GLOBAL to load libpython for your use case.

Loading libpython with RTLD_LOCAL is not supported in Python. If you really want to use RTLD_LOCAL, you should modify Python to build the extensions that you need as builtin modules rather than extensions (.so libraries). But I'm not sure that it's doable for all C extensions of CPython.

I'm closing the bug as "not a bug" since RTLD_LOCAL doesn't work by design: it's a deliberate choice of Python upstream.

Comment 9 Petr Viktorin (pviktori) 2018-10-16 14:20:42 UTC
Also, why does the customer need to link with RTLD_LOCAL?
Can we help with the underlying use case?