Bug 1687171

Summary: x86_64 FTBFS when building on new xeon cpu
Product: [Fedora] Fedora EPEL
Reporter: Tuomo Soini <tis>
Component: python34
Assignee: Petr Viktorin <pviktori>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: epel7
CC: carl, cstratak, kevin, mhroncok, pviktori, python-sig, TicoTimo, torsava, vstinner
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2019-09-17 14:17:59 UTC
Type: Bug

Description Tuomo Soini 2019-03-10 17:06:11 UTC
A test fails on x86_64 when building in mock on a system that runs on a new Intel Xeon CPU.

host system:

model name	: Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz

guest running mock:

model name	: Intel Xeon Processor (Skylake, IBRS)

This happens every time, with both python34 and python36.

BUILDSTDERR: test_unregister (test.test_faulthandler.FaultHandlerTests) ... test test_faulthandler failed
FAIL: test_register_chain (test.test_faulthandler.FaultHandlerTests)
Traceback (most recent call last):
  File "/builddir/build/BUILD/Python-3.4.9/Lib/test/test_faulthandler.py", line 602, in test_register_chain
  File "/builddir/build/BUILD/Python-3.4.9/Lib/test/test_faulthandler.py", line 586, in check_register
    self.assertEqual(exitcode, 0)
AssertionError: -11 != 0
Ran 32 tests in 17.006s
FAILED (failures=1)
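
For reference, the `-11` in the assertion above follows the convention used by Python's `subprocess` module: a negative return code `-N` means the child process was killed by signal `N`, and signal 11 is SIGSEGV on Linux. A minimal sketch of decoding such a value (`describe_exit` is a hypothetical helper for illustration, not part of the test suite):

```python
import signal


def describe_exit(returncode):
    """Describe a subprocess-style return code.

    A negative value -N means the child was terminated by signal N.
    """
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


# The exit code reported by check_register in the log above.
print(describe_exit(-11))
print(describe_exit(0))
```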

Patch to fix the issue:

diff --git a/python34.spec b/python34.spec
index c107313..7711006 100644
--- a/python34.spec
+++ b/python34.spec
@@ -1506,7 +1506,7 @@ CheckPython() {
     -x test_ensurepip \
     -x test_venv \
-    %ifarch %{power64} aarch64
+    %ifarch %{power64} aarch64 x86_64
     -x test_faulthandler \
     %ifarch %{power64} s390 s390x armv7hl aarch64

Comment 1 Victor Stinner 2019-03-12 17:16:00 UTC
Well, skipping the test is one option, but as the author of the faulthandler module, I would prefer to see the test passing :-)

Can you please try copying and pasting the following code into a script file and running "python3 script.py"?

import faulthandler, signal, os

def handler(signum, frame):
    handler.called = True
handler.called = False


signum = signal.SIGUSR1
signal.signal(signum, handler)
faulthandler.register(signum, chain=True)
os.kill(os.getpid(), signum)
print("called", handler.called)

Expected output:
Current thread 0x00007fc3ad534600 (most recent call first):
  File "script.py", line 12 in <module>
called True

If you get a segfault, there is a bug somewhere.

Maybe also try the script with Python 3.6 or 3.7 on the same CPU.

test_faulthandler is tested upstream on a wide range of architectures. I'm not aware of any failure, at least on Python 3.7 and 3.8 (dev version).
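
To compare interpreters without reading the shell output by eye, the script can be run in a child process and its return code inspected. This is only a sketch under stated assumptions: `run_reproducer` is a hypothetical helper, it needs a POSIX system (SIGUSR1 and `faulthandler.register` are not available on Windows), and it uses the current interpreter unless another executable such as `python3.6` is passed in.

```python
import os
import subprocess
import sys
import tempfile

# The reproducer from this comment, unchanged.
REPRODUCER = '''\
import faulthandler, signal, os

def handler(signum, frame):
    handler.called = True
handler.called = False


signum = signal.SIGUSR1
signal.signal(signum, handler)
faulthandler.register(signum, chain=True)
os.kill(os.getpid(), signum)
print("called", handler.called)
'''


def run_reproducer(python=sys.executable):
    """Run the reproducer under `python`; return (returncode, stdout, stderr)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        with open(path, "w") as f:
            f.write(REPRODUCER)
        proc = subprocess.run(
            [python, path],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    return proc.returncode, proc.stdout, proc.stderr


returncode, out, err = run_reproducer()
# A negative return code means the child was killed by that signal number.
print("return code:", returncode)
print(out, end="")
```

A return code of 0 together with "called True" on stdout matches the expected output; a negative return code indicates the child was killed by a signal. The traceback printed by faulthandler goes to stderr.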

Comment 2 Tuomo Soini 2019-03-12 18:13:48 UTC
$ python3.4 script.py 
Current thread 0x00007ff7e72e5740 (most recent call first):
  File "script.py", line 12 in <module>
called True
Fatal Python error: Segmentation fault

Current thread 0x00007ff7e72e5740 (most recent call first):
Segmentation fault

$ python3.6 script.py 
Current thread 0x00007f5c1b0a6740 (most recent call first):
  File "script.py", line 12 in <module>
called True
Fatal Python error: Segmentation fault

Current thread 0x00007f5c1b0a6740 (most recent call first):
Segmentation fault

Comment 3 Tuomo Soini 2019-03-12 18:18:03 UTC
On the other hand, when run on the KVM host, it works, so this might be an issue with the KVM guest?

$ python3.6 script.py 
Current thread 0x00007f9107e55740 (most recent call first):
  File "script.py", line 12 in <module>
called True

Comment 4 Tuomo Soini 2019-03-12 18:22:41 UTC
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Server-IBRS</model>
    <feature policy='disable' name='ds'/>
    <feature policy='disable' name='acpi'/>
    <feature policy='require' name='ss'/>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='tm'/>
    <feature policy='disable' name='pbe'/>
    <feature policy='disable' name='dtes64'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='ds_cpl'/>
    <feature policy='disable' name='vmx'/>
    <feature policy='disable' name='smx'/>
    <feature policy='disable' name='est'/>
    <feature policy='disable' name='tm2'/>
    <feature policy='disable' name='xtpr'/>
    <feature policy='disable' name='pdcm'/>
    <feature policy='disable' name='dca'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='disable' name='tsc_adjust'/>
    <feature policy='require' name='clflushopt'/>
    <feature policy='require' name='pku'/>
    <feature policy='disable' name='ospke'/>
    <feature policy='require' name='stibp'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='arat'/>

This is the CPU config on the VM, which is copied from the host.

Comment 5 Tuomo Soini 2019-03-12 20:33:32 UTC
Changing the CPU type of the VM to Skylake-Client-IBRS works around the issue, so I guess this is a bug in virtualization.

Comment 6 Victor Stinner 2019-03-20 18:20:00 UTC
It's still unclear to me if it's a bug in Python or in the virtualization :-(

Comment 7 Victor Stinner 2019-09-03 13:47:23 UTC
This bug has been fixed in Python 3.7, 3.8 and future 3.9: https://bugs.python.org/issue21131

Comment 8 Petr Viktorin 2019-09-17 14:17:59 UTC
Unfortunately, backporting this is not high enough on my priority list. To rebuild in Mock, please apply the patch locally.