Bug 1687171 - x86_64 FTBFS when building on new xeon cpu
Summary: x86_64 FTBFS when building on new xeon cpu
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: python34
Version: epel7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Petr Viktorin (pviktori)
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-10 17:06 UTC by Tuomo Soini
Modified: 2019-09-17 14:17 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-17 14:17:59 UTC
Type: Bug
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Python 21131 0 None None None 2019-09-03 13:47:23 UTC
Red Hat Bugzilla 1687167 0 unspecified CLOSED x86_64 FTBFS when building on new xeon cpu 2021-02-22 00:41:40 UTC

Internal Links: 1687167

Description Tuomo Soini 2019-03-10 17:06:11 UTC
A test fails on x86_64 when building in mock on a system running on a new Intel Xeon CPU.

host system:

model name	: Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz

guest running mock:

model name	: Intel Xeon Processor (Skylake, IBRS)

This happens every time, with both python34 and python36.

BUILDSTDERR: test_unregister (test.test_faulthandler.FaultHandlerTests) ... test test_faulthandler failed
ok
======================================================================
FAIL: test_register_chain (test.test_faulthandler.FaultHandlerTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILD/Python-3.4.9/Lib/test/test_faulthandler.py", line 602, in test_register_chain
    self.check_register(chain=True)
  File "/builddir/build/BUILD/Python-3.4.9/Lib/test/test_faulthandler.py", line 586, in check_register
    self.assertEqual(exitcode, 0)
AssertionError: -11 != 0
----------------------------------------------------------------------
Ran 32 tests in 17.006s
FAILED (failures=1)
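
The -11 in the assertion above follows the convention of Python's subprocess module on POSIX: a negative return code -N means the child process was killed by signal N, and signal 11 is SIGSEGV on Linux. A minimal sketch of decoding such a value, assuming Python 3.5 or newer for signal.Signals (the helper name describe_exitcode is made up for illustration and is not part of the test suite):

```python
import signal


def describe_exitcode(exitcode):
    """Return a human-readable description of a subprocess return code."""
    if exitcode < 0:
        # POSIX convention used by subprocess: -N means "killed by signal N".
        try:
            name = signal.Signals(-exitcode).name
        except ValueError:
            # Unknown signal number on this platform.
            name = "signal %d" % -exitcode
        return "killed by " + name
    return "exited with status %d" % exitcode


if __name__ == "__main__":
    print(describe_exitcode(-11))
```

On Linux x86_64 this reports SIGSEGV for -11, which matches the segmentation fault seen later in this report.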

Patch to fix the issue:

diff --git a/python34.spec b/python34.spec
index c107313..7711006 100644
--- a/python34.spec
+++ b/python34.spec
@@ -1506,7 +1506,7 @@ CheckPython() {
     -x test_ensurepip \
     -x test_venv \
     %endif
-    %ifarch %{power64} aarch64
+    %ifarch %{power64} aarch64 x86_64
     -x test_faulthandler \
     %endif
     %ifarch %{power64} s390 s390x armv7hl aarch64

Comment 1 Victor Stinner 2019-03-12 17:16:00 UTC
Well, skipping the test is one option, but as the author of the faulthandler module, I would prefer to see the test passing :-)

Can you please copy and paste the following code into a script file and run "python3 script.py"?

Script:
---
import faulthandler, signal, os

def handler(signum, frame):
    handler.called = True
handler.called = False

faulthandler.enable()

signum = signal.SIGUSR1
signal.signal(signum, handler)
faulthandler.register(signum, chain=True)
os.kill(os.getpid(), signum)
print("called", handler.called)
---

Expected output:
---
Current thread 0x00007fc3ad534600 (most recent call first):
  File "script.py", line 12 in <module>
called True
---

If you get a segfault, there is a bug somewhere.

Maybe also try the script with Python 3.6 or 3.7 on the same CPU.

test_faulthandler is tested upstream on a wide range of architectures. I'm not aware of any failure, at least on Python 3.7 and 3.8 (dev version).
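
To compare several interpreters on the same machine, the manual check above could be automated roughly as follows. This is only a sketch under assumptions: the run_repro helper is hypothetical, the runner itself needs Python 3.5 or newer for subprocess.run (the interpreter under test can be older, for example python3.4), and the embedded code repeats the same steps as the script above.

```python
import subprocess
import sys
import textwrap

# Same steps as the reproduction script, kept in a string so that it can be
# executed in a child process and a crash does not take down the runner.
REPRO = textwrap.dedent("""
    import faulthandler, signal, os

    def handler(signum, frame):
        handler.called = True
    handler.called = False

    faulthandler.enable()
    signum = signal.SIGUSR1
    signal.signal(signum, handler)
    faulthandler.register(signum, chain=True)
    os.kill(os.getpid(), signum)
    print("called", handler.called)
""")


def run_repro(python=sys.executable):
    """Run the reproduction with the given interpreter.

    Returns (returncode, stdout, stderr). A negative return code means the
    child was killed by a signal, e.g. -11 for SIGSEGV on Linux.
    """
    proc = subprocess.run(
        [python, "-c", REPRO],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Optional arguments: interpreters to test, e.g. python3.4 python3.6
    for python in sys.argv[1:] or [sys.executable]:
        returncode, out, err = run_repro(python)
        print(python, "returncode:", returncode, "stdout:", out.strip())
```

A return code of 0 together with "called True" on stdout corresponds to the expected output; a negative return code indicates that the child was killed by a signal.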

Comment 2 Tuomo Soini 2019-03-12 18:13:48 UTC
$ python3.4 script.py 
Current thread 0x00007ff7e72e5740 (most recent call first):
  File "script.py", line 12 in <module>
called True
Fatal Python error: Segmentation fault

Current thread 0x00007ff7e72e5740 (most recent call first):
Segmentation fault

$ python3.6 script.py 
Current thread 0x00007f5c1b0a6740 (most recent call first):
  File "script.py", line 12 in <module>
called True
Fatal Python error: Segmentation fault

Current thread 0x00007f5c1b0a6740 (most recent call first):
Segmentation fault

Comment 3 Tuomo Soini 2019-03-12 18:18:03 UTC
On the other hand, when run on the KVM host, it works, so this might be an issue with the KVM guest?

$ python3.6 script.py 
Current thread 0x00007f9107e55740 (most recent call first):
  File "script.py", line 12 in <module>
called True

Comment 4 Tuomo Soini 2019-03-12 18:22:41 UTC
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Server-IBRS</model>
    <vendor>Intel</vendor>
    <feature policy='disable' name='ds'/>
    <feature policy='disable' name='acpi'/>
    <feature policy='require' name='ss'/>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='tm'/>
    <feature policy='disable' name='pbe'/>
    <feature policy='disable' name='dtes64'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='ds_cpl'/>
    <feature policy='disable' name='vmx'/>
    <feature policy='disable' name='smx'/>
    <feature policy='disable' name='est'/>
    <feature policy='disable' name='tm2'/>
    <feature policy='disable' name='xtpr'/>
    <feature policy='disable' name='pdcm'/>
    <feature policy='disable' name='dca'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='disable' name='tsc_adjust'/>
    <feature policy='require' name='clflushopt'/>
    <feature policy='require' name='pku'/>
    <feature policy='disable' name='ospke'/>
    <feature policy='require' name='stibp'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='arat'/>
  </cpu>

This is the CPU config of the VM, which is a copy of the host's.

Comment 5 Tuomo Soini 2019-03-12 20:33:32 UTC
Changing the CPU type of the VM to Skylake-Client-IBRS works around the issue, so I guess this is a bug in virtualization.
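
One way to narrow down which CPU feature makes the difference would be to compare the "flags" lines of /proc/cpuinfo from the working and the failing guest configuration. A rough sketch, assuming the x86 /proc/cpuinfo layout; the helper names and the sample input are made up for illustration and are not taken from this report:

```python
def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of the text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def diff_flags(working_text, failing_text):
    """Return (only_in_working, only_in_failing) as sorted lists of flags."""
    working = parse_flags(working_text)
    failing = parse_flags(failing_text)
    return sorted(working - failing), sorted(failing - working)


if __name__ == "__main__":
    # Made-up sample input; real input would be the contents of
    # /proc/cpuinfo saved from each guest configuration.
    working_sample = "model name\t: example A\nflags\t\t: fpu sse2 avx2\n"
    failing_sample = "model name\t: example B\nflags\t\t: fpu sse2 avx2 avx512f\n"
    print(diff_flags(working_sample, failing_sample))
```

Flags that appear only in the failing configuration would be the first candidates to disable one by one in the guest CPU definition.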

Comment 6 Victor Stinner 2019-03-20 18:20:00 UTC
It's still unclear to me whether it's a bug in Python or in the virtualization :-(

Comment 7 Victor Stinner 2019-09-03 13:47:23 UTC
This bug has been fixed in Python 3.7, 3.8 and future 3.9: https://bugs.python.org/issue21131

Comment 8 Petr Viktorin (pviktori) 2019-09-17 14:17:59 UTC
Unfortunately, backporting this is not high enough on my priority list. To rebuild in Mock, please apply the patch locally.

