Bug 2026858 - Systemtap probe asm causes GCC to crash on armv7
Summary: Systemtap probe asm causes GCC to crash on armv7
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: systemtap
Version: 35
Hardware: armv7hl
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Stan Cox
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
Reported: 2021-11-26 09:34 UTC by Daniel Berrangé
Modified: 2022-06-21 18:21 UTC (History)

Fixed In Version: systemtap-4.7-1.fc35
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-21 18:21:23 UTC
Type: Bug
Embargoed:



Description Daniel Berrangé 2021-11-26 09:34:08 UTC
Description of problem:
Compiling QEMU on either F35 or Rawhide on armv7 results in a compiler crash in GCC:

gcc -Ilibqemuutil.a.p -I. -I.. -Isubprojects/libvhost-user
-I../subprojects/libvhost-user -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/glib-2.0
-I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4 -I/usr/include/libmount
-I/usr/include/blkid -I/usr/include/gio-unix-2.0 -I/usr/include/p11-kit-1
-I/usr/include/pixman-1 -fdiagnostics-color=auto -Wall -Winvalid-pch -std=gnu11 -O2 -g
-isystem /builddir/build/BUILD/qemu-6.1.0/linux-headers -isystem linux-headers -iquote .
-iquote /builddir/build/BUILD/qemu-6.1.0 -iquote /builddir/build/BUILD/qemu-6.1.0/include
-iquote /builddir/build/BUILD/qemu-6.1.0/disas/libvixl -iquote
/builddir/build/BUILD/qemu-6.1.0/tcg/arm -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2
-D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
-Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing
-fno-common -fwrapv -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall
-Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS
-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -march=armv7-a
-mfpu=vfpv3-d16 -mtune=generic-armv7-a -mabi=aapcs-linux -mfloat-abi=hard
-Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security
-Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels
-Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs
-Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -DSTAP_SDT_V2 -fPIE -MD -MQ
libqemuutil.a.p/util_vfio-helpers.c.o -MF libqemuutil.a.p/util_vfio-helpers.c.o.d -o
libqemuutil.a.p/util_vfio-helpers.c.o -c ../util/vfio-helpers.c
  during RTL pass: mach
  ../util/vfio-helpers.c: In function 'qemu_vfio_open_pci':
  ../util/vfio-helpers.c:523:1: internal compiler error: in create_fix_barrier, at
config/arm/arm.c:17891
    523 | }
        | ^
  Please submit a full bug report,
  with preprocessed source if appropriate.
  See <http://bugzilla.redhat.com/bugzilla> for instructions.
  Preprocessed source stored into /tmp/ccOFlWwZ.out file, please attach this to your
bugreport.

For example, see:

   Rawhide: https://koji.fedoraproject.org/koji/taskinfo?taskID=79160330
   F35:     https://koji.fedoraproject.org/koji/taskinfo?taskID=79268250
    https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/GD3ABSWD6HHTNEKV2EJY4PXABQ245UCZ/

Debugging by GCC maintainers revealed that it crashes in an asm block generated by systemtap probes.

The following GCC bug has been filed:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103395

where it turns out that the size of the asm block is what triggers the problem: GCC makes assumptions about the maximum asm block size, and this block violates them.

There's some debate as to whether it should be GCC's problem to fix or SystemTap's, but it appears to have been triggered/exposed by this change in sdt.h:

  https://sourceware.org/git/?p=systemtap.git;a=commit;f=includes/sys/sdt.h;h=eaa15b047688175a94e3ae796529785a3a0af208

but the GCC assumptions appear to have been around for a long time, which points towards a fix in systemtap so that it works with existing GCCs.

In QEMU we've worked around the problem by setting -DSTAP_SDT_ARG_CONSTRAINT=g, but it's unclear whether that's a correct approach: it compiles, but we've not tested probe usage at runtime. GCC maintainers have suggested some possible changes for sdt.h in the above GCC bug.
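
For anyone unfamiliar with how the -D workaround takes effect, here is a minimal sketch, assuming sdt.h only sets a default constraint when the macro is not already defined. All DEMO_* names and both functions are hypothetical and exist only for illustration; this does not include sys/sdt.h and is not the real probe macro. The leading #define stands in for the command-line -D flag.

```c
#include <string.h>

/* Stand-in for passing -DDEMO_ARG_CONSTRAINT=g on the compiler command line. */
#define DEMO_ARG_CONSTRAINT g

/* Hypothetical guarded default, modelled on an #ifndef-style fallback. */
#ifndef DEMO_ARG_CONSTRAINT
# define DEMO_ARG_CONSTRAINT nor
#endif

/* Two-level stringification so the macro is expanded before being quoted. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)

/* Returns the constraint string that ends up in the asm operand list. */
static const char *demo_arg_constraint(void)
{
    return DEMO_STR(DEMO_ARG_CONSTRAINT);
}

/* Hands v to an empty asm statement using the selected constraint ("g" here). */
static int demo_probe_arg(int v)
{
    __asm__ __volatile__("" : : DEMO_STR(DEMO_ARG_CONSTRAINT)(v));
    return v;
}
```

With the override in place demo_arg_constraint() returns "g", which GCC documents as accepting any general register, memory or immediate integer operand. Removing the first #define makes the sketch fall back to "nor". Whether "g" gives usable probe arguments at runtime is a separate question that this sketch does not answer.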



Version-Release number of selected component (if applicable):
DEBUG util.py:446:   gcc                                     armv7hl  11.2.1-1.fc35                        build   26 M
DEBUG util.py:446:   systemtap-sdt-devel                     armv7hl  4.6-1.fc35                           build   71 k


How reproducible:
Unclear. Only seen on armv7, and only tested with QEMU, not with arbitrary apps using systemtap.

Steps to Reproduce:
1. Build QEMU on armv7 with systemtap support enabled

Actual results:
GCC goes boom - see messages in bug description above.

Expected results:
QEMU compiles successfully

Additional info:

Comment 1 Frank Ch. Eigler 2021-11-26 13:15:46 UTC
I'm leaning toward changing the default operand constraint string on armv7 only, so we work around gcc's problem there, without affecting other platforms.

Comment 2 Andrew John Hughes 2021-11-29 20:27:55 UTC
We've also hit this with OpenJDK: https://kojipkgs.fedoraproject.org//work/tasks/286/79390286/build.log

Same workaround seems to have fixed F35: https://src.fedoraproject.org/rpms/java-latest-openjdk/pull-request/89

Comment 3 Frank Ch. Eigler 2021-12-01 16:01:33 UTC
upstream commit 34facf7ee6b4 should show up in a respin soon

Comment 4 William Cohen 2022-06-21 18:21:23 UTC
systemtap-4.7-1.fc35 has the following patch in it so the qemu workaround for armv7 shouldn't be required now:

commit 34facf7ee6b43dae66cc109973a4eda42e439163
Author: Frank Ch. Eigler <fche>
Date:   Wed Dec 1 10:59:27 2021 -0500

    RHBZ2026858: on __arm__ (arm32), use STAP_SDT_ARG_CONSTRAINT = g

diff --git a/includes/sys/sdt.h b/includes/sys/sdt.h
index 9ecb1cb6f..28d236d91 100644
--- a/includes/sys/sdt.h
+++ b/includes/sys/sdt.h
@@ -106,6 +106,8 @@
 # define STAP_SDT_ARG_CONSTRAINT norw
 # elif defined __s390__ || defined __s390x__
 # define STAP_SDT_ARG_CONSTRAINT        norf
+# elif defined __arm__
+# define STAP_SDT_ARG_CONSTRAINT        g
 # else
 # define STAP_SDT_ARG_CONSTRAINT        nor
 # endif

