Bug 1898700

Summary: qemu-kvm for RHEL-8.4 doesn't build due to a possible incompatibility with systemtap-sdt-devel-4.4-1
Product: Red Hat Enterprise Linux 8
Reporter: Danilo de Paula <ddepaula>
Component: qemu-kvm
Assignee: Stefan Hajnoczi <stefanha>
qemu-kvm sub component: General
QA Contact: FuXiangChun <xfu>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: unspecified
CC: berrange, elima, jinzhao, juzhang, rjones, stefanha, virt-maint, xfu
Version: 8.4
Target Milestone: rc
Target Release: 8.0
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-4.2.0-36.module+el8.4.0+8807+0c3dc3b0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-05-25 06:45:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Danilo de Paula 2020-11-17 21:38:28 UTC
qemu-kvm from branch "rhel-8.4.0" doesn't build in RHEL-8.4.0 targets.

When building it in a recent rhel-8.4.0 environment (like using a local nightly checkout) or in the latest buildroot (BUILD_TARGET=rhel-8.4.0-candidate in redhat/Makefile), %check fails with the following error:

MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  QTEST_QEMU_BINARY=s390x-softmmu/qemu-system-s390x QTEST_QEMU_IMG=qemu-img tests/modules-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="modules-test" 
Failed to open module: /builddir/build/BUILD/qemu-4.2.0/s390x-softmmu/../block-curl.so: undefined symbol: qemu_curl_close_semaphore
**
ERROR:tests/libqtest.c:498:qtest_rsp: assertion failed (words[0] == "OK"): ("FAIL" == "OK")
ERROR - Bail out! ERROR:tests/libqtest.c:498:qtest_rsp: assertion failed (words[0] == "OK"): ("FAIL" == "OK")
make: *** [/builddir/build/BUILD/qemu-4.2.0/tests/Makefile.include:918: check-qtest-s390x] Error 1
rm tests/modules-test.o

This went unnoticed because the BUILD_TARGET variable was pointing to an older module build, but it became apparent with the latest module build.

Reproducing it is very easy: just make sure you have the latest rhel-8.4.0 environment, with systemtap-sdt-devel-4.3-4.el8. Then you can run

make -C redhat rh-local
make check

This should trigger the process.

Editing redhat/Makefile and changing BUILD_TARGET to rhel-8.4.0-candidate should also allow you to reproduce the problem in brew with "make -C redhat rh-brew".

This is very high priority as the whole virt:rhel module gets blocked by this.

Comment 2 Danilo de Paula 2020-11-17 22:52:49 UTC
As discussed on IRC, this happens in the %check phase of the build, so it doesn't make sense to install qemu-block-curl in order to test qemu-block-curl.

Just some random info that I found while tracking down the problem: qemu_curl_close_semaphore seems to be defined in a dynamically generated file, block/trace-dtrace.h.
That definition behaves differently when 'STAP_SDT_V1' is defined, which is where the relation to systemtap-sdt-devel comes in.

This sounds like a trace-events related issue that happens to be in the modularization bits. I would put at least Stefan and maybe Daniel on CC.

Comment 3 Daniel Berrangé 2020-11-18 09:14:20 UTC
This brokenness is caused by the latest systemtap release, which changed the way probes are defined.

QEMU is linking the probes to the main binary, but expecting the probe symbols to be visible to modules. This is no longer the case with the latest systemtap. The way QEMU uses tracing with modules was always broken, relying on undefined behaviour. We were lucky not to have problems before, but now it has finally hit us.
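
To make the mismatch concrete, here is a small shell sketch that lists the probe semaphore symbols a module leaves undefined and therefore expects the loading process to provide. The helper names filter_semaphores and undefined_semaphores are made up for illustration, and matching on a "_semaphore" suffix is an assumption based on the symbol name in the error message above; nm -D --undefined-only is standard binutils.

```shell
# Hypothetical helpers, not part of QEMU or systemtap.

# Keep only undefined ("U") dynamic symbols whose name ends in "_semaphore".
# Reads `nm -D --undefined-only` output on stdin.
filter_semaphores() {
    awk '$1 == "U" && $2 ~ /_semaphore$/ { print $2 }'
}

# List the semaphore symbols that the module given as $1 needs
# some other object to define at load time.
undefined_semaphores() {
    nm -D --undefined-only "$1" | filter_semaphores
}

# Example (module name taken from the error message in the description):
#   undefined_semaphores block-curl.so
```

Any name printed this way has to be resolvable when the module is loaded. If the main binary's dynamic symbol table (nm -D --defined-only on the QEMU binary) does not list it, loading the module would be expected to fail in the way the description shows.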

This was discussed with the systemtap maintainers, and they confirmed that they won't be making any changes to deal with this and that QEMU needs to be fixed:

https://bugzilla.redhat.com/show_bug.cgi?id=1869642

The only option right now is to turn off tracing entirely.

Upstream will need to introduce a dedicated "trace-events" file for each loadable module and ensure it is directly linked to that module.

Comment 4 Gerd Hoffmann 2020-11-18 09:26:35 UTC
It's probably tracepoints for modules being in core qemu.
We've seen that before, can't find the bug atm though.  Dan, do you remember?
Which qemu versions will need that at the end of the day?  5.2 only or also older ones?

Comment 5 Stefan Hajnoczi 2020-11-18 11:09:22 UTC
It's probably related to the fact that tracing is not module-aware. 
Trace events are built into the main binary.

We hit issues in the past where the linker does not pull .o files into
the main binary if they are only needed by a module, causing the tracing
symbols to be left out of the binary. I'm unable to find a BZ for this
right now but I thought we were tracking it.

I haven't been able to reproduce the issue locally yet so I can't say for certain whether this is the root cause, but it seems likely.

Comment 6 Danilo de Paula 2020-11-18 14:57:23 UTC
(In reply to Gerd Hoffmann from comment #4)
> It's probably tracepoints for modules being in core qemu.
> We've seen that before, can't find the bug atm though.  Dan, do you remember?
> Which qemu versions will need that at the end of the day?  5.2 only or also
> older ones?

5.2 for RHEL-AV 8.2 and 4.2 for virt:rhel in RHEL

Comment 7 Danilo de Paula 2020-11-18 14:59:22 UTC
Would it be acceptable to disable dtrace completely (or just for the modules) so we can unblock the whole virt module from building?

Comment 8 Daniel Berrangé 2020-11-18 15:09:47 UTC
It isn't possible to disable it just for modules.   Just disabling tracing temporarily for the whole RPM is the only way to move forward.

We must just be sure to treat this bug as a blocker so that tracing gets re-enabled before release.
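
As a rough sketch of what "disabling tracing for the whole build" could look like at the configure level: QEMU's configure script takes --enable-trace-backends=, and the "nop" backend compiles trace events out entirely, so no dtrace probes are generated. How the RPM spec actually passes its configure options is not shown in this report, so the snippet below is an assumption about a plain source-tree build, not the packaging change itself.

```shell
# Sketch only. Assumes the current directory is the top of a QEMU source tree.
# "nop" is QEMU's no-op trace backend: trace events compile to nothing,
# so the dtrace backend and its probe symbols are not used at all.
TRACE_BACKEND=nop

if [ -x ./configure ]; then
    ./configure --enable-trace-backends="$TRACE_BACKEND"
else
    # Outside a source tree, just show the command that would be run.
    echo "no ./configure here; would run: ./configure --enable-trace-backends=$TRACE_BACKEND"
fi
```

Switching TRACE_BACKEND back to dtrace would restore the probes once the module symbol problem is fixed.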

Comment 9 Stefan Hajnoczi 2020-11-19 11:57:54 UTC
Workaround posted here: https://patchew.org/QEMU/20201119112704.837423-1-stefanha@redhat.com/

Comment 12 Danilo de Paula 2020-11-21 01:14:45 UTC
Could you please grant QA_ACK?

Comment 17 errata-xmlrpc 2021-05-25 06:45:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098