Bug 1743477

Summary: Since bd94bc06479a "spapr: change default interrupt mode to 'dual'", QEMU resets the machine to select the appropriate interrupt controller. And -no-reboot prevents that.
Product: Red Hat Enterprise Linux Advanced Virtualization Reporter: Min Deng <mdeng>
Component: qemu-kvm Assignee: David Gibson <dgibson>
Status: CLOSED ERRATA QA Contact: Min Deng <mdeng>
Severity: medium Docs Contact:
Priority: high    
Version: 8.1 CC: bugproxy, dgibson, fnovak, hannsj_uhl, juzhang, knoel, lvivier, mdeng, ngu, ptoscano, qzhang, virt-maint
Target Milestone: rc Keywords: Regression
Target Release: 8.1   
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-4.1.0-8.module+el8.1.0+4199+446e40fc Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-11-06 07:19:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 910269, 1741619    
Bug Blocks: 1624641    

Description Min Deng 2019-08-20 05:23:31 UTC
Description of problem:
Since commit bd94bc06479a ("spapr: change default interrupt mode to 'dual'"), QEMU resets the machine during boot to select the appropriate interrupt controller. With -no-reboot, that reset makes QEMU exit instead, so the guest never finishes booting.

Version-Release number of selected component (if applicable):
qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.ppc64le

How reproducible:
always

Steps to Reproduce:

[root@ibm-p9wr-17 libguestfs-1.40.2]# make quickcheck LIBGUESTFS_BACKEND_SETTINGS=force_tcg      LIBGUESTFS_HV=/usr/libexec/qemu-kvm
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
./run test-tool/libguestfs-test-tool 
     ************************************************************
     *                    IMPORTANT NOTICE
     *
     * When reporting bugs, include the COMPLETE, UNEDITED
     * output below in your bug report.
     *
     ************************************************************
LD_LIBRARY_PATH=/home/libguestfs-1.40.2/ruby/ext/guestfs:/home/libguestfs-1.40.2/lib/.libs:/home/libguestfs-1.40.2/java/.libs:/home/libguestfs-1.40.2/gobject/.libs
LIBGUESTFS_BACKEND_SETTINGS=force_tcg
LIBGUESTFS_CACHEDIR=/home/libguestfs-1.40.2/tmp
LIBGUESTFS_HV=/usr/libexec/qemu-kvm
LIBGUESTFS_TMPDIR=/home/libguestfs-1.40.2/tmp
LIBGUESTFS_PATH=/home/libguestfs-1.40.2/appliance
PATH=/home/libguestfs-1.40.2/v2v:/home/libguestfs-1.40.2/tools:/home/libguestfs-1.40.2/test-tool:/home/libguestfs-1.40.2/sysprep:/home/libguestfs-1.40.2/sparsify:/home/libguestfs-1.40.2/resize:/home/libguestfs-1.40.2/rescue:/home/libguestfs-1.40.2/p2v:/home/libguestfs-1.40.2/make-fs:/home/libguestfs-1.40.2/inspector:/home/libguestfs-1.40.2/get-kernel:/home/libguestfs-1.40.2/fuse:/home/libguestfs-1.40.2/format:/home/libguestfs-1.40.2/fish:/home/libguestfs-1.40.2/erlang:/home/libguestfs-1.40.2/edit:/home/libguestfs-1.40.2/diff:/home/libguestfs-1.40.2/dib:/home/libguestfs-1.40.2/df:/home/libguestfs-1.40.2/customize:/home/libguestfs-1.40.2/cat:/home/libguestfs-1.40.2/builder:/home/libguestfs-1.40.2/align:/home/kar/workspace/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
XDG_RUNTIME_DIR=/run/user/0
SELinux: Enforcing
guestfs_get_append: (null)
guestfs_get_autosync: 1
guestfs_get_backend: direct
guestfs_get_backend_settings: [force_tcg]
guestfs_get_cachedir: /home/libguestfs-1.40.2/tmp
guestfs_get_hv: /usr/libexec/qemu-kvm
guestfs_get_memsize: 1024
guestfs_get_network: 0
guestfs_get_path: /home/libguestfs-1.40.2/appliance
guestfs_get_pgroup: 0
guestfs_get_program: libguestfs-test-tool
guestfs_get_recovery_proc: 1
guestfs_get_smp: 1
guestfs_get_sockdir: /tmp
guestfs_get_tmpdir: /home/libguestfs-1.40.2/tmp
guestfs_get_trace: 0
guestfs_get_verbose: 1
host_cpu: powerpc64le
Launching appliance, timeout set to 600 seconds.
libguestfs: launch: program=libguestfs-test-tool
libguestfs: launch: version=1.40.2
libguestfs: launch: backend registered: unix
libguestfs: launch: backend registered: uml
libguestfs: launch: backend registered: direct
libguestfs: launch: backend=direct
libguestfs: launch: tmpdir=/home/libguestfs-1.40.2/tmp/libguestfs5mOzvn
libguestfs: launch: umask=0022
libguestfs: launch: euid=0
libguestfs: begin building supermin appliance
libguestfs: run supermin
libguestfs: command: run: /usr/bin/supermin
libguestfs: command: run: \ --build
libguestfs: command: run: \ --verbose
libguestfs: command: run: \ --if-newer
libguestfs: command: run: \ --lock /home/libguestfs-1.40.2/tmp/.guestfs-0/lock
libguestfs: command: run: \ --copy-kernel
libguestfs: command: run: \ -f ext2
libguestfs: command: run: \ --host-cpu powerpc64le
libguestfs: command: run: \ /home/libguestfs-1.40.2/appliance/supermin.d
libguestfs: command: run: \ -o /home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d
supermin: version: 5.1.19
supermin: rpm: detected RPM version 4.14
supermin: package handler: fedora/rpm
supermin: acquiring lock on /home/libguestfs-1.40.2/tmp/.guestfs-0/lock
supermin: build: /home/libguestfs-1.40.2/appliance/supermin.d
supermin: reading the supermin appliance
supermin: build: visiting /home/libguestfs-1.40.2/appliance/supermin.d/base.tar.gz type gzip base image (tar)
supermin: build: visiting /home/libguestfs-1.40.2/appliance/supermin.d/daemon.tar.gz type gzip base image (tar)
supermin: build: visiting /home/libguestfs-1.40.2/appliance/supermin.d/excludefiles type uncompressed excludefiles
supermin: build: visiting /home/libguestfs-1.40.2/appliance/supermin.d/hostfiles type uncompressed hostfiles
supermin: build: visiting /home/libguestfs-1.40.2/appliance/supermin.d/init.tar.gz type gzip base image (tar)
supermin: build: visiting /home/libguestfs-1.40.2/appliance/supermin.d/packages type uncompressed packages
supermin: build: visiting /home/libguestfs-1.40.2/appliance/supermin.d/udev-rules.tar.gz type gzip base image (tar)
supermin: mapping package names to installed packages
supermin: resolving full list of package dependencies
supermin: build: 178 packages, including dependencies
supermin: build: 32589 files
supermin: build: 7104 files, after matching excludefiles
supermin: build: 7105 files, after adding hostfiles
supermin: build: 7091 files, after removing unreadable files
supermin: build: 7123 files, after munging
supermin: kernel: looking for kernel using environment variables ...
supermin: kernel: looking for kernels in /lib/modules/*/vmlinuz ...
supermin: kernel: picked vmlinuz /lib/modules/4.18.0-134.el8.ppc64le/vmlinuz
supermin: kernel: kernel_version 4.18.0-134.el8.ppc64le
supermin: kernel: modpath /lib/modules/4.18.0-134.el8.ppc64le
supermin: ext2: creating empty ext2 filesystem '/home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d.jh71vgfr/root'
supermin: ext2: populating from base image
supermin: ext2: copying files from host filesystem
supermin: ext2: copying kernel modules
supermin: ext2: creating minimal initrd '/home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d.jh71vgfr/initrd'
supermin: ext2: wrote 22 modules to minimal initrd
supermin: renaming /home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d.jh71vgfr to /home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d
libguestfs: finished building supermin appliance
libguestfs: begin testing qemu features
libguestfs: checking for previously cached test results of /usr/libexec/qemu-kvm, in /home/libguestfs-1.40.2/tmp/.guestfs-0
libguestfs: loading previously cached test results
libguestfs: qemu version: 4.0
libguestfs: qemu mandatory locking: yes
libguestfs: qemu KVM: enabled
libguestfs: finished testing qemu features
libguestfs: command: run: dmesg | grep -Eoh 'lpj=[[:digit:]]+'
libguestfs: read_lpj_from_dmesg: external command exited with error status 1
libguestfs: read_lpj_from_files: no boot messages files are readable
/usr/libexec/qemu-kvm \
    -global virtio-blk-pci.scsi=off \
    -no-user-config \
    -enable-fips \
    -nodefaults \
    -display none \
    -machine pseries,accel=tcg \
    -m 1024 \
    -no-reboot \
    -rtc driftfix=slew \
    -kernel /home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d/kernel \
    -initrd /home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d/initrd \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-pci,rng=rng0 \
    -device virtio-scsi-pci,id=scsi \
    -drive file=/home/libguestfs-1.40.2/tmp/libguestfs5mOzvn/scratch1.img,cache=unsafe,format=raw,id=hd0,if=none \
    -device scsi-hd,drive=hd0 \
    -drive file=/home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none,format=raw \
    -device scsi-hd,drive=appliance \
    -device virtio-serial-pci \
    -serial stdio \
    -chardev socket,path=/tmp/libguestfsM8yjvT/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -append "panic=1 console=hvc0 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color"
qemu-kvm: warning: TCG doesn't support requested feature, cap-cfpc=workaround
qemu-kvm: warning: TCG doesn't support requested feature, cap-sbbc=workaround
qemu-kvm: warning: TCG doesn't support requested feature, cap-ibs=workaround
qemu-kvm: warning: global mc146818rtc.lost_tick_policy has invalid class name


SLOF **********************************************************************
QEMU Starting
 Build Date = Jul 23 2019 04:40:55
 FW Version = mockbuild@ release 20190703
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
                     00 0000 (D) : 1af4 1005    unknown-legacy-device*
                     00 0800 (D) : 1af4 1004    virtio [ scsi ]
Populating /pci@800000020000000/scsi@1
       SCSI: Looking for devices
          100000000000000 DISK     : "QEMU     QEMU HARDDISK    2.5+"
          101000000000000 DISK     : "QEMU     QEMU HARDDISK    2.5+"
                     00 1000 (D) : 1af4 1003    virtio [ serial ]
No NVRAM common partition, re-initializing...
Scanning USB 
Using default console: /vdevice/vty@71000000
Detected RAM kernel at 400000 (1a7b860 bytes) 
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.18.0-134.el8.ppc64le (mockbuild.eng.bos.redhat.com) (gcc version 8.3.1 20190507 (Red Hat 8.3.1-4) (GCC)) #1 SMP Thu Aug 15 17:39:09 UTC 2019
Detected machine type: 0000000000000101
command line: panic=1 console=hvc0 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support...libguestfs: error: appliance closed the connection unexpectedly, see earlier error messages
libguestfs: child_cleanup: 0x3e420d30: child process died
libguestfs: sending SIGTERM to process 139946
libguestfs: qemu maxrss 153472K
libguestfs: error: guestfs_launch failed, see earlier error messages
libguestfs: closing guestfs handle 0x3e420d30 (state 0)
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /home/libguestfs-1.40.2/tmp/libguestfs5mOzvn
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /tmp/libguestfsM8yjvT
make: *** [Makefile:2946: quickcheck] Error 1


Actual results:
The test fails because of the "-no-reboot" parameter.
Since commit bd94bc06479a ("spapr: change default interrupt mode to 'dual'"), QEMU resets the machine to select the appropriate interrupt controller, and with -no-reboot QEMU exits instead of performing that reset.


Expected results:
The libguestfs test finishes successfully.

Additional info:

Also refer to bug 1726075

Comment 1 Min Deng 2019-08-20 05:28:39 UTC
It's not reproducible on build qemu-kvm-core-4.0.0-6.module+el8.1.0+3736+a2aefea3.ppc64le, so it should be a regression. Feel free to update if you have any concerns.

Comment 2 Laurent Vivier 2019-08-20 08:32:52 UTC
Simpler reproducer:

curl -O http://download-ipv4.eng.brq.redhat.com/rhel-8/rel-eng/RHEL-8/latest-RHEL-8.1.0/compose/BaseOS/ppc64le/os/ppc/ppc64/vmlinuz
/usr/libexec/qemu-kvm -M accel=tcg --nodefaults -serial mon:stdio -kernel ~/vmlinuz -nographic  -no-reboot
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-cfpc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-sbbc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ibs=workaround


SLOF **********************************************************************
QEMU Starting
 Build Date = Jul  3 2019 12:26:14
 FW Version = git-ba1ab360eebe6338
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
No NVRAM common partition, re-initializing...
Scanning USB 
Using default console: /vdevice/vty@71000000
Detected RAM kernel at 400000 (1a7b860 bytes) 
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.18.0-128.el8.ppc64le (mockbuild.eng.bos.redhat.com) (gcc version 8.3.1 20190507 (Red Hat 8.3.1-4) (GCC)) #1 SMP Fri Aug 2 14:52:33 UTC 2019
Detected machine type: 0000000000000101
command line: 
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support...[root@localhost ~]#

Comment 3 Laurent Vivier 2019-08-20 12:59:22 UTC
David,

do you know if upstream will be fixed soon to avoid the reset when ic-mode is dual?

Thanks

Comment 4 David Gibson 2019-08-21 06:44:17 UTC
The fact it breaks libguestfs has greatly raised the priority with which I'm considering this.

Unfortunately, figuring out how to fix it will be a bit tricky.

I'm flagging this for RHEL-AV-8.1, though.

Comment 5 David Gibson 2019-08-28 02:57:48 UTC
Ok, it turns out this is easier to fix than I anticipated.  I'd still like to remove CAS reboots entirely, because they cause other problems, but there's an easier fix for this specific problem in the meantime.

Seems s390 had a somewhat similar problem, and introduced a SHUTDOWN_CAUSE_SUBSYSTEM_RESET option that ignores -no-reboot which we can use for the CAS reboots as well.

I've put this fix into my ppc-for-4.2 tree which I expect to send a PR for shortly.
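
For readers following along, here is a minimal sketch of the idea described above. It is a simplified model, not the actual QEMU code or the patch itself: the enum and function names (reset_cause, outcome, handle_reset_request) are hypothetical and chosen only for illustration. It assumes that a reset request carries a cause, and that -no-reboot turns a reset into an exit for every cause except a subsystem reset.

```c
#include <stdbool.h>

/* Hypothetical names: a simplified model of the behaviour, not QEMU source. */
enum reset_cause {
    CAUSE_GUEST_RESET,     /* ordinary reboot requested by the guest */
    CAUSE_SUBSYSTEM_RESET  /* machine-internal reset, e.g. the CAS reboot */
};

enum outcome {
    OUTCOME_RESET,    /* machine is reset and keeps running */
    OUTCOME_SHUTDOWN  /* emulator exits instead of resetting */
};

/*
 * Decide what happens to a reset request.
 * With no_reboot set, every reset is converted into a shutdown,
 * except a subsystem reset, which is always allowed to proceed.
 */
enum outcome handle_reset_request(bool no_reboot, enum reset_cause cause)
{
    if (no_reboot && cause != CAUSE_SUBSYSTEM_RESET) {
        return OUTCOME_SHUTDOWN;
    }
    return OUTCOME_RESET;
}
```

In this model, a CAS reboot reported as CAUSE_GUEST_RESET with no_reboot set yields OUTCOME_SHUTDOWN, which corresponds to the failure in this bug. Reporting the same reboot as CAUSE_SUBSYSTEM_RESET yields OUTCOME_RESET, so the boot continues while a genuine guest reboot still causes an exit.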

Comment 7 Frank Novak 2019-08-28 13:05:50 UTC
(In reply to David Gibson from comment #5)
> Ok, it turns out this is easier to fix than I anticipated.  I'd still like
> to remove CAS reboots entirely, because they cause other problems, but
> there's an easier fix for this specific problem in the meantime.
> 
> Seems s390 had a somewhat similar problem, and introduced a
> SHUTDOWN_CAUSE_SUBSYSTEM_RESET option that ignores -no-reboot which we can
> use for the CAS reboots as well.
> 
> I've put this fix into my ppc-for-4.2 tree which I expect to send a PR for
> shortly.

Cool, thanks!

Comment 8 Min Deng 2019-08-29 08:17:06 UTC
(In reply to David Gibson from comment #5)
> Ok, it turns out this is easier to fix than I anticipated.  I'd still like
> to remove CAS reboots entirely, because they cause other problems, but
> there's an easier fix for this specific problem in the meantime.
> 
> Seems s390 had a somewhat similar problem, and introduced a
> SHUTDOWN_CAUSE_SUBSYSTEM_RESET option that ignores -no-reboot which we can
> use for the CAS reboots as well.


Per David, QE also tried to reproduce the problem on an s390x host with similar steps. It seems that the problem can't be reproduced on the build below, thanks.
Build information
qemu-kvm-4.1.0-5.module+el8.1.0+4076+b5e41ebc.s390x

supermin: rpm: detected RPM version 4.14
supermin: package handler: fedora/rpm
supermin: acquiring lock on /home/libguestfs-1.40.2/tmp/.guestfs-0/lock
supermin: if-newer: output does not need rebuilding
libguestfs: finished building supermin appliance
libguestfs: begin testing qemu features
libguestfs: checking for previously cached test results of /usr/libexec/qemu-kvm, in /home/libguestfs-1.40.2/tmp/.guestfs-0
libguestfs: loading previously cached test results
libguestfs: qemu version: 4.1
libguestfs: qemu mandatory locking: yes
libguestfs: qemu KVM: enabled
libguestfs: finished testing qemu features
libguestfs: command: run: dmesg | grep -Eoh 'lpj=[[:digit:]]+'
libguestfs: read_lpj_from_dmesg: external command exited with error status 1
libguestfs: read_lpj_from_files: no boot messages files are readable
/usr/libexec/qemu-kvm \
    -global virtio-blk-ccw.scsi=off \
    -no-user-config \
    -enable-fips \
    -nodefaults \
    -display none \
    -machine accel=tcg \
    -m 768 \
    -no-reboot \
    -rtc driftfix=slew \
    -kernel /home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d/kernel \
    -initrd /home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d/initrd \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-ccw,rng=rng0 \
    -device virtio-scsi-ccw,id=scsi \
    -drive file=/home/libguestfs-1.40.2/tmp/libguestfslpr3Dc/scratch1.img,cache=unsafe,format=raw,id=hd0,if=none \
    -device scsi-hd,drive=hd0 \
    -drive file=/home/libguestfs-1.40.2/tmp/.guestfs-0/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none,format=raw \
    -device scsi-hd,drive=appliance \
    -device virtio-serial-ccw \
    -chardev stdio,id=charconsole0 \
    -device sclpconsole,chardev=charconsole0 \
    -chardev socket,path=/tmp/libguestfso4zkCh/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -append "panic=1 console=ttysclp0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color"
qemu-kvm: warning: global mc146818rtc.lost_tick_policy has invalid class name

...

umount-all: /proc/mounts: fsname=/dev/sda1 dir=/sysroot type=ext2 opts=rw,relatime,block_validity,barrier,user_xattr,acl freq=0 passno=0
commandrvf: stdout=n stderr=y flags=0x0
commandrvf: umount /sysroot
commandrvf: stdout=n stderr=y flags=0x0
commandrvf: udevadm --debug settle -E /dev/sdb
calling: settle
commandrvf: stdout=n stderr=y flags=0x0
commandrvf: udevadm --debug settle -E /dev/sda
calling: settle
fsync /dev/sda
guestfsd: => internal_autosync (0x11a) took 0.39 secs
libguestfs: sending SIGTERM to process 20412
libguestfs: qemu maxrss 351128K
libguestfs: closing guestfs handle 0x277b6d30 (state 0)
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /home/libguestfs-1.40.2/tmp/libguestfslpr3Dc
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /tmp/libguestfso4zkCh
===== TEST FINISHED OK =====

Comment 9 David Gibson 2019-09-02 03:20:42 UTC
Upstream fix sent in pull request, just waiting for merge.

Comment 10 David Gibson 2019-09-06 01:19:43 UTC
Fix for this is now merged upstream, doing a brew build at:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=23395494

Comment 14 Min Deng 2019-09-10 06:27:56 UTC
Verified the bug on the following build
qemu-kvm-4.1.0-8.module+el8.1.0+4199+446e40fc.ppc64le
kernel-4.18.0-141.el8.ppc64le
For the steps, please refer to comment 0.

Actual results:
The test finishes.
Expected results:
The test finishes without errors.

The original issue has been fixed, thanks.

Comment 16 errata-xmlrpc 2019-11-06 07:19:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723