Bug 1357808

Summary: TCG defaults to POWER7 cpu which won't run modern distributions
Product: Red Hat Enterprise Linux 7
Reporter: Andrea Bolognani <abologna>
Component: qemu-kvm-rhev
Assignee: Thomas Huth <thuth>
Status: CLOSED ERRATA
QA Contact: Xu Han <xuhan>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.4
CC: dgibson, dzheng, gsun, knoel, mrezanin, qzhang, rjones, thuth, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.8.0-2.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 23:32:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1390734, 1435528
Bug Blocks:
Attachments:
  Full boot log of a RHEL 7.3 guest with TCG (Flags: none)

Description Andrea Bolognani 2016-07-19 08:41:02 UTC
Created attachment 1181501
Full boot log of a RHEL 7.3 guest with TCG

QEMU can't boot a RHEL 7.3 guest in TCG mode.

To test this, just take any RHEL 7.3 libvirt guest and change

  <domain type='kvm'>

to

  <domain type='qemu'>

in its configuration.
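
For a scripted version of that edit, here is a minimal sketch as a shell filter. The helper name `to_tcg` and the guest name `rhel73` in the usage comment are hypothetical; the filter assumes the XML comes from `virsh dumpxml` and only rewrites the `type` attribute on the opening `<domain>` tag.

```shell
# Hypothetical helper: read domain XML on stdin and switch the domain
# type from 'kvm' to 'qemu' (TCG). Accepts single or double quotes.
to_tcg() {
    sed -E "s/<domain type=(['\"])kvm(['\"])/<domain type=\1qemu\2/"
}

# Usage (guest name "rhel73" is a placeholder):
#   virsh dumpxml --inactive rhel73 | to_tcg > rhel73-tcg.xml
#   virsh define rhel73-tcg.xml
```

Editing the XML with sed is crude; it is only meant for a quick reproduction, not for managing real guest configurations.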

Another quick way to test this is to run

  $ export LIBGUESTFS_BACKEND_SETTINGS=force_tcg
  $ libguestfs-test-tool

on a RHEL 7.3 host, as the libguestfs appliance will be built
by copying files from the host system.

This doesn't happen for RHEL 7.2 guests, but it does for RHEL
7.3 guests running a 7.2 kernel, so it looks like TCG doesn't
play well with the RHEL 7.3 userspace. The same also happens
with Fedora 24.

[    4.045659] Freeing unused kernel memory: 4288K (c000000000c70000 - c0000000010a0000)
[    4.105778] init[1]: unhandled signal 4 at 00003fff854b3e60 nip 00003fff854b3e60 lr 00003fff85497910 code 30001
[    4.116966] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[    4.116966] 
[    4.117463] CPU: 0 PID: 1 Comm: init Not tainted 3.10.0-456.el7.ppc64le #1
[    4.117651] Call Trace:
[    4.118035] [c00000004eae3990] [c0000000000178b0] show_stack+0x80/0x330 (unreliable)
[    4.118182] [c00000004eae3a40] [c0000000009a36a4] dump_stack+0x30/0x44
[    4.118227] [c00000004eae3a60] [c0000000009997e4] panic+0x130/0x2a4
[    4.118265] [c00000004eae3af0] [c0000000000e00e4] do_exit+0xb64/0xb70
[    4.118302] [c00000004eae3be0] [c0000000000e01ac] do_group_exit+0x5c/0x100
[    4.118340] [c00000004eae3c20] [c0000000000f9100] get_signal_to_deliver+0x210/0x9b0
[    4.118381] [c00000004eae3d10] [c0000000000183dc] do_signal+0x5c/0x320
[    4.118418] [c00000004eae3e00] [c0000000000187fc] do_notify_resume+0x8c/0x100
[    4.118536] [c00000004eae3e30] [c00000000000a7b0] ret_from_except_lite+0x5c/0x60
[    4.125139] Rebooting in 10 seconds..

I'm attaching the full boot log.

kernel-3.10.0-469.el7.ppc64le
qemu-kvm-rhev-2.6.0-13.el7.ppc64le
libvirt-daemon-2.0.0-2.el7.ppc64le

Comment 2 Thomas Huth 2016-07-19 10:52:41 UTC
Blind guess: You've started your guest with a POWER7 CPU (which is the default for ppc TCG mode). Since 7.3 is compiled for POWER8 only, some instructions cause an illegal opcode exception and thus the guest crashes. Please try to start your guest with POWER8 to see whether this fixes this issue.
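
The boot log is consistent with this guess: signal 4 on Linux is SIGILL, which is what init receives before the panic. A rough sketch for spotting such lines in a saved console log follows; `find_sigill` is a hypothetical helper, and it assumes the kernel message format shown in the description.

```shell
# Hypothetical helper: print kernel log lines that report an unhandled
# SIGILL (signal 4). Reads the named files, or stdin if none are given.
find_sigill() {
    grep -E 'unhandled signal 4 at [0-9a-f]+ nip [0-9a-f]+' "$@"
}

# Usage (the log file name is a placeholder):
#   find_sigill boot.log
```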

Comment 3 Andrea Bolognani 2016-07-19 11:48:43 UTC
(In reply to Thomas Huth from comment #2)
> Blind guess: You've started your guest with a POWER7 CPU (which is the
> default for ppc TCG mode). Since 7.3 is compiled for POWER8 only, some
> instructions cause an illegal opcode exception and thus the guest crashes.
> Please try to start your guest with POWER8 to see whether this fixes this
> issue.

You guessed right! :)

If I add

  <cpu mode='custom' match='exact'>
    <model fallback='allow'>POWER8</model>
  </cpu>

to the guest configuration, it can boot fine. Otherwise, I get

  /device[1] (POWER7_v2.3-powerpc64-cpu)

in the QOM tree and the machine doesn't boot.
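
As a sketch of that workaround, the filter below appends the same `<cpu>` element just before the closing `</domain>` tag. The helper name `add_power8_cpu` is hypothetical, and the filter assumes the input has no `<cpu>` element yet and that `</domain>` sits on its own line.

```shell
# Hypothetical helper: read domain XML on stdin and insert a POWER8
# <cpu> element immediately before the closing </domain> tag.
# Assumes the input does not already contain a <cpu> element.
add_power8_cpu() {
    awk '
        /<\/domain>/ {
            print "  <cpu mode=\"custom\" match=\"exact\">"
            print "    <model fallback=\"allow\">POWER8</model>"
            print "  </cpu>"
        }
        { print }
    '
}

# Usage (file names are placeholders):
#   add_power8_cpu < guest.xml > guest-power8.xml
```

When running QEMU directly rather than through libvirt, passing `-cpu POWER8` on the command line should have the same effect.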

Do you think it would be a good idea to change the default
from POWER7_v2.3 to POWER8?

If not, libguestfs (and potentially other higher-level tools)
will have to cope with this somehow.

Comment 4 Richard W.M. Jones 2016-07-19 12:43:02 UTC
Just a note about libguestfs behaviour:

libguestfs, in "force_tcg" mode, passes the libvirt XML below which is translated
to the qemu command line at bottom.  Libguestfs does not specify a <cpu> model at
all and no -cpu parameter is passed by libvirt.

Libguestfs working normally (when TCG is not being forced) will pass <cpu>host</cpu>
to libvirt.

libvirt-daemon-2.0.0-2.el7.ppc64le

<?xml version="1.0"?>
<domain type="qemu" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">
  <name>guestfs-l9zm28pmr5cpsnyq</name>
  <memory unit="MiB">768</memory>
  <currentMemory unit="MiB">768</currentMemory>
  <vcpu>1</vcpu>
  <clock offset="utc">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
  </clock>
  <os>
    <type machine="pseries">hvm</type>
    <kernel>/var/tmp/.guestfs-1000/appliance.d/kernel</kernel>
    <initrd>/var/tmp/.guestfs-1000/appliance.d/initrd</initrd>
    <cmdline>panic=1 console=hvc0 console=ttyS0 udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color</cmdline>
  </os>
  <on_reboot>destroy</on_reboot>
  <devices>
    <rng model="virtio">
      <backend model="random">/dev/urandom</backend>
    </rng>
    <controller type="scsi" index="0" model="virtio-scsi"/>
    <disk device="disk" type="file">
      <source file="/tmp/libguestfs7DM1oN/scratch.1"/>
      <target dev="sda" bus="scsi"/>
      <driver name="qemu" type="raw" cache="unsafe"/>
      <address type="drive" controller="0" bus="0" target="0" unit="0"/>
    </disk>
    <disk type="file" device="disk">
      <source file="/tmp/libguestfs7DM1oN/overlay2"/>
      <target dev="sdb" bus="scsi"/>
      <driver name="qemu" type="qcow2" cache="unsafe"/>
      <address type="drive" controller="0" bus="0" target="1" unit="0"/>
      <shareable/>
    </disk>
    <serial type="unix">
      <source mode="connect" path="/tmp/libguestfs7DM1oN/console.sock"/>
      <target port="0"/>
    </serial>
    <channel type="unix">
      <source mode="connect" path="/tmp/libguestfs7DM1oN/guestfsd.sock"/>
      <target type="virtio" name="org.libguestfs.channel.0"/>
    </channel>
  </devices>
  <qemu:commandline>
    <qemu:env name="TMPDIR" value="/var/tmp"/>
  </qemu:commandline>
</domain>

LC_ALL=C PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/rjones/.local/bin:/home/rjones/bin HOME=/home/rjones USER=rjones LOGNAME=rjones QEMU_AUDIO_DRV=none TMPDIR=/var/tmp /usr/libexec/qemu-kvm \
  -name guest=guestfs-l9zm28pmr5cpsnyq,debug-threads=on \
  -S \
  -object secret,id=masterKey0,format=raw,file=/home/rjones/.config/libvirt/qemu/lib/domain-2-guestfs-l9zm28pmr5cp/master-key.aes \
  -machine pseries-rhel7.3.0,accel=tcg,usb=off \
  -m 768 \
  -realtime mlock=off \
  -smp 1,sockets=1,cores=1,threads=1 \
  -uuid 576c7977-690c-421e-900d-8b377d596061 \
  -nographic \
  -no-user-config \
  -nodefaults \
  -chardev socket,id=charmonitor,path=/home/rjones/.config/libvirt/qemu/lib/domain-2-guestfs-l9zm28pmr5cp/monitor.sock,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=utc,driftfix=slew \
  -no-reboot \
  -boot strict=on \
  -kernel /var/tmp/.guestfs-1000/appliance.d/kernel \
  -initrd /var/tmp/.guestfs-1000/appliance.d/initrd \
  -append 'panic=1 console=hvc0 console=ttyS0 udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm-256color' \
  -device pci-ohci,id=usb,bus=pci.0,addr=0x2 \
  -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x1 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 \
  -drive file=/tmp/libguestfs7DM1oN/scratch.1,format=raw,if=none,id=drive-scsi0-0-0-0,cache=unsafe \
  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
  -drive file=/tmp/libguestfs7DM1oN/overlay2,format=qcow2,if=none,id=drive-scsi0-0-1-0,cache=unsafe \
  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0 \
  -chardev socket,id=charserial0,path=/tmp/libguestfs7DM1oN/console.sock \
  -device spapr-vty,chardev=charserial0,reg=0x30000000 \
  -chardev socket,id=charchannel0,path=/tmp/libguestfs7DM1oN/guestfsd.sock \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.libguestfs.channel.0 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
  -object rng-random,id=objrng0,filename=/dev/urandom \
  -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x5 \
  -msg timestamp=on

Comment 9 Thomas Huth 2016-10-05 14:31:13 UTC
Suggested a patch upstream here:
http://marc.info/?i=1475653491-8611-1-git-send-email-thuth@redhat.com

Comment 10 David Gibson 2016-11-27 22:55:28 UTC
Right, the default has already changed to POWER8 in current upstream master, and so will be POWER8 in qemu-2.8.  Thus it will also be POWER8 for RHEL7.4.

Is that going to be sufficient, or do we need to look at a 7.2.z / 7.3.z fix for the libguestfs case?

Comment 11 Richard W.M. Jones 2016-11-28 08:38:26 UTC
(In reply to David Gibson from comment #10)
> Right, the default has already changed to POWER8 in current upstream master,
> and so will be POWER8 in qemu-2.8.  Thus it will also be POWER8 for RHEL7.4.

I think switching the default CPU to POWER8 in RHEL 7.4 should be fine.

Comment 12 David Gibson 2016-11-30 04:11:48 UTC
Ok, in that case we should get the fix with the 7.4 rebase.

Comment 15 David Gibson 2017-01-06 00:47:31 UTC
Laurent has now posted a fix for bug 1390734 which should also address this bug.

Comment 17 Xu Han 2017-02-27 10:44:14 UTC
Verified this bug with qemu-kvm-rhev-2.8.0-5.el7.

Results:
# /usr/libexec/qemu-kvm -nodefaults -monitor stdio -machine pseries-rhel7.2.0,accel=tcg
(qemu) info qom-tree 
/machine (pseries-rhel7.2.0-machine)
...
    /device[1] (POWER7_v2.3-powerpc64-cpu)


# /usr/libexec/qemu-kvm -nodefaults -monitor stdio -machine pseries-rhel7.3.0,accel=tcg
(qemu) info qom-tree 
/machine (pseries-rhel7.3.0-machine)
...
    /device[1] (POWER7_v2.3-spapr-cpu-core)
      /thread[0] (POWER7_v2.3-powerpc64-cpu)


# /usr/libexec/qemu-kvm -nodefaults -monitor stdio -machine pseries-rhel7.4.0,accel=tcg
(qemu) info qom-tree 
/machine (pseries-rhel7.4.0-machine)
...
    /device[1] (POWER8_v2.0-spapr-cpu-core)
      /thread[0] (POWER8_v2.0-powerpc64-cpu)


So the bug has been fixed.
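
For repeating this check across machine types, the CPU model can be pulled out of saved `info qom-tree` output with a small filter. This is only a sketch: `cpu_models` is a hypothetical helper, and it assumes the `(<model>-powerpc64-cpu)` type naming shown in the results above.

```shell
# Hypothetical helper: read `info qom-tree` output on stdin and print the
# model part of every object whose type ends in -powerpc64-cpu.
cpu_models() {
    sed -n -E 's/.*\(([^()]*)-powerpc64-cpu\).*/\1/p'
}

# Usage (the file name is a placeholder for saved monitor output):
#   cpu_models < qom-tree-rhel7.4.0.txt
```

The `-spapr-cpu-core` container objects are skipped on purpose, so each CPU thread is reported once.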

Comment 19 errata-xmlrpc 2017-08-01 23:32:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
