Bug 1285337

Summary: [PPC64LE] Guest freezes if qemu allocates smaller page table than requested
Product: Red Hat Enterprise Linux 7 Reporter: Jan Kurik <jkurik>
Component: qemu-kvm-rhev Assignee: David Gibson <dgibson>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 7.2 CC: bugs, dgibson, gklein, hannsj_uhl, jen, jkurik, knoel, mazhang, michal.skrivanek, michen, mrezanin, mshira, qzhang, shuyu, snagar, virt-maint, xuhan, xuma, zhengtli
Target Milestone: rc Keywords: Automation, ZStream
Target Release: ---   
Hardware: ppc64le   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.3.0-31.el7_2.4 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1282833 Environment:
Last Closed: 2015-12-16 22:50:11 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1282833    
Bug Blocks: 1201513    

Description Jan Kurik 2015-11-25 12:24:28 UTC
This bug has been copied from bug #1282833 and has been proposed
to be backported to 7.2 z-stream (EUS).

Comment 3 Jeff Nelson 2015-11-30 22:01:12 UTC
Fix included in qemu-kvm-rhev-2.3.0-31.el7_2_2.4

Comment 4 Jeff Nelson 2015-11-30 22:04:23 UTC
Correcting NVR:

Fix included in qemu-kvm-rhev-2.3.0-31.el7_2.4

Comment 6 Qunfang Zhang 2015-12-02 07:10:13 UTC
I can reproduce this bug on the older builds qemu-kvm-rhev-2.3.0-31.el7.ppc64le and qemu-kvm-rhev-2.3.0-31.el7_2.3.ppc64le.

1. On host A:

kernel-3.10.0-327.el7.ppc64le
qemu-kvm-rhev-2.3.0-31.el7.ppc64le (The host is in use by others, so I did not update the packages)

Host: 256G mem

Test scenarios:

Boot a VM1 first and it boots up successfully. Then start booting VM2 and check whether VM2 succeeds.

VM1: -m 60G,slots=16,maxmem=1024G 

1) VM2: -m 60G,slots=16,maxmem=1024G ==> Reproduced
2) VM2: -m 4G,slots=4,maxmem=1024G ==> Pass
3) VM2: -m 16G,slots=4,maxmem=1024G ==> Pass
4) VM2: -m 32G,slots=4,maxmem=1024G ==> Pass
5) VM1: -m 60G, VM2: -m 60G ==> Pass


2. On host B:

kernel-3.10.0-327.2.1.el7.ppc64le
qemu-kvm-rhev-2.3.0-31.el7_2.3.ppc64le

Host: 128G mem

(1) Only VM1: -m 60G,slots=16,maxmem=1024G ==> Reproduced
(2) Only VM1: -m 60G,slots=16,maxmem=512G ==> Pass
(3) VM1: -m 32G,slots=4,maxmem=1024G
    VM2: -m 32G,slots=4,maxmem=1024G ==> Pass
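
To see why maxmem matters more than the -m size in the results above, here is a rough sketch of how the requested hash page table (HTAB) size could scale with maxmem. The sizing rule used here (HTAB is about 1/128 of maxmem, rounded up to a power of two, with a 2**18-byte minimum) is an assumption for illustration and is not stated anywhere in this bug. The function names are hypothetical.

```python
# Sketch only: the 1/128 sizing rule and the shift bounds below are
# assumptions for illustration, not values taken from this bug report.
MIN_HTAB_SHIFT = 18   # assumed minimum table size: 2**18 bytes
MAX_HTAB_SHIFT = 46   # assumed upper bound on the shift

GIB = 1 << 30


def requested_htab_shift(maxmem_bytes):
    """Return log2 of the HTAB size that would be requested for maxmem."""
    shift = MIN_HTAB_SHIFT
    while shift <= MAX_HTAB_SHIFT:
        # Stop at the first power of two that covers maxmem / 128.
        if (1 << (shift + 7)) >= maxmem_bytes:
            break
        shift += 1
    return shift


def requested_htab_bytes(maxmem_bytes):
    """Return the requested HTAB size in bytes."""
    return 1 << requested_htab_shift(maxmem_bytes)


if __name__ == "__main__":
    for maxmem_gib in (60, 512, 1024):
        size = requested_htab_bytes(maxmem_gib * GIB)
        print(f"maxmem={maxmem_gib}G -> HTAB {size / GIB:g} GiB")
```

Under that assumption, maxmem=1024G asks for an HTAB twice as large as maxmem=512G, and far larger than a plain -m 60G guest would need. That would be consistent with the 1024G cases being the ones that reproduce the hang.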

Comment 7 Qunfang Zhang 2015-12-02 07:11:02 UTC
 /usr/libexec/qemu-kvm -name test -machine pseries,accel=kvm,usb=off -m 60G,slots=16,maxmem=1024G -smp 1,sockets=1,cores=1,threads=1 -uuid 1212b7e2-f341-4f8c-80e8-59e2968d85c2 -realtime mlock=off -nodefaults -serial stdio -rtc base=utc -device spapr-vscsi,id=scsi0,reg=0x1000 -drive file=rhel7.2-virtio_blk-le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device virtio-blk-pci,bus=pci.0,drive=drive-scsi0-0-0-0,bootindex=1,id=scsi0-0-0-0  -drive if=none,id=drive-scsi0-0-1-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,drive=drive-scsi0-0-1-0,bootindex=2,id=scsi0-0-1-0 -msg timestamp=on -usb -device usb-tablet,id=tablet1  -qmp tcp:0:6666,server,nowait -netdev tap,id=hostnet1,script=/etc/qemu-ifup,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:54:5a:5f:5b:5b
VNC server running on `::1:5900'


SLOF **********************************************************************
QEMU Starting
 Build Date = Sep 18 2015 06:25:39
 FW Version = mockbuild@ release 20150313
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/v-scsi@1000
       SCSI: Looking for devices
          8000000000000000 CD-ROM   : "QEMU     QEMU CD-ROM      2.3."
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
                     00 1000 (D) : 1af4 1000    virtio [ net ]
                     00 0800 (D) : 1af4 1001    virtio [ block ]
                     00 0000 (D) : 106b 003f    serial bus [ usb-ohci ]
No NVRAM common partition, re-initializing...
Scanning USB 
  OHCI: initializing
Using default console: /vdevice/vty@71000000
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@800000020000000/scsi@1 ...   Successfully loaded





      Red Hat Enterprise Linux Server (3.10.0-326.el7.ppc64le) 7.2 (Maipo)      
      Red Hat Enterprise Linux Server (3.10.0-316.el7.ppc64le) 7.2 (Maipo)     
      Red Hat Enterprise Linux Server (0-rescue-2335bae863a843baa1bfb8503ab94e>

      Use the ^ and v keys to change the selection.                       
      Press 'e' to edit the selected item, or 'c' for a command prompt.   
   The selected entry will be started automatically in 0s.                     



OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 3.10.0-326.el7.ppc64le (mockbuild.eng.bos.redhat.com) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Fri Oct 23 11:14:00 EDT 2015
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/vmlinuz-3.10.0-326.el7.ppc64le root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet LANG=en_US.UTF-8
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000005500000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000200000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000200000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000005510000 -> 0x0000000005510a5a
Device tree struct  0x0000000005520000 -> 0x0000000005560000
Calling quiesce...
returning from prom_init

^^^^ Guest stuck here

Comment 8 Qunfang Zhang 2015-12-02 07:13:20 UTC
Verified this bug on qemu-kvm-rhev-2.3.0-31.el7_2.4.ppc64le with the same steps and qemu command line. The guest no longer hangs.

Instead, when maxmem is too large, the guest does not start and qemu exits with an error message:


# /usr/libexec/qemu-kvm -name test -machine pseries,accel=kvm,usb=off -m 60G,slots=16,maxmem=1024G -smp 1,sockets=1,cores=1,threads=1 -uuid 1212b7e2-f341-4f8c-80e8-59e2968d85c2 -realtime mlock=off -nodefaults -serial stdio -rtc base=utc -device spapr-vscsi,id=scsi0,reg=0x1000 -drive file=rhel7.2-virtio_blk-le-new.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device virtio-blk-pci,bus=pci.0,drive=drive-scsi0-0-0-0,bootindex=1,id=scsi0-0-0-0  -drive if=none,id=drive-scsi0-0-1-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,drive=drive-scsi0-0-1-0,bootindex=2,id=scsi0-0-1-0 -msg timestamp=on -usb -device usb-tablet,id=tablet1  -qmp tcp:0:6886,server,nowait -netdev tap,id=hostnet1,script=/etc/qemu-ifup,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:54:5a:5f:5b:51
2015-12-02T07:03:44.767491Z qemu-kvm: Failed to allocate HTAB of requested size, try with smaller maxmem
Aborted (core dumped)

So, this bug is fixed.
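
The fixed behaviour verified above can be sketched roughly as follows: compare the size of the table that was actually allocated with the size that was requested, and refuse to start the guest on a mismatch instead of letting it boot and hang. This is only an illustration of that check, not the actual qemu code. The function and exception names are hypothetical; only the error text is taken from the log above.

```python
class HtabAllocationError(RuntimeError):
    """Raised when the allocated HTAB does not match the requested size."""


def check_htab_allocation(requested_shift, allocated_shift):
    """Return the shift if the allocation matches the request, else fail.

    Both arguments are log2 of the table size in bytes. The names are
    hypothetical; only the error message comes from the bug report.
    """
    if allocated_shift != requested_shift:
        # Fail early rather than booting a guest that later freezes.
        raise HtabAllocationError(
            "Failed to allocate HTAB of requested size, try with smaller maxmem"
        )
    return allocated_shift


if __name__ == "__main__":
    # Illustrative values: a smaller table than requested is rejected.
    try:
        check_htab_allocation(requested_shift=33, allocated_shift=24)
    except HtabAllocationError as err:
        print(f"qemu-kvm: {err}")
```

Failing at start-up makes the problem visible to the user right away and points at maxmem as the setting to reduce.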

Comment 10 Qunfang Zhang 2015-12-03 02:50:29 UTC
Setting to VERIFIED according to comment 6, comment 7 and comment 8.

Comment 12 errata-xmlrpc 2015-12-16 22:50:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2663.html