Bug 1375783

Summary: [ppc64] vm config with hotpluggable vcpus gets broken after libvirtd restart
Product: Red Hat Enterprise Linux 7
Reporter: Peter Krempa <pkrempa>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.3
CC: dyuan, dzheng, gsun, lhuang, rbalakri, tlavigne
Target Milestone: rc
Target Release: ---
Hardware: ppc64
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-2.0.0-9.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 18:54:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Peter Krempa 2016-09-14 04:04:59 UTC
Description of problem:
The VM configuration (see below) gets broken on the first libvirtd restart, and the VM vanishes on the second libvirtd restart. Only the subthreads of a core are affected, and only on platforms which have core-level hotplug granularity.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-8.el7

How reproducible:


Steps to Reproduce:
1. start a VM with hotpluggable vcpus:
  <vcpu placement='static'>24</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='2' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='3' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='4' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='5' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='6' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='7' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='8' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='9' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='10' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='11' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='12' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='13' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='14' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='15' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='16' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='17' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='18' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='19' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='20' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='21' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='22' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='23' enabled='yes' hotpluggable='yes' order='3'/>
  </vcpus>

2. restart libvirtd and dump configuration:
  <vcpu placement='static' current='3'>24</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='no' hotpluggable='no' order='1'/>
    <vcpu id='2' enabled='no' hotpluggable='no' order='1'/>
    <vcpu id='3' enabled='no' hotpluggable='no' order='1'/>
    <vcpu id='4' enabled='no' hotpluggable='no' order='1'/>
    <vcpu id='5' enabled='no' hotpluggable='no' order='1'/>
    <vcpu id='6' enabled='no' hotpluggable='no' order='1'/>
    <vcpu id='7' enabled='no' hotpluggable='no' order='1'/>
    <vcpu id='8' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='9' enabled='no' hotpluggable='yes' order='2'/>
    <vcpu id='10' enabled='no' hotpluggable='yes' order='2'/>
    <vcpu id='11' enabled='no' hotpluggable='yes' order='2'/>
    <vcpu id='12' enabled='no' hotpluggable='yes' order='2'/>
    <vcpu id='13' enabled='no' hotpluggable='yes' order='2'/>
    <vcpu id='14' enabled='no' hotpluggable='yes' order='2'/>
    <vcpu id='15' enabled='no' hotpluggable='yes' order='2'/>
    <vcpu id='16' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='17' enabled='no' hotpluggable='yes' order='3'/>
    <vcpu id='18' enabled='no' hotpluggable='yes' order='3'/>
    <vcpu id='19' enabled='no' hotpluggable='yes' order='3'/>
    <vcpu id='20' enabled='no' hotpluggable='yes' order='3'/>
    <vcpu id='21' enabled='no' hotpluggable='yes' order='3'/>
    <vcpu id='22' enabled='no' hotpluggable='yes' order='3'/>
    <vcpu id='23' enabled='no' hotpluggable='yes' order='3'/>
  </vcpus>

3. subsequent restarts of libvirtd result in the VM vanishing
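
The broken state in step 2 can be detected mechanically. The following is a minimal sketch, not part of libvirt: it assumes that vcpu ids are assigned core by core (so a vcpu belongs to core id // threads) and that all threads of one core must share the same 'enabled' value. The function name inconsistent_cores and the threads_per_core parameter are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def inconsistent_cores(vcpus_xml, threads_per_core):
    """Return indexes of cores whose threads do not share one 'enabled' state.

    Assumption: vcpu ids are laid out core by core, so the core index
    is id // threads_per_core.
    """
    root = ET.fromstring(vcpus_xml)
    states = {}  # core index -> set of 'enabled' values seen on its threads
    for vcpu in root.iter("vcpu"):
        core = int(vcpu.get("id")) // threads_per_core
        states.setdefault(core, set()).add(vcpu.get("enabled"))
    return sorted(core for core, seen in states.items() if len(seen) > 1)


# Rebuild the pattern from step 2: only the first thread of each core is enabled.
broken = "<vcpus>" + "".join(
    "<vcpu id='%d' enabled='%s'/>" % (i, "yes" if i % 8 == 0 else "no")
    for i in range(24)
) + "</vcpus>"

print(inconsistent_cores(broken, 8))  # [0, 1, 2]
```

With the configuration from step 1 (all 24 vcpus enabled) the function returns an empty list.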

Actual results:
See above.

Expected results:
Output after restart is identical to step 1.

Additional info:

Comment 2 Peter Krempa 2016-09-14 10:57:58 UTC
Fixed upstream:

commit 64bc75f75606d0cc48432729b4618e2eae96accc
Author: Peter Krempa <pkrempa>
Date:   Tue Sep 13 17:56:08 2016 +0200

    qemu: domain: Don't infer vcpu state
    
    Use the state information (online, hotpluggable) provided by the monitor
    code rather than trying to infer it. This fixes an issue where on
    architectures that require hotplug of multiple threads at once the
    sub-cores would get updated as offline on daemon restart thus creating
    an invalid configuration.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375783

commit 2a0e68be9185eecb9207ebdfdc4f8f0933f38bb3
Author: Peter Krempa <pkrempa>
Date:   Tue Sep 13 17:52:38 2016 +0200

    qemu: monitor: Add vcpu state information to monitor data
    
    Return whether a vcpu entry is hotpluggable or online so that upper
    layers don't have to infer the information from other data.
    
    Advantage is that this code can be tested by unit tests.

commit 66da0356cd62398b1e06de317458e8883cb32db6
Author: Peter Krempa <pkrempa>
Date:   Tue Sep 13 17:38:08 2016 +0200

    qemu: monitor: qemuMonitorGetCPUInfoHotplug: Add iterator 'anycpu'
    
    Add separate iterator for iterating all the entries

commit 03376b6da0c1e1774513048b6c992d9836600aac
Author: Peter Krempa <pkrempa>
Date:   Tue Sep 13 17:28:02 2016 +0200

    qemu: monitor: Use a more obvious iterator name
    
    The algorithm that matches data from query-cpus and
    query-hotpluggable-cpus is quite complex. Start using descriptive
    iterator names to avoid confusion.
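
The idea behind "qemu: domain: Don't infer vcpu state" can be illustrated with a toy model. This is not libvirt's code and does not claim to reproduce its actual inference logic. It assumes, purely for illustration, that only the first thread of each core carries its own device entry (has_own_device) while the monitor layer also reports an explicit online flag; both field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VcpuEntry:
    vcpu_id: int
    has_own_device: bool  # hypothetical: true only for the first thread of a core
    online: bool          # hypothetical: explicit state reported by the monitor layer


def make_entries(cores, threads):
    """Build entries for a guest where every thread is actually online."""
    return [VcpuEntry(i, i % threads == 0, True) for i in range(cores * threads)]


def inferred_state(entries):
    """Guess 'online' from indirect data; subthreads end up marked offline."""
    return [e.has_own_device for e in entries]


def reported_state(entries):
    """Trust the explicit state instead of guessing."""
    return [e.online for e in entries]


entries = make_entries(3, 8)  # topology from this report: 3 cores, 8 threads each
print(sum(inferred_state(entries)))  # 3
print(sum(reported_state(entries)))  # 24
```

In this model the inferred count of 3 matches the current='3' seen in the broken dump, while the explicitly reported state keeps all 24 vcpus online.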

Comment 5 Dan Zheng 2016-09-19 09:45:42 UTC
Reproduced with the package:
libvirt-2.0.0-8.el7.ppc64le


Retest packages:
libvirt-2.0.0-9.el7.ppc64le
qemu-kvm-rhev-2.6.0-25.el7.ppc64le
kernel-3.10.0-506.el7.ppc64le

1. start a VM with hotpluggable vcpus:
  <vcpu placement='static'>24</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='2' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='3' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='4' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='5' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='6' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='7' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='8' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='9' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='10' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='11' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='12' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='13' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='14' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='15' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='16' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='17' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='18' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='19' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='20' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='21' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='22' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='23' enabled='yes' hotpluggable='yes' order='3'/>
  </vcpus>
...
 <cpu mode='host-model'>
    <model fallback='allow'>power8</model>
    <topology sockets='3' cores='1' threads='8'/>
...

2. Restart libvirtd.
3. Dump the guest XML:
  <vcpu placement='static'>24</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='2' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='3' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='4' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='5' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='6' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='7' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='8' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='9' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='10' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='11' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='12' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='13' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='14' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='15' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='16' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='17' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='18' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='19' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='20' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='21' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='22' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='23' enabled='yes' hotpluggable='yes' order='3'/>
  </vcpus>
...
  <cpu mode='host-model'>
    <model fallback='allow'>power8</model>
    <topology sockets='3' cores='1' threads='8'/>

4. Restart libvirtd again; the dumped configuration stays the same.

PASS.

So mark it verified.
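
The comparison of the dumps before and after the restart could be automated along these lines. This is a sketch under assumptions: the inputs are complete domain XML documents (for example saved output of virsh dumpxml), and the helper names vcpu_states and changed_vcpus are invented for this example.

```python
import xml.etree.ElementTree as ET


def vcpu_states(domain_xml):
    """Map vcpu id -> (enabled, hotpluggable, order) for entries under <vcpus>."""
    root = ET.fromstring(domain_xml)
    return {
        int(v.get("id")): (v.get("enabled"), v.get("hotpluggable"), v.get("order"))
        for v in root.iter("vcpu")
        if v.get("id") is not None  # skips the top-level <vcpu placement=...> element
    }


def changed_vcpus(before_xml, after_xml):
    """Return the ids of vcpus whose state differs between the two dumps."""
    before, after = vcpu_states(before_xml), vcpu_states(after_xml)
    return sorted(i for i in before.keys() | after.keys() if before.get(i) != after.get(i))


# Tiny two-vcpu example instead of the full 24-vcpu dump.
before = (
    "<domain><vcpu placement='static'>2</vcpu><vcpus>"
    "<vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>"
    "<vcpu id='1' enabled='yes' hotpluggable='no' order='1'/>"
    "</vcpus></domain>"
)
after = before.replace("id='1' enabled='yes'", "id='1' enabled='no'")

print(changed_vcpus(before, after))  # [1]
```

An empty list means the vcpu section survived the restart unchanged, which is the PASS condition used above.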

Comment 7 errata-xmlrpc 2016-11-03 18:54:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html