Bug 670787 - Hot plug the 14th VF to guest causes guest shut down
Summary: Hot plug the 14th VF to guest causes guest shut down
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Alex Williamson
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 580954
Reported: 2011-01-19 12:08 UTC by zhanghaiyan
Modified: 2011-10-14 03:20 UTC (History)
CC List: 13 users

Fixed In Version: qemu-kvm-0.12.1.2-2.134.el6
Doc Type: Bug Fix
Doc Text:
Cause: Device assignment code consumes resources from a fixed pool for each memory ranges used by an assigned device. Consequence: When the resource pool is exhausted, the VM exits. Fix: Limit number of devices that may be assigned to a VM to avoid running out of resources. Limit set to 8 devices. Result: Adding assigned devices to a VM can no longer trigger an unexpected shutdown of the VM.
Clone Of:
Clones: 678368
Environment:
Last Closed: 2011-05-19 11:30:54 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links:
Red Hat Product Errata RHSA-2011:0534 (SHIPPED_LIVE): Important: qemu-kvm security, bug fix, and enhancement update. Last updated 2011-05-19 11:20:36 UTC

Description zhanghaiyan 2011-01-19 12:08:03 UTC
Description of problem:
Hot-plugging the 14th VF to a guest causes the guest to shut down, while the previous 13 VFs can be hot-plugged successfully.

Version-Release number of selected component (if applicable):
- 2.6.32-94.el6.x86_64
- libvirt-0.8.7-2.el6.x86_64
- qemu-kvm-0.12.1.2-2.129.el6.x86_64

How reproducible:
3/3

Steps to Reproduce:
1. # rmmod igb
2. # modprobe igb max_vfs=7
3. # virsh nodedev-list --tree
computer
 |
  +- net_lo_00_00_00_00_00_00
  +- pci_0000_00_00_0
  +- pci_0000_00_01_0
  |   |
  |   +- pci_0000_03_00_0
  |   |   |
  |   |   +- net_eth0_00_1b_21_39_8b_18
  |   |     
  |   +- pci_0000_03_00_1
  |   |   |
  |   |   +- net_eth1_00_1b_21_39_8b_19
  |   |     
  |   +- pci_0000_03_10_0
  |   |   |
  |   |   +- net_eth5_2a_cc_b2_a1_da_67
  |   |     
  |   +- pci_0000_03_10_1
  |   |   |
  |   |   +- net_eth7_a2_1f_b9_53_1e_04
  |   |     
  |   +- pci_0000_03_10_2
  |   |   |
  |   |   +- net_eth9_56_41_9c_f4_3c_15
  |   |     
  |   +- pci_0000_03_10_3
  |   |   |
  |   |   +- net_eth8_ea_56_5d_1f_de_a2
  |   |     
  |   +- pci_0000_03_10_4
  |   |   |
  |   |   +- net_eth10_5a_9c_c4_10_f5_a4
  |   |     
  |   +- pci_0000_03_10_5
  |   |   |
  |   |   +- net_eth11_2a_a9_85_86_db_72
  |   |     
  |   +- pci_0000_03_10_6
  |   |   |
  |   |   +- net_eth16_52_8f_73_02_5f_f9
  |   |     
  |   +- pci_0000_03_10_7
  |   |   |
  |   |   +- net_eth13_5a_c4_2d_cb_42_8d
  |   |     
  |   +- pci_0000_03_11_0
  |   |   |
  |   |   +- net_eth12_66_1c_e7_92_4a_4b
  |   |     
  |   +- pci_0000_03_11_1
  |   |   |
  |   |   +- net_eth15_ae_36_43_f0_e1_6d
  |   |     
  |   +- pci_0000_03_11_2
  |   |   |
  |   |   +- net_eth18_72_eb_5b_7d_be_93
  |   |     
  |   +- pci_0000_03_11_3
  |   |   |
  |   |   +- net_eth19_f2_12_af_86_39_00
  |   |     
  |   +- pci_0000_03_11_4
  |   |   |
  |   |   +- net_eth14_b2_b6_cd_e0_8c_f2
  |   |     
  |   +- pci_0000_03_11_5
  |       |
  |       +- net_eth17_b2_47_c6_22_8b_ec
4. Detach the 14 VFs from the host
# for i in {0..7}; do virsh nodedev-dettach pci_0000_03_10_$i; done
Device pci_0000_03_10_0 dettached

Device pci_0000_03_10_1 dettached

Device pci_0000_03_10_2 dettached

Device pci_0000_03_10_3 dettached

Device pci_0000_03_10_4 dettached

Device pci_0000_03_10_5 dettached

Device pci_0000_03_10_6 dettached

Device pci_0000_03_10_7 dettached

# for i in {0..5}; do virsh nodedev-dettach pci_0000_03_11_$i; done
Device pci_0000_03_11_0 dettached

Device pci_0000_03_11_1 dettached

Device pci_0000_03_11_2 dettached

Device pci_0000_03_11_3 dettached

Device pci_0000_03_11_4 dettached

Device pci_0000_03_11_5 dettached

5. Reset the 14 VFs on the host
6. Prepare a VF_*.xml file for each VF
# ls VF_*
VF_10_0.xml  VF_10_2.xml  VF_10_4.xml  VF_10_6.xml  VF_11_0.xml  VF_11_2.xml  VF_11_4.xml
VF_10_1.xml  VF_10_3.xml  VF_10_5.xml  VF_10_7.xml  VF_11_1.xml  VF_11_3.xml  VF_11_5.xml
# cat VF_10_0.xml
<hostdev mode='subsystem' type='pci'>
       <source>
            <address bus='3' slot='0x10' function='0'/>
       </source>
</hostdev>  
7. Hot-plug the first 13 VFs to the guest
# virsh start rhel6
Domain rhel6 started
# for i in {0..7}; do virsh attach-device rhel6 VF_10_$i.xml; done
Device attached successfully

Device attached successfully

Device attached successfully

Device attached successfully

Device attached successfully

Device attached successfully

Device attached successfully

Device attached successfully

# for i in {0..4}; do virsh attach-device rhel6 VF_11_$i.xml; done
Device attached successfully

Device attached successfully

Device attached successfully

Device attached successfully

Device attached successfully

8. Hot-plug the 14th VF to the guest
 # virsh attach-device rhel6 VF_11_5.xml
Device attached successfully
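The 14 VF_*.xml files prepared in step 6 all follow the same <hostdev> pattern as the example shown there, so they can be generated with a small loop. This is a sketch, not part of libvirt: the gen_vf_xml helper name is ours, and the bus/slot/function values are taken from the nodedev tree in step 3 (bus 3, slot 0x10 functions 0-7, slot 0x11 functions 0-5):

```shell
# Sketch: generate the VF_*.xml hostdev snippets used in step 6.
# gen_vf_xml is a hypothetical helper; addresses follow the nodedev tree above.
gen_vf_xml() {
    slot=$1
    func=$2
    cat > "VF_${slot}_${func}.xml" <<EOF
<hostdev mode='subsystem' type='pci'>
       <source>
            <address bus='3' slot='0x${slot}' function='${func}'/>
       </source>
</hostdev>
EOF
}

for f in 0 1 2 3 4 5 6 7; do gen_vf_xml 10 "$f"; done   # pci_0000_03_10_[0-7]
for f in 0 1 2 3 4 5; do gen_vf_xml 11 "$f"; done       # pci_0000_03_11_[0-5]
```

Each generated file can then be passed to virsh attach-device as in step 7.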

Actual results:
In step 7, all 13 VFs hot-plug to the guest successfully and obtain IP addresses in the guest.
In step 8, the guest shuts down.

Expected results:
In step 8, the 14th VF should hot-plug to the guest successfully and obtain an IP address, as the previous 13 VFs did.
The guest supports 32 PCI slots, and before the VFs were hot-plugged the existing PCI devices occupied only 5 slots, so it should be possible to hot-plug 27 more PCI devices. Indeed, I can hot-plug 27 virtual network devices to the guest successfully.

Additional info:

Comment 1 Daniel Berrangé 2011-01-19 12:13:01 UTC
Please provide the QEMU logfile /var/log/libvirt/qemu/$GUEST.log

You're not hitting the PCI slot limit, but there are other limits that might be hit. Hopefully QEMU gave some indication in the log.

Also, if possible, try to get a stack trace from QEMU itself when it (likely) crashes.

e.g., before step 8, do:

  # debuginfo-install qemu-kvm
  # gdb
  (gdb) attach ...PID of QEMU..

...then run step 8...

If GDB catches a crash, run 'thread apply all bt' in GDB.

Comment 3 zhanghaiyan 2011-01-19 12:19:44 UTC
# cat /var/log/libvirt/qemu/rhel6.log
2011-01-19 19:34:48.269: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -S -M rhel6.0.0 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name rhel6 -uuid 11b761ff-19f2-b83f-8de5-c3db28ed413c -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/rhel6.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=utc -boot c -drive file=/var/lib/libvirt/images/test.img,if=none,id=drive-ide0-0-0,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/var/lib/libvirt/images/fd.img,if=none,id=drive-fdc0-0-0,format=raw -global isa-fdc.driveA=drive-fdc0-0-0 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
19:34:48.275: 6201: debug : virCgroupNew:555 : New group /libvirt/qemu/rhel6
19:34:48.275: 6201: debug : virCgroupDetect:245 : Detected mount/mapping 0:cpu at /cgroup/cpu in
19:34:48.275: 6201: debug : virCgroupDetect:245 : Detected mount/mapping 1:cpuacct at /cgroup/cpuacct in
19:34:48.275: 6201: debug : virCgroupDetect:245 : Detected mount/mapping 2:cpuset at /cgroup/cpuset in
19:34:48.275: 6201: debug : virCgroupDetect:245 : Detected mount/mapping 3:memory at /cgroup/memory in
19:34:48.275: 6201: debug : virCgroupDetect:245 : Detected mount/mapping 4:devices at /cgroup/devices in
19:34:48.275: 6201: debug : virCgroupDetect:245 : Detected mount/mapping 5:freezer at /cgroup/freezer in
19:34:48.275: 6201: debug : virCgroupMakeGroup:497 : Make group /libvirt/qemu/rhel6
19:34:48.275: 6201: debug : virCgroupMakeGroup:509 : Make controller /cgroup/cpu/libvirt/qemu/rhel6/
19:34:48.275: 6201: debug : virCgroupMakeGroup:509 : Make controller /cgroup/cpuacct/libvirt/qemu/rhel6/
19:34:48.275: 6201: debug : virCgroupMakeGroup:509 : Make controller /cgroup/cpuset/libvirt/qemu/rhel6/
19:34:48.275: 6201: debug : virCgroupMakeGroup:509 : Make controller /cgroup/memory/libvirt/qemu/rhel6/
19:34:48.275: 6201: debug : virCgroupMakeGroup:509 : Make controller /cgroup/devices/libvirt/qemu/rhel6/
19:34:48.275: 6201: debug : virCgroupMakeGroup:509 : Make controller /cgroup/freezer/libvirt/qemu/rhel6/
19:34:48.276: 6201: debug : virCgroupSetValueStr:290 : Set value '/cgroup/cpu/libvirt/qemu/rhel6/tasks' to '6201'
19:34:48.281: 6201: debug : virCgroupSetValueStr:290 : Set value '/cgroup/cpuacct/libvirt/qemu/rhel6/tasks' to '6201'
19:34:48.288: 6201: debug : virCgroupSetValueStr:290 : Set value '/cgroup/cpuset/libvirt/qemu/rhel6/tasks' to '6201'
19:34:48.296: 6201: debug : virCgroupSetValueStr:290 : Set value '/cgroup/memory/libvirt/qemu/rhel6/tasks' to '6201'
19:34:48.304: 6201: debug : virCgroupSetValueStr:290 : Set value '/cgroup/devices/libvirt/qemu/rhel6/tasks' to '6201'
19:34:48.312: 6201: debug : virCgroupSetValueStr:290 : Set value '/cgroup/freezer/libvirt/qemu/rhel6/tasks' to '6201'
19:34:48.320: 6201: debug : qemudInitCpuAffinity:2327 : Setting CPU affinity
19:34:48.321: 6201: debug : qemuSecurityDACSetProcessLabel:547 : Dropping privileges of VM to 107:107
char device redirected to /dev/pts/1
create_userspace_phys_mem: Invalid argument
assigned_dev_iomem_map: Error: create new mapping failed
2011-01-19 19:39:15.292: shutting down

Comment 4 Daniel Berrangé 2011-01-19 12:32:54 UTC
Ok, no need for the GDB trace, this is a clear QEMU bug.

It fails to create a new mapping for the 14th device in:

static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
                                   pcibus_t e_phys, pcibus_t e_size, int type)

And just does this totally bogus error handling:

    if (ret != 0) {
	fprintf(stderr, "%s: Error: create new mapping failed\n", __func__);
	exit(1);
    }

It needs to report the error back to the monitor, not simply exit.

Comment 5 Alex Williamson 2011-01-19 19:25:54 UTC
kvm_register_phys_mem() fails because we exceed the limit of 32 memory slots.  Reporting the error back is not necessarily trivial either since at this point the device has already been assigned to the guest, and the guest is mapping the PCI BARs.  Perhaps we need to pre-map the BARs in the initfn so we can find the error there.  Trouble is the guest owns the address space at that point, so I'm not sure where the BAR should be assigned.
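The exhaustion described here can be sketched as a toy shell model (not QEMU code). The total of 32 matches the KVM memory-slot limit named above, but the base-VM usage (6 slots) and per-VF cost (2 MMIO BAR mappings each) are assumed numbers, chosen so that the 14th device is the one that empties the pool, matching the report:

```shell
# Toy model of a fixed pool of KVM memory slots (numbers partly assumed).
attach_devices() {
    TOTAL=32       # fixed pool size; real KVM limit mentioned in this comment
    used=6         # assumption: slots already used by the base VM (RAM, VGA, ...)
    per_dev=2      # assumption: MMIO BAR mappings consumed per assigned VF
    i=1
    while [ "$i" -le 14 ]; do
        if [ $((used + per_dev)) -gt "$TOTAL" ]; then
            # mirrors "create_userspace_phys_mem: Invalid argument" in the log
            echo "device $i: create new mapping failed (pool exhausted)"
            return 1
        fi
        used=$((used + per_dev))
        echo "device $i: attached ($used/$TOTAL slots used)"
        i=$((i + 1))
    done
}
attach_devices
```

With these assumed numbers, devices 1 through 13 attach and the 14th fails, which is exactly the observed 13-succeed/14th-kills-the-guest pattern.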

Comment 22 juzhang 2011-02-21 05:21:42 UTC
According to comment 18 and comment 21, this issue has been fixed.

Comment 23 Alex Williamson 2011-05-05 16:37:51 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:
Device assignment code consumes resources from a fixed pool for each memory ranges used by an assigned device.
Consequence:
When the resource pool is exhausted, the VM exits.
Fix:
Limit number of devices that may be assigned to a VM to avoid running out of resources.  Limit set to 8 devices.
Result:
Adding assigned devices to a VM can no longer trigger an unexpected shutdown of the VM.
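The fix described in the note can be illustrated with a toy guard, a sketch rather than the actual qemu-kvm patch: assignment is refused up front once the cap of 8 is reached, instead of letting the slot pool run dry mid-map and exiting the VM:

```shell
# Sketch of the mitigation: cap assigned devices rather than exhaust the pool.
MAX_ASSIGNED=8   # limit from the technical note above
assigned=0

try_assign() {
    if [ "$assigned" -ge "$MAX_ASSIGNED" ]; then
        # refuse gracefully; the guest keeps running
        echo "assignment refused: limit of $MAX_ASSIGNED assigned devices reached"
        return 1
    fi
    assigned=$((assigned + 1))
    echo "device $assigned assigned"
}
```

The design point is that a hard, checkable limit turns a fatal mid-operation failure into an ordinary error the caller can report back to the monitor.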

Comment 24 errata-xmlrpc 2011-05-19 11:30:54 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html


