Bug 979988 - The error message should be improved in some pci assignment scenario with VFIO driver
Keywords:
Status: CLOSED DUPLICATE of bug 1001738
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Laine Stump
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-07-01 09:28 UTC by Xuesong Zhang
Modified: 2013-09-27 10:42 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-27 10:42:24 UTC
Target Upstream Version:
Embargoed:



Description Xuesong Zhang 2013-07-01 09:28:57 UTC
Description of problem:
Two PF devices are in the same IOMMU group and keep igb as their driver. A guest that is assigned one of the PFs with the VFIO driver fails to boot, as expected, but the error message is unclear and should be improved.

Version-Release number of selected component (if applicable):
libvirt-1.0.6-1.el7.x86_64
qemu-kvm-1.5.0-2.el7.x86_64
3.10.0-0.rc6.63.el7.x86_64

How reproducible:
100%

Steps:
1. Prepare an environment in which one iommu_group contains 2 network cards.
# lspci |grep 82576
03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
# readlink -f /sys/bus/pci/devices/0000\:03\:00.0/iommu_group/
/sys/kernel/iommu_groups/14
# ls /sys/kernel/iommu_groups/14/devices/
0000:03:00.0  0000:03:00.1

2. Add one network card to the guest with the vfio driver; note managed="yes".
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>

3. Without changing the driver of the other network card, start the guest. It fails to boot with an error.
# virsh start rhel7
error: Failed to start domain rhel7
error: Unable to read from monitor: Connection reset by peer

4. Edit the XML of this guest and change managed from "yes" to "no".
<hostdev mode='subsystem' type='pci' managed='no'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>

5. Start the guest again.
# virsh start rhel7
error: Failed to start domain rhel7
error: Unable to allow access for device path /dev/vfio/14: No such file or directory

6. Change the driver of the other PCI device (0000:03:00.1, which is not assigned to the guest).
# virsh nodedev-detach pci_0000_03_00_1 --driver vfio
Device pci_0000_03_00_1 detached
# readlink -f /sys/bus/pci/devices/0000\:03\:00.1/driver/
/sys/bus/pci/drivers/vfio-pci

7. Start the guest a third time.
# virsh start rhel7
error: Failed to start domain rhel7
error: Unable to read from monitor: Connection reset by peer


Actual results:
The error messages in step 3 and step 7 are not clear.

Expected results:
The error messages in step 3 and step 7 should be improved; see the error message in step 5 for reference.

Comment 2 Jiri Denemark 2013-07-02 20:22:44 UTC
The error messages in steps 3 and 7 say that QEMU started but crashed
afterwards. I'm not sure if it's feasible to provide more info from
the QEMU error output in such case.

Comment 3 Laine Stump 2013-07-03 04:41:11 UTC
This falls into the general category of libvirt needing to scrape the qemu log for error messages. Isn't there a BZ about that somewhere? I think Cole filed it...

The problem with libvirt detecting the failure condition (that there are devices in the same group that haven't been bound to vfio-pci or pci-stub) is that it encodes the same rules into libvirt that are already in the kernel, so 1) the same logic is in two places, and 2) if that logic changes in the kernel, then libvirt may still prevent some operations that the kernel actually allows (for example if the kernel gets a switch to loosen restrictions on grouped devices). (Here is the current logic in the kernel: it requires all other devices in the same group to be bound to vfio-pci, pci-stub, or pcieport, and to be either assigned to the same guest, or to no guest at all).
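For illustration only, the group rule described above can be sketched as a sysfs walk. This is a minimal Python sketch, not libvirt or kernel code: the function names and the `sysfs_root` parameter are hypothetical, the allowed-driver set is taken from the comment above, and treating an unbound device as non-blocking is an assumption. It does not check the "assigned to the same guest, or to no guest at all" part of the rule, and it would carry exactly the duplication risk described above if the kernel logic changed.

```python
import os

# Drivers that, per the comment above, may hold the other devices in a group.
ALLOWED_DRIVERS = {"vfio-pci", "pci-stub", "pcieport"}


def bound_driver(device, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None if unbound."""
    link = os.path.join(sysfs_root, "bus/pci/devices", device, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


def blocking_devices(device, sysfs_root="/sys"):
    """List (address, driver) pairs for the other members of the device's
    IOMMU group that are bound to a driver outside ALLOWED_DRIVERS."""
    group_dir = os.path.join(
        sysfs_root, "bus/pci/devices", device, "iommu_group", "devices"
    )
    blockers = []
    for member in sorted(os.listdir(group_dir)):
        if member == device:
            continue
        driver = bound_driver(member, sysfs_root)
        # Assumption: a device with no driver bound does not block assignment.
        if driver is not None and driver not in ALLOWED_DRIVERS:
            blockers.append((member, driver))
    return blockers
```

On the host from the description, `blocking_devices("0000:03:00.0")` in step 3 would be expected to report 0000:03:00.1 as still bound to igb, which is the kind of detail a clearer error message could include.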

Comment 4 Xuesong Zhang 2013-07-22 08:30:28 UTC
Adding the following information for reference.

All other steps are the same as in the description; only the libvirt command in steps 3 and 7 is replaced with a direct qemu command. The error message from qemu is clear and is included for reference. The command and its error output are below:

 /usr/libexec/qemu-kvm -name kvm-rhel7.0-x86_64-qcow2-virtio -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 668424dc-b029-48f2-8947-ab9c1dca2a1e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/kvm-rhel7.0-x86_64-qcow2-virtio.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/kvm-rhel6.4-x86_64-qcow2.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -device vfio-pci,host=03:00.0
char device redirected to /dev/pts/1 (label charserial0)
qemu-kvm: -device vfio-pci,host=03:00.0: vfio: error opening /dev/vfio/14: No such file or directory
qemu-kvm: -device vfio-pci,host=03:00.0: vfio: failed to get group 14
qemu-kvm: -device vfio-pci,host=03:00.0: Device initialization failed.
qemu-kvm: -device vfio-pci,host=03:00.0: Device 'vfio-pci' could not be initialized
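
As a rough illustration of the log-scraping idea raised in Comment 3, the lines that matter in the output above are the ones prefixed with the program name. The Python sketch below is hypothetical and is not how libvirt handles this: the function name and the `prog` parameter are made up for the example, and it assumes QEMU's error lines always start with "qemu-kvm: " as they do in this output.

```python
def extract_qemu_errors(log_text, prog="qemu-kvm"):
    """Return the messages from lines of the form '<prog>: <message>'.

    Lines without the program-name prefix (for example the
    'char device redirected' notice) are ignored.
    """
    prefix = prog + ": "
    return [
        line[len(prefix):].strip()
        for line in log_text.splitlines()
        if line.startswith(prefix)
    ]
```

Applied to the output above, this would keep the four "qemu-kvm:" lines and drop the "char device redirected" line; the first of them ("vfio: error opening /dev/vfio/14: No such file or directory") is the message a user would want to see instead of "Unable to read from monitor".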

Comment 5 Laine Stump 2013-09-27 10:42:24 UTC
I had forgotten about this BZ when I filed Bug 1001738. Since Peter has already been working on the BZ, I'll mark this one as a duplicate.

*** This bug has been marked as a duplicate of bug 1001738 ***

