Bug 971313

Summary:

qemu crashes due to selinux AVC when detaching a hostdev

Product:

Red Hat Enterprise Linux 7

Reporter:

hongming <honzhang>

Component:

libvirt

Assignee:

Laine Stump <laine>

Status:

CLOSED CURRENTRELEASE

QA Contact:

Virtualization Bugs <virt-bugs>

Severity:

medium

Docs Contact:

Priority:

high

Version:

7.0

CC:

acathrow, bili, dallan, dyuan, honzhang, jdenemar, jiahu, mzhan

Target Milestone:

Keywords:

TestOnly

Target Release:

---

Flags:

honzhang: needinfo-

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

libvirt-1.1.1-1.el7

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2014-06-13 10:29:16 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

984112

Bug Blocks:

Attachments:

Description	Flags
libvirt debug log	none
guest qemu log	none
audit log	none

Description hongming 2013-06-06 09:16:49 UTC

Description of problem:
The domain crashes when the same VF is attached to the guest again.

Version-Release number of selected component (if applicable):
libvirt-1.0.6-1.el7.x86_64
qemu-kvm-1.5.0-2.el7.x86_64
3.9.0-0.55.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
# lspci|grep 82576
0e:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0e:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0f:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
.......

# cat vf.xml
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
       <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x3'/>
      </source>
    </hostdev>


# virsh start rhel7
Domain rhel7 started

# virsh attach-device rhel7 vf.xml
Device attached successfully


# virsh detach-device rhel7 vf.xml
Device detached successfully


# virsh attach-device rhel7 vf.xml
error: Failed to attach device from vf.xml
error: Unable to read from monitor: Connection reset by peer


Actual results:
The domain crashes when the same VF is attached to the guest again.

Expected results:
The domain still works fine

Additional info:

Comment 1 hongming 2013-06-06 09:18:47 UTC

Created attachment 757573
libvirt debug log

Comment 3 Laine Stump 2013-06-24 22:28:33 UTC

> error: Unable to read from monitor: Connection reset by peer

Whenever you see this error, you need to gather the guest's qemu logfile, which is in /var/log/libvirt/qemu/$guestname.log ($guestname is rhel7 in this case).
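
A minimal sketch of collecting that log, assuming the default log directory named above. The helper names `qemu_log_path` and `tail_log` are hypothetical and are not part of libvirt.

```python
from __future__ import annotations

from pathlib import Path

# Default directory named in the comment above; other setups may differ.
QEMU_LOG_DIR = Path("/var/log/libvirt/qemu")


def qemu_log_path(guest_name: str, log_dir: Path = QEMU_LOG_DIR) -> Path:
    """Return the per-guest qemu log path, e.g. .../rhel7.log."""
    return log_dir / f"{guest_name}.log"


def tail_log(path: Path, lines: int = 50) -> list[str]:
    """Return the last `lines` lines of the log, or [] if it does not exist."""
    if not path.is_file():
        return []
    with path.open(errors="replace") as handle:
        return handle.read().splitlines()[-lines:]


if __name__ == "__main__":
    # Print the end of the log for the guest from this report.
    for line in tail_log(qemu_log_path("rhel7")):
        print(line)
```

The end of the file is usually the interesting part after a crash, since qemu writes its last error messages there before exiting.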

Also, can you verify that the guest remains running properly *after* the device is detached the first time, right up until the 2nd attach attempt?

Comment 4 hongming 2013-06-25 03:03:13 UTC

Created attachment 764884
guest qemu log

Attached guest qemu log

Comment 5 hongming 2013-06-25 03:16:50 UTC

The guest's state becomes "shut off" shortly after the device is detached the first time.

# virsh start rhel7
Domain rhel7 started

# virsh attach-device rhel7 vf.xml
Device attached successfully

# virsh detach-device rhel7 vf.xml
Device detached successfully

# virsh list
 Id    Name                           State
----------------------------------------------------
 4     rhel7                          running

After waiting a moment, the guest shuts off:

# virsh list
 Id    Name                           State
----------------------------------------------------

Comment 6 Laine Stump 2013-06-25 17:29:41 UTC

(In reply to hongming from comment #4)
> 
> Attached guest qemu log

That's an interesting log, but it doesn't look like it is /var/log/libvirt/qemu/rhel7.log.

The qemu logfile would contain things such as the qemu commandline that was used to start qemu, and any error messages that qemu generated after it was started.

The logfile that you've attached (attachment 764884) contains a lot of messages from libvirtd, followed by what looks like a very strange error message about the kernel being unable to write to a pci device. I don't recognize it, so I'm asking the qemu people for assistance.

Comment 7 Laine Stump 2013-06-25 17:42:22 UTC

Can you try this with selinux disabled to see if the behavior is different?

Also check for any new AVCs in /var/log/audit/audit.log.

Comment 8 hongming 2013-06-26 09:56:11 UTC

If selinux is disabled, it works fine; the bug can't be reproduced.

If selinux is enforcing, it can be reproduced.
# getenforce 
Enforcing

# virsh start rhel7
Domain rhel7 started

# virsh attach-device rhel7 vf.xml
Device attached successfully

# virsh detach-device rhel7 vf.xml
Device detached successfully

# cat /var/log/audit/audit.log|grep avc
type=AVC msg=audit(1372230114.643:139): avc:  denied  { write } for  pid=1711 comm="qemu-kvm" path="/sys/devices/pci0000:00/0000:00:1c.6/0000:0c:00.0/0000:0d:02.0/0000:0f:10.3/config" dev="sysfs" ino=27333 scontext=system_u:system_r:svirt_t:s0:c150,c740 tcontext=system_u:object_r:sysfs_t:s0 tclass=file

Comment 9 hongming 2013-06-26 09:57:37 UTC

Created attachment 765495
audit log

Comment 10 Laine Stump 2013-07-01 16:42:00 UTC

This is the offending AVC:

type=AVC msg=audit(1372240465.252:542): avc:  denied  { write } for  pid=4129 comm="qemu-kvm" path="/sys/devices/pci0000:00/0000:00:1c.6/0000:0c:00.0/0000:0d:02.0/0000:0f:10.3/config" dev="sysfs" ino=27333 scontext=system_u:system_r:svirt_t:s0:c394,c836 tcontext=system_u:object_r:sysfs_t:s0 tclass=file

The fact that this works when the device is attached, but fails when the device is detached, implies that the selinux label on this resource is being "undone" too soon.
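
To make the denial easier to read, here is a rough sketch that splits an AVC record like the one above into its fields. The function name `parse_avc` and the regular expressions are illustrative assumptions, not part of any audit tooling.

```python
import re

# Matches key=value and key="quoted value" pairs in an audit record.
_FIELD_RE = re.compile(r'(\w+)=("([^"]*)"|\S+)')

# The offending record quoted in this comment.
SAMPLE = (
    'type=AVC msg=audit(1372240465.252:542): avc:  denied  { write } for  '
    'pid=4129 comm="qemu-kvm" '
    'path="/sys/devices/pci0000:00/0000:00:1c.6/0000:0c:00.0/0000:0d:02.0/0000:0f:10.3/config" '
    'dev="sysfs" ino=27333 scontext=system_u:system_r:svirt_t:s0:c394,c836 '
    'tcontext=system_u:object_r:sysfs_t:s0 tclass=file'
)


def parse_avc(record: str) -> dict:
    """Extract the fields of a single type=AVC audit record into a dict."""
    fields = {}
    for key, raw, quoted in _FIELD_RE.findall(record):
        # Strip the surrounding quotes from quoted values.
        fields[key] = quoted if raw.startswith('"') else raw
    # The denied permissions sit between braces, e.g. "denied  { write }".
    perms = re.search(r"denied\s+\{([^}]*)\}", record)
    fields["denied"] = perms.group(1).split() if perms else []
    return fields


if __name__ == "__main__":
    avc = parse_avc(SAMPLE)
    print(avc["comm"], avc["denied"], avc["path"])
    print(avc["scontext"], "->", avc["tcontext"])
```

Read this way, the record says that qemu-kvm, running as svirt_t, was denied a write to the VF's sysfs config file, which carries the generic sysfs_t label.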

Comment 12 Laine Stump 2013-07-02 23:23:17 UTC

I have a theory about this that is a bit disturbing - when we send the device_del command to qemu, it returns almost immediately with success, but it hasn't *really* finished detaching the device. In the meantime, we happily proceed to reattach the device to the host driver, undo any cgroups that we had set up, and relabel everything in sysfs to prevent access by the qemu process. *But it may not be finished yet!*

So I think the solution to this problem is to implement a wait for the new qemu event that it produces when it is *really* finished with a device (does anyone remember the BZ number for the libvirt side of that?).
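
The race described above can be illustrated with a toy simulation. This is not libvirt or qemu code: `FakeQemu`, `detach`, and the delay value are hypothetical, and the threading event only models the completion event, which I assume to be QEMU's DEVICE_DELETED.

```python
import threading


class FakeQemu:
    """Hypothetical stand-in for a qemu monitor; not a real API."""

    def __init__(self, unplug_delay: float = 0.05):
        self.unplug_delay = unplug_delay
        self.device_deleted = threading.Event()
        self.still_using_device = False

    def attach(self) -> None:
        self.still_using_device = True
        self.device_deleted.clear()

    def device_del(self) -> bool:
        """Acknowledge immediately; finish the unplug later in the background."""
        timer = threading.Timer(self.unplug_delay, self._finish_unplug)
        timer.daemon = True
        timer.start()
        return True  # "success" only means the request was accepted

    def _finish_unplug(self) -> None:
        self.still_using_device = False
        self.device_deleted.set()  # models the "really finished" event


def detach(qemu: FakeQemu, wait_for_event: bool, timeout: float = 5.0) -> bool:
    """Return True if host-side cleanup ran only after qemu let go of the device."""
    qemu.device_del()
    if wait_for_event and not qemu.device_deleted.wait(timeout):
        raise TimeoutError("qemu did not report the device as deleted")
    # Host-side cleanup (relabel sysfs, undo cgroups, rebind the host driver)
    # would happen here; it is only safe once qemu no longer touches the device.
    return not qemu.still_using_device
```

Without the wait, `detach` reaches the cleanup point while the fake qemu is still using the device, which corresponds to the window in which the AVC denial would occur. With the wait, cleanup is deferred until the event has fired.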

Comment 13 Laine Stump 2013-07-31 19:45:13 UTC

I believe that this bug may have been fixed by the patch for Bug 984112, which is now available in the latest RHEL7 build, libvirt-1.1.1-1.el7. Can you please retest and see if that is the case?

If it is fixed, we should re-target this back to 7.0, then mark it as fixed in libvirt-1.1.1-1.el7.

Comment 14 Laine Stump 2013-08-01 15:18:52 UTC

QA testing for Bug 984112 ran into this same problem, so it seems it isn't yet fixed. See my comment in that bug.

Comment 15 Jiri Denemark 2013-08-02 13:41:58 UTC

They hit it when trying to reproduce the bug with an older package. In any case, this is supposed to be fixed by the patches for bug 984112 and it would be a bug in those patches if not. Thus, I'm moving this back to 7.0 with a TestOnly keyword.

Comment 16 Hu Jianwei 2013-08-07 05:33:43 UTC

This bug is blocked by bug 990987; I can verify it after bug 990987 is fixed.

Comment 17 Hu Jianwei 2013-12-20 02:37:08 UTC

I can't reproduce it any more.

Version:
libvirt-1.1.1-15.el7.x86_64
qemu-kvm-rhev-1.5.3-21.el7.x86_64
kernel-3.10.0-61.el7.x86_64

1. Enable vfio module in kernel
[root@sriov1 ~]#  modprobe vfio_pci
[root@sriov1 ~]# lsmod|grep vfio
vfio_iommu_type1       17636  1 
vfio_pci               36474  1 
vfio                   20777  5 vfio_iommu_type1,vfio_pci
[root@sriov1 ~]#

2. Detach VF device from host
[root@sriov1 ~]# cat net.xml 
<hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x10' function='0x4'/>
      </source>
    </hostdev>
[root@sriov1 ~]# 

[root@sriov1 ~]# virsh nodedev-dumpxml pci_0000_03_10_4
<device>
  <name>pci_0000_03_10_4</name>
  <path>/sys/devices/pci0000:00/0000:00:01.0/0000:03:10.4</path>
  <parent>pci_0000_00_01_0</parent>
  <driver>
    <name>igbvf</name>
  </driver>
...(snipped)

[root@sriov1 ~]# virsh nodedev-detach pci_0000_03_10_4
Device pci_0000_03_10_4 detached

[root@sriov1 ~]# virsh nodedev-dumpxml pci_0000_03_10_4
<device>
  <name>pci_0000_03_10_4</name>
  <path>/sys/devices/pci0000:00/0000:00:01.0/0000:03:10.4</path>
  <parent>pci_0000_00_01_0</parent>
  <driver>
    <name>vfio-pci</name>
  </driver>
...(snipped)

3. Attach and detach the VF to/from domain r7 three times
[root@sriov1 ~]# getenforce 
Enforcing
[root@sriov1 ~]# 

[root@sriov1 ~]# virsh attach-device r7 net.xml 
Device attached successfully

[root@sriov1 ~]# virsh detach-device r7 net.xml 
Device detached successfully

[root@sriov1 ~]# virsh attach-device r7 net.xml 
Device attached successfully

[root@sriov1 ~]# virsh detach-device r7 net.xml 
Device detached successfully

[root@sriov1 ~]# virsh attach-device r7 net.xml 
Device attached successfully

[root@sriov1 ~]# virsh detach-device r7 net.xml 
Device detached successfully

[root@sriov1 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     r7                             running


We get the expected results, and bug 984112 has been verified.
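
The repeated attach/detach check above could be scripted roughly as follows. This is a sketch under assumptions: `run_virsh` and `hotplug_cycles` are hypothetical helper names, and the script relies only on the `virsh attach-device`, `detach-device`, and `domstate` subcommands.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_virsh(args: List[str]) -> str:
    """Run virsh with the given arguments and return its stdout."""
    result = subprocess.run(
        ["virsh", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def hotplug_cycles(
    domain: str, xml_path: str, cycles: int = 3, run: Runner = run_virsh
) -> bool:
    """Attach and detach the device `cycles` times; True if the domain still runs."""
    for _ in range(cycles):
        run(["attach-device", domain, xml_path])
        run(["detach-device", domain, xml_path])
    # "virsh domstate" prints the state, e.g. "running" or "shut off".
    return run(["domstate", domain]).strip() == "running"


if __name__ == "__main__":
    # Domain and file names are the ones used in the verification steps above.
    print(hotplug_cycles("r7", "net.xml"))
```

Passing the runner as a parameter keeps the loop testable without a hypervisor. Since the original crash showed up only a moment after the detach, a real run may also want a short pause before the final state check.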

Comment 18 Ludek Smid 2014-06-13 10:29:16 UTC

This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.