Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1804672

Summary: Metadata locking fails on root-squashed NFS when starting a VM
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Peter Krempa <pkrempa>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: gaojianan <jgao>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 8.2
CC: eslutsky, hhan, jdenemar, jiyan, jsuchane, lmen, mprivozn, virt-maint, xuzhang, yafu, ymankad
Target Milestone: rc
Keywords: Upstream
Target Release: 8.0
Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-6.0.0-7.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-05-05 09:57:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1795672    
Attachments:
Description    Flags
VM XML         none
VM Log         none

Description Peter Krempa 2020-02-19 11:46:17 UTC
Description of problem:
RHV tries to start a VM which has a disk hosted on root-squashed NFS. The startup fails with:

2020-02-19 10:18:48.266+0000: 19879: error : virProcessRunInFork:1161 : internal error: child reported (status=125): unable to open /var/run/vdsm/storage/74b3c4fa-c4cd-4c9b-a251-c099cc63ed51/3d092b2d-9668-4099-bf90-6020dfded169/3ddaeade-8b68-4b69-9f0f-ca3ca04e8d19: Permission denied

The disk definition is:

    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads' iothread='1'/>
      <source file='/var/run/vdsm/storage/74b3c4fa-c4cd-4c9b-a251-c099cc63ed51/3d092b2d-9668-4099-bf90-6020dfded169/3ddaeade-8b68-4b69-9f0f-ca3ca04e8d19'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <serial>3d092b2d-9668-4099-bf90-6020dfded169</serial>
      <alias name='ua-3d092b2d-9668-4099-bf90-6020dfded169'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </disk>

The only place in the libvirtd code that emits this error message is
virSecurityManagerMetadataLock. That code path is executed because SELinux relabelling was not disabled.

NFS is mounted as
(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=10.35.80.5,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=10.35.80.5)
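The option string above is easier to read once it is split into key/value pairs. The following is a small Python sketch for doing that; `parse_mount_options` is a hypothetical helper written for this note and is not part of libvirt or any NFS tooling.

```python
def parse_mount_options(text):
    """Split a mount option string such as "(rw,vers=3,nolock)" into a dict.

    Flags without a value (for example "rw" or "nolock") map to True;
    "key=value" options map to the value as a string.
    """
    options = {}
    for item in text.strip().strip("()").split(","):
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return options


# The option string reported in this bug, copied verbatim.
opts = parse_mount_options(
    "(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,"
    "nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,"
    "mountaddr=10.35.80.5,mountvers=3,mountport=20048,mountproto=udp,"
    "local_lock=all,addr=10.35.80.5)"
)
nfs_version = opts["vers"]
uses_nolock = opts.get("nolock", False)
local_lock = opts["local_lock"]
```

For this mount the parsed values show NFS version 3 with `nolock` set and `local_lock=all`.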


Additional info:
https://bugzilla.redhat.com/show_bug.cgi?id=1795672

Comment 1 Evgeny Slutsky 2020-02-19 12:03:00 UTC
Created attachment 1664018
VM XML

Comment 2 Evgeny Slutsky 2020-02-19 12:03:59 UTC
Created attachment 1664019
VM Log

Comment 3 Sandro Bonazzola 2020-02-20 10:37:01 UTC
Proposing as blocker since it breaks RHV

Comment 5 Michal Privoznik 2020-02-21 09:34:47 UTC
Patches proposed upstream:

https://www.redhat.com/archives/libvir-list/2020-February/msg00734.html

Comment 8 Michal Privoznik 2020-02-25 10:18:37 UTC
I've pushed patches upstream:

f16663d58f security: Don't fail if locking a file on NFS mount fails
5fddf61351 security: Don't remember seclabel for paths we haven't locked successfully
256e01e59e virSecurityManagerMetadataLock: Store locked paths

v6.0.0-510-gf16663d58f
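
Going only by the commit subjects above, the change can be pictured as best-effort locking: a path that cannot be locked is skipped instead of failing the whole operation, and only the paths that were actually locked are recorded. The following is a minimal Python sketch of that idea, not libvirt's C implementation. The names `lock_metadata`, `unlock_metadata`, and the `opener` parameter are hypothetical, and skipping on any EACCES/EPERM is a simplifying assumption (the commit subject limits the relaxation to NFS mounts).

```python
import errno
import fcntl
import os


def unlock_metadata(locked):
    """Close every descriptor in the mapping, which also drops its lock."""
    for fd in locked.values():
        os.close(fd)
    locked.clear()


def lock_metadata(paths, opener=os.open):
    """Lock each path if possible and return {path: fd} for the locked ones.

    Assumption for this sketch: a permission error on open is treated as
    "probably root-squashed storage" and the path is skipped.
    """
    locked = {}
    for path in paths:
        try:
            fd = opener(path, os.O_RDWR)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.EPERM):
                continue  # skip this path, do not fail the whole start
            unlock_metadata(locked)
            raise
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX)
        except OSError:
            os.close(fd)
            unlock_metadata(locked)
            raise
        locked[path] = fd  # store only paths that were really locked
    return locked
```

A caller would then remember the original label only for the keys of the returned mapping, and call `unlock_metadata` once relabelling is done.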

Comment 13 gaojianan 2020-02-26 07:56:06 UTC
Hi,
I can still reproduce this issue in libvirt version:
libvirt-6.0.0-7.module+el8.2.0+5869+c23fe68b.x86_64

Steps:
1. Prepare NFS storage with nfsvers=4 and root_squash:
10.72.12.151:/nfs on /mountpoint type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.66.36.15,local_lock=none,addr=10.72.12.151)


2. Start a guest with the disk image in the NFS mount directory:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='directsync' io='native' discard='ignore' detect_zeroes='off'/>
      <source file='/mountpoint/RHEL-8.1-x86_64-latest.qcow2'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>

# virsh start pc
error: Failed to start domain pc
error: internal error: qemu unexpectedly closed the monitor: 2020-02-26T07:53:37.779928Z qemu-kvm: -blockdev {"node-name":"libvirt-1-format","read-only":false,"discard":"ignore","detect-zeroes":"off","cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null}: Could not reopen file: Permission denied

[root@jgao-test1 ~]# virsh domblklist pc
 Target   Source
----------------------------------------------------
 vdb      /mountpoint/RHEL-8.1-x86_64-latest.qcow2

The guest still can't start on root_squash NFS.
Is there anything wrong with my approach?

Comment 14 Peter Krempa 2020-02-26 09:18:25 UTC
This is a different problem now. The error is reported by qemu.

Can the qemu user actually access the '/mountpoint/RHEL-8.1-x86_64-latest.qcow2' file? The disk configuration explicitly disables Unix permission relabelling, so libvirt is NOT making sure that the file can be accessed; it must be accessible before attempting the start.
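
One rough way to reason about that question is to compare the file's owner, group, and mode bits with the IDs the qemu process runs under. The following Python sketch does only that classic mode-bit check. `mode_allows` is a hypothetical helper, the numeric IDs are invented for illustration, and the check ignores ACLs, SELinux, and the NFS server's own decision, so it is a first approximation rather than a definitive test.

```python
import stat
from types import SimpleNamespace


def mode_allows(st, uid, gids, write=True):
    """Return True if the mode bits grant a non-root user read (and write) access.

    `st` is anything with st_mode, st_uid and st_gid, such as os.stat(path).
    Exactly one permission class applies: owner, then group, then other.
    """
    if st.st_uid == uid:
        read_bit, write_bit = stat.S_IRUSR, stat.S_IWUSR
    elif st.st_gid in gids:
        read_bit, write_bit = stat.S_IRGRP, stat.S_IWGRP
    else:
        read_bit, write_bit = stat.S_IROTH, stat.S_IWOTH
    needed = read_bit | (write_bit if write else 0)
    return (st.st_mode & needed) == needed


# Hypothetical image: owned by 1000:2000 with mode 0640.
image = SimpleNamespace(st_mode=0o100640, st_uid=1000, st_gid=2000)
owner_ok = mode_allows(image, uid=1000, gids={1000})
group_write_ok = mode_allows(image, uid=3000, gids={2000})
group_read_ok = mode_allows(image, uid=3000, gids={2000}, write=False)
```

Here the owner passes, while a user who only matches the group can read the image but not open it read-write. Actually opening the file as the qemu user on the host remains the more reliable check.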

Comment 15 gaojianan 2020-02-27 02:42:09 UTC
(In reply to Peter Krempa from comment #14)
> This is a different problem now. The error is reported by qemu.
> 
> Can the qemu user actually access the
> '/mountpoint/RHEL-8.1-x86_64-latest.qcow2' file? The disk configuration
> explicitly disables unix permission relabelling so libvirt is NOT making
> sure that it can be accessed, so it must be accessible prior to attempting
> the start.

Thanks for the suggestion. Verified this bug as follows:
libvirt-6.0.0-7.module+el8.2.0+5869+c23fe68b.x86_64

1. Prepare NFS storage with nfsvers=4 and root_squash, and make sure the file on the NFS server is accessible to the qemu user:
10.72.12.151:/nfs on /mountpoint type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.66.36.15,local_lock=none,addr=10.72.12.151)


2. Start a guest with the disk image in the NFS mount directory:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='directsync' io='native' discard='ignore' detect_zeroes='off'/>
      <source file='/mountpoint/RHEL-8.1-x86_64-latest.qcow2'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>

3. # virsh start pc
Domain pc started


Works as expected.

Comment 16 Michal Privoznik 2020-02-27 13:36:25 UTC
*** Bug 1761299 has been marked as a duplicate of this bug. ***

Comment 18 errata-xmlrpc 2020-05-05 09:57:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017