Bug 1672178

Summary: Unable to migrate VMs using ceph storage - Unsafe migration: Migration without shared storage is unsafe [rhel-7.6.z]
Product: Red Hat Enterprise Linux 7 Reporter: RAD team bot copy to z-stream <autobot-eus-copy>
Component: libvirt    Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA QA Contact: Han Han <hhan>
Severity: high Docs Contact:
Priority: high    
Version: 7.6    CC: amashah, areis, danken, ebenahar, hannsj_uhl, jdenemar, jhakimra, mtessun, nsoffer, tnisan, yalzhang
Target Milestone: rc    Keywords: Upstream, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-4.5.0-10.el7_6.5 Doc Type: Bug Fix
Doc Text:
Cause: When migrating a domain, libvirt performs a number of checks to make sure that the domain will be able to run on the destination. One such check verifies that each domain disk is either stored on network-attached storage (assuming the destination will have access to it) or that storage migration is enabled and a suitable cache mode is selected. However, libvirt did not detect CEPH as network-attached storage, which resulted in the migration being denied. Consequence: Migration was unsuccessful. Fix: Make libvirt detect CEPH as network-attached storage so that no other requirements have to be met for such a disk. Result: Migration works again.
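For illustration, a minimal sketch of the pre-fix behaviour and the generic virsh workaround (the domain name and destination URI are placeholders; --unsafe is not part of this fix):
# virsh migrate --live cephfs qemu+ssh://dst.example.com/system --verbose
error: Unsafe migration: Migration without shared storage is unsafe
# virsh migrate --live --unsafe cephfs qemu+ssh://dst.example.com/system --verbose
Migration: [100 %]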
Story Points: ---
Clone Of: 1665553 Environment:
Last Closed: 2019-03-13 18:47:14 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1558836, 1665553, 1685799, 1753535    
Bug Blocks:    
Attachments:
The libvirtd log of starting vm failed (flags: none)

Description RAD team bot copy to z-stream 2019-02-04 07:40:40 UTC
This bug has been copied from bug #1665553 and has been proposed to be backported to 7.6 z-stream (EUS).

Comment 5 Han Han 2019-02-22 09:45:00 UTC
Hi Martin,
I can't find any utility to mount cephfs in ceph-common-10.2.5-4.el7.x86_64 from the RHEL 7.6 client compose repo.

So, how does RHV mount cephfs? What ceph-common version do you use in RHV?
And BTW, does RHV use ceph-fuse?

Comment 6 Han Han 2019-02-22 10:00:29 UTC
Hi, Joe,
Does RHOS Nova provide ceph-fuse for VM migration? Is there any development plan for ceph-fuse in Nova?

Since the fix here only supports VM migration based on cephfs, I wonder whether we have a requirement for ceph-fuse.

Comment 7 Martin Tessun 2019-02-22 10:54:37 UTC
(In reply to Han Han from comment #5)
> Hi Martin,
> I don't find any utils to mount cephfs in ceph-common-10.2.5-4.el7.x86_64 of
> RHEL7.6 client compose repo.
> 

Ademar?

> So, how does rhv mount cephfs? What is the ceph-common version you used in
> RHV?
> And BTW, does rhv use ceph-fuse in rhv?

RHV does not mount ceph directly; it uses cinderlib for that use case.
@Tal: Can you elaborate a bit here?

Comment 8 Ademar Reis 2019-02-25 16:51:52 UTC
(In reply to Martin Tessun from comment #7)
> (In reply to Han Han from comment #5)
> > Hi Martin,
> > I don't find any utils to mount cephfs in ceph-common-10.2.5-4.el7.x86_64 of
> > RHEL7.6 client compose repo.
> > 
> 
> Ademar?

It should be "mount -t cephfs ...", but I don't have a RHEL 7.6 environment handy right now to help further, sorry.

> 
> > So, how does rhv mount cephfs? What is the ceph-common version you used in
> > RHV?
> > And BTW, does rhv use ceph-fuse in rhv?
> 
> RHV does not mount ceph directly, but they are using cinderlib for that
> usecase.
> @Tal: Can you elaborate a bit here.

There are several details in the original BZ, opened by a customer: Bug 1665553

Comment 10 Han Han 2019-02-27 03:21:13 UTC
Hi Dan,
I found that libvirt cannot start a VM with its disk on cephfs due to an SELinux issue:
# mount -t ceph bootp-73-75-128.lab.eng.pek2.redhat.com:/ /rhev -o name=admin,secretfile=ceph.key

# virsh domblklist cephfs
Target     Source
------------------------------------------------
vda        /rhev/cephfs.qcow2

# virsh start cephfs
error: Failed to start domain cephfs
error: internal error: qemu unexpectedly closed the monitor: 2019-02-27T03:12:09.620522Z qemu-kvm: -drive file=/rhev/cephfs.qcow2,format=qcow2,if=none,id=drive-virtio-disk0: Could not open '/rhev/cephfs.qcow2': Permission denied

The permission denial comes from SELinux:

# ausearch -m AVC,USER_AVC -ts recent
----
time->Tue Feb 26 22:12:09 2019
type=PROCTITLE msg=audit(1551237129.618:432): proctitle=2F7573722F6C6962657865632F71656D752D6B766D002D6E616D650067756573743D6365706866732C64656275672D746872656164733D6F6E002D53002D6F626A656374007365637265742C69643D6D61737465724B6579302C666F726D61743D7261772C66696C653D2F7661722F6C69622F6C6962766972742F71656D752F
type=SYSCALL msg=audit(1551237129.618:432): arch=c000003e syscall=2 success=no exit=-13 a0=55f37ecabcc0 a1=80800 a2=0 a3=7fa9f214b010 items=0 ppid=1 pid=82091 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c800,c819 key=(null)
type=AVC msg=audit(1551237129.618:432): avc:  denied  { read } for  pid=82091 comm="qemu-kvm" name="cephfs.qcow2" dev="ceph" ino=1099511627776 scontext=system_u:system_r:svirt_t:s0:c800,c819 tcontext=system_u:object_r:cephfs_t:s0 tclass=file permissive=0
----
time->Tue Feb 26 22:12:09 2019
type=PROCTITLE msg=audit(1551237129.618:433): proctitle=2F7573722F6C6962657865632F71656D752D6B766D002D6E616D650067756573743D6365706866732C64656275672D746872656164733D6F6E002D53002D6F626A656374007365637265742C69643D6D61737465724B6579302C666F726D61743D7261772C66696C653D2F7661722F6C69622F6C6962766972742F71656D752F
type=SYSCALL msg=audit(1551237129.618:433): arch=c000003e syscall=4 success=no exit=-13 a0=55f37ecabcc0 a1=7ffd49632b10 a2=7ffd49632b10 a3=7fa9f214b010 items=0 ppid=1 pid=82091 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c800,c819 key=(null)
type=AVC msg=audit(1551237129.618:433): avc:  denied  { getattr } for  pid=82091 comm="qemu-kvm" path="/rhev/cephfs.qcow2" dev="ceph" ino=1099511627776 scontext=system_u:system_r:svirt_t:s0:c800,c819 tcontext=system_u:object_r:cephfs_t:s0 tclass=file permissive=0
----
time->Tue Feb 26 22:12:09 2019
type=PROCTITLE msg=audit(1551237129.618:434): proctitle=2F7573722F6C6962657865632F71656D752D6B766D002D6E616D650067756573743D6365706866732C64656275672D746872656164733D6F6E002D53002D6F626A656374007365637265742C69643D6D61737465724B6579302C666F726D61743D7261772C66696C653D2F7661722F6C69622F6C6962766972742F71656D752F
type=SYSCALL msg=audit(1551237129.618:434): arch=c000003e syscall=2 success=no exit=-13 a0=55f37ecabdc0 a1=80002 a2=0 a3=7fa9f214a6d0 items=0 ppid=1 pid=82091 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c800,c819 key=(null)
type=AVC msg=audit(1551237129.618:434): avc:  denied  { read write } for  pid=82091 comm="qemu-kvm" name="cephfs.qcow2" dev="ceph" ino=1099511627776 scontext=system_u:system_r:svirt_t:s0:c800,c819 tcontext=system_u:object_r:cephfs_t:s0 tclass=file permissive=0

Version:
libvirt-4.5.0-10.el7_6.6.x86_64
selinux-policy-3.13.1-229.el7_6.9.noarch
qemu-kvm-rhev-2.12.0-19.el7.bz1597621.2.x86_64

Dan, as far as I know a VM can be started with a cephfs disk in RHV; how do you resolve this issue in RHV?

Comment 11 Dan Kenigsberg 2019-02-27 07:27:29 UTC
> Dan, as I know vm can be started with cephfs disk in rhv, how do you resolve this issue in rhv?

I believe that we supported ceph storage only via the iSCSI front-end of ceph. We plan to add native cephfs support via cinderlib, but Nir should be able to provide more information about where it stands.

Comment 12 Michal Privoznik 2019-02-27 10:30:05 UTC
(In reply to Han Han from comment #10)
> Hi Dan,
> I found libvirt cannot start vm with disk on cephfs due to selinux issue:
> # mount -t ceph bootp-73-75-128.lab.eng.pek2.redhat.com:/ /rhev -o
> name=admin,secretfile=ceph.key
> 
> # virsh domblklist cephfs
> Target     Source
> ------------------------------------------------
> vda        /rhev/cephfs.qcow2
> 
> # virsh start cephfs
> error: Failed to start domain cephfs
> error: internal error: qemu unexpectedly closed the monitor:
> 2019-02-27T03:12:09.620522Z qemu-kvm: -drive
> file=/rhev/cephfs.qcow2,format=qcow2,if=none,id=drive-virtio-disk0: Could
> not open '/rhev/cephfs.qcow2': Permission denied
> 

Isn't there a warning in the logs? When it comes to SELinux, libvirt always tries to set the label, but if the disk is on a shared FS (which typically does not support XATTRs/SELinux labels), it prints a warning after setting the label fails.
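One quick way to check whether the mounted cephfs stores SELinux labels at all (path taken from comment 10; assumes the attr package providing getfattr is installed):
# getfattr -n security.selinux /rhev/cephfs.qcow2
If nothing is stored, getfattr reports that there is no such attribute, i.e. the filesystem is not keeping the label.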

Comment 13 Han Han 2019-02-28 10:00:51 UTC
Verified on:
libvirt-4.5.0-10.virtcov.el7_6.6.x86_64
qemu-kvm-rhev-2.12.0-18.el7_6.3.x86_64

Because of the issue in comment 10, I verified the function in SELinux permissive mode first.

Setup:
1. Prepare hostnames and migration ports for the two hosts (see the firewall sketch after these steps)
2. Prepare a ceph cluster with cephfs:
https://www.howtoforge.com/tutorial/how-to-mount-cephfs-on-centos-7/
3. Mount cephfs and create the same symbolic links on the two hosts:
# mount -t ceph xxxx:/ /mnt -o name=admin,secretfile=ceph.key

# ln -s /mnt /rhev

Copy the VM image to /rhev
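A sketch of the firewall part of step 1, assuming firewalld is in use and libvirt's default migration port range (49152-49215) has not been changed:
# firewall-cmd --permanent --add-port=49152-49215/tcp
# firewall-cmd --reload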


Steps:
1. Start the VM with its disk on the shared cephfs:
# virsh domblklist cephfs
Target     Source
------------------------------------------------
vda        /rhev/cephfs.qcow2

Migrate it:
# virsh migrate cephfs qemu+ssh://lab.test/system --verbose
root's password: 
Migration: [100 %]

Migrate back:
# virsh migrate cephfs qemu+ssh://xxxx/system --verbose
root@xxxx's password: 
Migration: [100 %]

Comment 14 Han Han 2019-03-01 02:46:54 UTC
(In reply to Michal Privoznik from comment #12)
> (In reply to Han Han from comment #10)
> > Hi Dan,
> > I found libvirt cannot start vm with disk on cephfs due to selinux issue:
> > # mount -t ceph bootp-73-75-128.lab.eng.pek2.redhat.com:/ /rhev -o
> > name=admin,secretfile=ceph.key
> > 
> > # virsh domblklist cephfs
> > Target     Source
> > ------------------------------------------------
> > vda        /rhev/cephfs.qcow2
> > 
> > # virsh start cephfs
> > error: Failed to start domain cephfs
> > error: internal error: qemu unexpectedly closed the monitor:
> > 2019-02-27T03:12:09.620522Z qemu-kvm: -drive
> > file=/rhev/cephfs.qcow2,format=qcow2,if=none,id=drive-virtio-disk0: Could
> > not open '/rhev/cephfs.qcow2': Permission denied
> > 
> 
> Isn't there a warning in logs? When it comes to selinux, libvirt always
> tries to set the label, but if the disk is on shared FS (which are known to
> usually not support XATTRs/SELinux labels) then it prints out a warning
> after setting the label failed.

For the mounted cephfs, I don't see SELinux labels on the files:
# ls /rhev/ -alZ
drwxr-xr-x  root root ?                                .
dr-xr-xr-x. root root system_u:object_r:root_t:s0      ..
-rw-r--r--  root root ?                                cephfs.qcow2
lrwxrwxrwx  root root ?                                rhev -> /rhev/

In the libvirtd log I see that setting the SELinux label fails:
2019-03-01 02:42:54.283+0000: 52361: debug : virSecuritySELinuxRestoreAllLabel:2414 : Restoring security label on cephfs
2019-03-01 02:42:54.283+0000: 52361: info : virSecuritySELinuxRestoreFileLabel:1297 : Restoring SELinux context on '/rhev/cephfs.qcow2'
2019-03-01 02:42:54.283+0000: 52361: info : virSecuritySELinuxSetFileconHelper:1156 : Setting SELinux context on '/rhev/cephfs.qcow2' to 'system_u:object_r:default_t:s0'
2019-03-01 02:42:54.284+0000: 52361: debug : virFileIsSharedFSType:3667 : Check if path /rhev/cephfs.qcow2 with FS magic 12805120 is shared
2019-03-01 02:42:54.284+0000: 52361: info : virSecuritySELinuxSetFileconHelper:1200 : Setting security context 'system_u:object_r:default_t:s0' on '/rhev/cephfs.qcow2' not supported
2019-03-01 02:42:54.284+0000: 52361: debug : virSecurityDACRestoreAllLabel:1565 : Restoring security label on cephfs migrated=0
2019-03-01 02:42:54.284+0000: 52361: info : virSecurityDACRestoreFileLabelInternal:665 : Restoring DAC user and group on '/rhev/cephfs.qcow2'
2019-03-01 02:42:54.284+0000: 52361: info : virSecurityDACSetOwnershipInternal:567 : Setting DAC user and group on '/rhev/cephfs.qcow2' to '0:0'
2019-03-01 02:42:54.284+0000: 52361: debug : virSystemdTerminateMachine:429 : Attempting to terminate machine via systemd

Comment 15 Han Han 2019-03-01 02:49:11 UTC
Created attachment 1539721 [details]
The libvirtd log of starting vm failed

Comment 16 Michal Privoznik 2019-03-05 10:40:43 UTC
Libvirt tries to set the SELinux label and even restore it. But if cephfs doesn't support SELinux labels (though then I do not understand why SELinux is denying read and write), I guess we have two options:
1) contact the SELinux folks and ask them to fix this (obviously, this is a bug in their policy),
2) start the domain with relabel='no' set on the cephfs disk (but this will still require some SELinux policy adjustments)
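A minimal sketch of option 2 as a per-disk override in the domain XML (disk path and target taken from this bug):
  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/rhev/cephfs.qcow2'>
      <seclabel model='selinux' relabel='no'/>
    </source>
    <target dev='vda' bus='virtio'/>
  </disk>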

Comment 17 Han Han 2019-03-06 05:25:02 UTC
(In reply to Michal Privoznik from comment #16)
> Libvirt tries to set SELinux label and even restore it. But if cephfs
> doesn't support SELinux label (but then I do not understand why SELinux is
> denying read and write) then I guess we have two options:
> 1) contact SELinux guys and ask them to fix this (obviously, this is a bug
> in their policy),
OK. I will file a bug.
> 2) start domain with relabel='no' set on the cephfs disk (but this will
> still require some SELinux policy adjustements)
I looked at the domain XML in the customer report. The SELinux label elements are present in the XML:
  <seclabel type='dynamic' model='selinux' relabel='yes'/>
  <seclabel type='dynamic' model='dac' relabel='yes'/>

I started a VM with a cephfs disk using the above, but it doesn't work either.
# virsh start cephfs                                                                                                                            
error: Failed to start domain cephfs
error: internal error: qemu unexpectedly closed the monitor: 2019-03-06T03:39:47.739785Z qemu-kvm: -drive file=/rhev/cephfs.qcow2,format=qcow2,if=none,id=drive-virtio-disk0: Could not open '/rhev/cephfs.qcow2': Permission denied

And I tried to run qemu with the svirt label directly:
# runcon -t svirt_t -u system_u -r system_r -l s0 /usr/libexec/qemu-kvm /rhev/cephfs.qcow2 
qemu-kvm: Could not open '/rhev/cephfs.qcow2': Permission denied

Comment 18 Han Han 2019-03-06 05:37:37 UTC
Hi Amar,
Could you please check whether SELinux in the customer environment is enforcing or permissive?
I found that a VM based on cephfs cannot start when SELinux is enforcing (see comment 10).
However, the VM can start in the customer environment, so I wonder why the customer didn't hit that issue.

Comment 20 Han Han 2019-03-07 02:16:25 UTC
The selinux-policy blocker: https://bugzilla.redhat.com/show_bug.cgi?id=1558836

Comment 23 errata-xmlrpc 2019-03-13 18:47:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0521