Bug 1740506 - Migration fails with mounted storage system
Summary: Migration fails with mounted storage system
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.1
Assignee: Michal Privoznik
QA Contact: gaojianan
URL:
Whiteboard:
Depends On:
Blocks: 1652078
 
Reported: 2019-08-13 06:49 UTC by gaojianan
Modified: 2020-11-19 08:55 UTC
CC List: 6 users

Fixed In Version: libvirt-5.6.0-4.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 07:18:29 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
libvirtd.log (809.65 KB, text/plain), 2019-08-13 09:05 UTC, gaojianan
target libvirtd.log (592.39 KB, text/plain), 2019-08-13 09:36 UTC, gaojianan


Links
Red Hat Product Errata RHBA-2019:3723, last updated 2019-11-06 07:18:50 UTC

Description gaojianan 2019-08-13 06:49:13 UTC
Description of problem:
Migration fails with gluster storage domain

Version-Release number of selected component (if applicable):
libvirt-5.6.0-1.virtcov.el8.x86_64
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare an image on gluster storage and mount the glusterfs volume on both the source and destination hosts
# mount -t glusterfs 10.66.4.119:/gvnew /mnt/gluster

2. Prepare a running VM whose image is on the mounted directory
# virsh domblklist demo
setlocale: No such file or directory
 Target   Source
--------------------------------------------------------
 hda      /mnt/gluster/test.qcow2

3. Start the domain and migrate from 8.1 to 8.1
# virsh start demo                                                                                                                                                                             
Domain demo started 

# virsh migrate demo qemu+ssh://$ip/system --live --verbose --p2p --persistent                                                                                                       
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/glusterfs/test.qcow2 which is already in use

# getenforce 
Permissive

Actual results:
Migration fails in step 3 with the SELinux labeling error, even though SELinux is in Permissive mode (getenforce above); see the label check under Additional info.

Expected results:
Migration completes successfully.

Additional info:
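A quick way to see the mismatch the error message complains about is to compare the image's SELinux context on the two hosts (path taken from step 2; this check is mine, not part of the original report):

# run on the source host and on the destination host, then compare the contexts
ls -Z /mnt/gluster/test.qcow2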

Comment 1 Han Han 2019-08-13 08:20:33 UTC
Please upload the libvirtd log with the following log filters:
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util 1:security"
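For reference, these filters normally go into /etc/libvirt/libvirtd.conf together with a file output, followed by a daemon restart (the log_outputs value below is an example of mine, adjust the path as needed):

# /etc/libvirt/libvirtd.conf
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util 1:security"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

# systemctl restart libvirtd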

Comment 2 gaojianan 2019-08-13 09:05:24 UTC
Created attachment 1603251 [details]
libvirtd.log

Libvirtd log covering the guest from start through migration.

Comment 3 gaojianan 2019-08-13 09:36:38 UTC
Created attachment 1603256 [details]
target libvirtd.log

Comment 4 gaojianan 2019-08-19 09:39:56 UTC
The same issue:
Migration also fails with a ceph storage domain that is mounted on the local system

# mount -t ceph 10.73.224.204:6789:/  /mnt/cephfs/ -o name=admin,secret=AQCAgd5cFyMQLhAAHaz6w+WKy5LvmKjmRAViEg==

disk.xml:

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/mnt/cephfs/qcow2.img'/>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>                                          
    </disk>

# virsh migrate demo qemu+ssh://10.16.56.165/system --live --verbose --p2p
setlocale: No such file or directory
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /mnt/cephfs/qcow2.img which is already in use

Comment 5 Michal Privoznik 2019-08-22 12:44:32 UTC
The problem is that gluster (and I suspect ceph is the same) does not really support SELinux; it only emulates it locally. These filesystems support XATTRs, but a change to the SELinux context is not propagated to other hosts. Usually this plays in our favor (e.g. migration is possible): with dynamic SELinux labels (nearly every config I've seen, if not every single one), the labels are generated at runtime by libvirt and are random, so in general the domain has a different label on the source than on the target. But because the label change is only emulated locally by the kernel and never reaches the other host, the file shared on NFS/GlusterFS/Ceph/.. simply ends up with two different labels (one from the source's POV, the other from the destination's) and migration works anyway.
And here is where the problem with XATTRs comes in: if a network FS supports XATTRs (GlusterFS and Ceph do), the XATTRs recording the label are set when the domain starts on the source. Later, when the destination tries to set its own labels, it finds a mismatching record in the XATTRs and the migration is denied.
The best solution looks to be not storing the XATTRs when the underlying FS does not support SELinux.
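For reference, the remembered label records can be inspected directly on the shared image via its extended attributes (this check is mine, not part of the fix; the trusted.libvirt.security.* names are a guess for this libvirt version and may differ):

# as root on the host where the domain was started; dump all XATTRs on the image
# and look for entries such as trusted.libvirt.security.selinux (name is a guess)
getfattr -m - -d /mnt/gluster/test.qcow2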

Comment 6 Michal Privoznik 2019-08-22 15:20:00 UTC
Patches posted upstream:

https://www.redhat.com/archives/libvir-list/2019-August/msg01041.html

Comment 7 Michal Privoznik 2019-08-30 11:30:53 UTC
Patches pushed upstream:

8fe953805a security_selinux: Play nicely with network FS that only emulates SELinux
eaa2a064fa security_selinux: Drop virSecuritySELinuxSetFileconHelper
b71d54f447 security_selinux: Drop @optional from _virSecuritySELinuxContextItem
079c1d6a29 security_selinux: Drop virSecuritySELinuxSetFileconOptional()
34712a5e3b virSecuritySELinuxSetFileconImpl: Drop @optional argument

v5.7.0-rc1-14-g8fe953805a
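For anyone checking a libvirt source tree, the describe string above can be reproduced from the first commit hash, and the first release tag carrying the fix can be queried the same way (run inside a libvirt git checkout; commands below are mine, not part of the comment):

# where the commit sits relative to upstream tags (matches the string above)
git describe 8fe953805a
# first upstream tag that contains the fix
git describe --contains 8fe953805a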

Comment 10 gaojianan 2019-09-10 03:16:10 UTC
Verified on version:
libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64

Scenario 1:
Migrate with a cephfs backend mounted locally:
1.
# mount -t ceph 10.73.224.204:6789:/  /mnt/cephfs/ -o name=admin,secret=AQCAgd5cFyMQLhAAHaz6w+WKy5LvmKjmRAViEg==  (both in source and target)
disk.xml:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/mnt/cephfs/qcow2.img'/>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>                                          
    </disk>

2.
# virsh migrate demo qemu+ssh://10.16.56.54/system --live --verbose --p2p
Migration: [100 %]
And then migrate back:
virsh migrate demo qemu+ssh://10.16.56.144/system --verbose --live --p2p
Migration: [100 %]

Scenario 2:
Migrate with gluster backend
1. Prepare an image on gluster storage and mount the glusterfs volume on both the source and destination hosts
# mount -t glusterfs 10.66.4.119:/gvnew /mnt/gluster

2. Prepare a running VM whose image is on the mounted directory
# virsh domblklist demo
 Target   Source
--------------------------------------------------------
 hda      /mnt/gluster/test.qcow2

3. Start the domain and migrate from 8.1 to 8.1
# virsh start demo                                                                                                                                                                             
Domain demo started 

# virsh migrate demo qemu+ssh://10.16.56.144/system --live --verbose --p2p --persistent 
Migration: [100 %]

And then migrate back:
virsh migrate demo --live qemu+ssh://10.16.56.54/system --verbose  --p2p --persistent 
Migration: [100 %]

Scenario 3:
1. Migrate with a symlinked directory
# mount -t glusterfs 10.66.4.119:/gvnew /mnt/gluster
# ln -s /mnt/gluster/ /var/lib/libvirt/images/glusterfs
# virsh domblklist demo
 Target   Source
-----------------------------------------------------
 vdb      /var/lib/libvirt/images/glusterfs/A.qcow2

2.
# virsh migrate demo --live qemu+ssh://10.16.56.54/system --verbose 
root@10.16.56.54's password: 
Migration: [100 %]


Works as expected.

Comment 12 errata-xmlrpc 2019-11-06 07:18:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723

