Bug 1404952
Summary: | udev rewrites permissions set by libvirt on block devices that are closed | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Jaroslav Suchanek <jsuchane> | |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> | |
Status: | CLOSED ERRATA | QA Contact: | yafu <yafu> | |
Severity: | urgent | Docs Contact: | ||
Priority: | high | |||
Version: | 7.4 | CC: | chorn, dyuan, jdenemar, maurizio.antillon, mprivozn, rbalakri, xuzhang, yafu | |
Target Milestone: | rc | Keywords: | Upstream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-3.0.0-1.el7 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | 1354251 | |||
: | 1404990 1404992 | Environment: | |
Last Closed: | 2017-08-01 17:19:14 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1404990, 1404992, 1446211 |
Description
Jaroslav Suchanek
2016-12-15 08:21:25 UTC
Moving to POST:

commit f444faa94a0e30f7dfdd47dce18b526abb0aaa9f
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Dec 6 17:35:12 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Enable mount namespace

    https://bugzilla.redhat.com/show_bug.cgi?id=1404952

    Signed-off-by: Michal Privoznik <mprivozn>

commit 661887f558208074169b0d3340c457108b6a023d
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Nov 18 16:34:45 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Let users opt-out from containerization

    Given how intrusive previous patches are, it might happen that
    there's a bug or imperfection. Lets give users a way out: if they
    set 'namespaces' to an empty array in qemu.conf the feature is
    suppressed.

    Signed-off-by: Michal Privoznik <mprivozn>

commit f95c5c48d416dcdab2f3ee7718f12a02833c9339
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Nov 18 15:19:12 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Manage /dev entry on RNG hotplug

    When attaching a device to a domain that's using separate mount
    namespace we must maintain /dev entries in order for qemu process
    to see them.

    Signed-off-by: Michal Privoznik <mprivozn>

commit f5fdf23a68d5c9838890451c1c50b4ae1062d8d2
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Nov 18 14:53:27 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Manage /dev entry on chardev hotplug

    When attaching a device to a domain that's using separate mount
    namespace we must maintain /dev entries in order for qemu process
    to see them.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 6e57492839c0f644ade6b4174993f33f72b66ba3
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Nov 16 15:27:47 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Manage /dev entry on hostdev hotplug

    When attaching a device to a domain that's using separate mount
    namespace we must maintain /dev entries in order for qemu process
    to see them.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 81df21507bef94ae53a056156e4aa6661f29237a
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 16:53:04 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Manage /dev entry on disk hotplug

    When attaching a device to a domain that's using separate mount
    namespace we must maintain /dev entries in order for qemu process
    to see them.

    Signed-off-by: Michal Privoznik <mprivozn>

commit eadaa975480d10eb057eb72bf833888e88e948e8
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Nov 23 11:52:57 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Enter the namespace on relabelling

    Instead of trying to fix our security drivers, we can use a simple
    trick to relabel paths in both namespace and the host. I mean, if
    we enter the namespace some paths are still shared with the host
    so any change done to them is visible from the host too.
    Therefore, we can just enter the namespace and call
    SetAllLabel()/RestoreAllLabel() from there. Yes, it has slight
    overhead because we have to fork in order to enter the namespace.
    But on the other hand, no complexity is added to our code.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 2160f338a74543634e26aeddef1e4c63184660da
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 16:10:23 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Prepare RNGs when starting a domain

    When starting a domain and separate mount namespace is used, we
    have to create all the /dev entries that are configured for the
    domain.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 8ec8a8c5ffa0a4b662013313116b4f166bfe989e
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 16:03:02 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Prepare inputs when starting a domain

    When starting a domain and separate mount namespace is used, we
    have to create all the /dev entries that are configured for the
    domain.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 2c654490f355c8d8e7ad4748952008391299b411
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 15:25:15 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Prepare TPM when starting a domain

    When starting a domain and separate mount namespace is used, we
    have to create all the /dev entries that are configured for the
    domain.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 4e4451019cb2e6dea355e93e946e0169023753c6
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 15:17:05 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Prepare chardevs when starting a domain

    When starting a domain and separate mount namespace is used, we
    have to create all the /dev entries that are configured for the
    domain.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 73267cec46e08e74f6297c44b8f47c68180b3712
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 14:37:52 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Prepare hostdevs when starting a domain

    When starting a domain and separate mount namespace is used, we
    have to create all the /dev entries that are configured for the
    domain.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 054202d02062c313e01e6c8b0084b91a738d13aa
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Nov 14 17:36:45 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Prepare disks when starting a domain

    When starting a domain and separate mount namespace is used, we
    have to create all the /dev entries that are configured for the
    domain.

    Signed-off-by: Michal Privoznik <mprivozn>

commit bb4e529664a6e1ef08030aefc96f21f14eba2aea
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 11:30:18 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu: Spawn qemu under mount namespace

    Prime time. When it comes to spawning qemu process and relabelling
    all the devices it's going to touch, there's inherent race with
    other applications in the system (e.g. udev). Instead of trying
    convincing udev to not touch libvirt managed devices, we can
    create a separate mount namespace for the qemu, and mount our own
    /dev there. Of course this puts more work onto us as we have to
    maintain /dev files on each domain start and device hot(un-)plug.
    On the other hand, this enhances security also.

    From technical POV, on domain startup process the parent
    (libvirtd) creates:

      /var/lib/libvirt/qemu/$domain.dev
      /var/lib/libvirt/qemu/$domain.devpts

    The child (which is going to be qemu eventually) calls unshare()
    to create new mount namespace. From now on anything that child
    does is invisible to the parent. Child then mounts tmpfs on
    $domain.dev (so that it still sees original /dev from the host)
    and creates some devices (as explained in one of the previous
    patches). The devices have to be created exactly as they are in
    the host (including perms, seclabels, ACLs, ...). After that it
    moves $domain.dev mount to /dev.

    What's the $domain.devpts mount there for then you ask? QEMU can
    create PTYs for some chardevs. And historically we exposed the
    host ends in our domain XML allowing users to connect to them.
    Therefore we must preserve devpts mount to be shared with the
    host's one.

    To make this patch as small as possible, creating of devices
    configured for domain in question is implemented in next patches.

    Signed-off-by: Michal Privoznik <mprivozn>

commit a5896e8ca404a2e975808728328e44efd49a7960
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 15 11:28:51 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    qemu_cgroup: Expose defaultDeviceACL

    This is a list of devices that qemu needs for its run (apart from
    what's configured for domain). The devices on the list are enabled
    in the CGroups by default so they will be good candidates for
    initial /dev for new qemu.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 5ac52bd0fe80d1741071250f485ae54375508e48
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Dec 6 16:06:02 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    virscsivhost: Introduce virSCSIVHostDeviceGetPath

    We will need this function in near future so that we know what
    /dev device corresponds to the SCSI device.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 6bcacd55e537c0fc3b793949637197e82a9dffcb
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Nov 16 15:27:20 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    virscsi: Introduce virSCSIDeviceGetPath

    We will need this function in near future so that we know what
    /dev device corresponds to the SCSI device.

    Signed-off-by: Michal Privoznik <mprivozn>

commit c4237d8e0ca090c7db76e5a226efa0ed2305835d
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Nov 16 15:26:59 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    virusb: Introduce virUSBDeviceGetPath

    We will need this function in near future so that we know what
    /dev device corresponds to the USB device.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 654b4d48bcdeeaf31df131644544bb1277f0f8bb
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Nov 22 11:14:08 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    virfile: Introduce ACL helpers

    Namely, virFileGetACLs, virFileSetACLs, virFileFreeACLs and
    virFileCopyACLs. These functions are going to be required when we
    are creating /dev for qemu. We have copy anything that's in host's
    /dev exactly as is. Including ACLs.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 1a7c9a5d5087d562fabfd7b7ff3cd1b9b19e9419
Author:     Michal Privoznik <mprivozn>
AuthorDate: Thu Nov 10 16:17:48 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    virfile: Introduce virFileSetupDev

    This part of code that LXC currently uses will be reused so move
    to a generic function.

    Signed-off-by: Michal Privoznik <mprivozn>

commit 48a12d3b2554cc7a4255ef9ff8564c0a3ef7c1b3
Author:     Michal Privoznik <mprivozn>
AuthorDate: Thu Nov 10 14:55:48 2016 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Dec 15 09:25:16 2016 +0100

    virprocess: Introduce virProcessSetupPrivateMountNS

    This part of code that LXC currently uses will be reused so move
    to a generic function.

    Signed-off-by: Michal Privoznik <mprivozn>

v2.5.0-121-gf444faa94a

*** Bug 1401575 has been marked as a duplicate of this bug. ***

Hi, Michal,

I tried to verify this bug and found an issue. If a guest is running when I disable namespace creation for the qemu process in qemu.conf and restart libvirtd, and I then destroy the guest and start it again, libvirt still creates the namespace for the guest. Would you help to check the issue please? Thanks a lot.

Steps to reproduce:
1. Start a guest:
   # virsh start rhel7.3-min
2. Disable the namespace in qemu.conf:
   # vim /etc/libvirt/qemu.conf
   namespaces = [ ]
3. Restart the libvirtd service:
   # systemctl restart libvirtd
4. Destroy the guest:
   # virsh destroy rhel7.3-min
5. Start the guest again:
   # virsh start rhel7.3-min
6. libvirt still creates a namespace for the guest:
   # lsns | grep -i qemu
   4026532522 mnt 1 31993 qemu /usr/libexec/qemu-kvm -name guest=rhel7.3-min ...
7. The libvirtd service has to be restarted again after step 4 for the guest to start without a namespace.

Yes(In reply to yafu from comment #5)
> Hi,Michal,
>

Oh yes. It is a bug. However, given how big the feature is I think it can be tracked in a separate bug instead of this one.

(In reply to Michal Privoznik from comment #6)
> Yes(In reply to yafu from comment #5)
> > Hi,Michal,
> >
>
> Oh yes. It is a bug. However, given how big the feature is I think it can be
> tracked in a separate bug instead of this one.

Thanks Michal.
Filed a separate bug to track the issue in comment 5:
https://bugzilla.redhat.com/show_bug.cgi?id=1453142

Reproduced with:
qemu-kvm-rhev-0.12.1.2-2.491.el6_8.3.x86_64
libvirt-0.10.2-60.el6.x86_64
udev-147-2.72.el6.x86_64

Test steps are the same as in https://bugzilla.redhat.com/show_bug.cgi?id=1354690#c41.

Verified with:
qemu-kvm-rhev-2.9.0-12.el7.x86_64
libvirt-3.2.0-14.el7.x86_64
systemd-219-41.el7.x86_64

Test steps:
1. Start a guest with a block device:
   <disk type='block' device='lun' sgio='unfiltered'>
     <driver name='qemu' type='raw' cache='none'/>
     <source dev='/dev/disk/by-path/ip-10.66.71.72:3260-iscsi-iqn.2016-03.com.virttest:logical-pool.target-lun-0'/>
     <target dev='vdb' bus='scsi'/>
     <shareable/>
     <alias name='scsi0-0-0-1'/>
     <address type='drive' controller='0' bus='0' target='0' unit='1'/>
   </disk>
2. Check the DAC and SELinux context of the block device after the guest starts:
   # ll /dev/disk/by-path/ip-10.66.71.72:3260-iscsi-iqn.2016-03.com.virttest:logical-pool.target-lun-0
   lrwxrwxrwx. 1 root root 9 Jun 22 11:07 /dev/disk/by-path/ip-10.66.71.72:3260-iscsi-iqn.2016-03.com.virttest:logical-pool.target-lun-0 -> ../../sdb
   # ll -Z /dev/sdb
   brw-rw----. root disk system_u:object_r:fixed_disk_device_t:s0 /dev/sdb
3. Check the DAC and SELinux context in the qemu process namespace:
   # lsns | grep -i qemu
   4026532392 mnt 1 15678 qemu /usr/libexec/qemu-kvm -name guest=rhel7.3-min,debug-threads=on ...
   # nsenter -t 15678 -m
   # ll -Z /dev/sdb
   brw-rw----. qemu qemu system_u:object_r:svirt_image_t:s0 /dev/sdb

The results of steps 2 and 3 show that libvirt does not change the DAC/SELinux context of the block device on the host OS when starting a guest; it changes the DAC and SELinux context in the qemu process's namespace instead. So a DAC/SELinux context change during guest migration will have no effect on the guest starting on the target host.
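The host-versus-namespace comparison in steps 2 and 3 above can be automated. The following is a minimal sketch, not part of libvirt or of the original test procedure: the names `DevLabel`, `parse_ls_z` and `relabelled_only_in_namespace` are hypothetical, and the parser assumes the RHEL 7 `ls -Z` column layout shown in the comment (mode, owner, group, SELinux context, path).

```python
from typing import NamedTuple


class DevLabel(NamedTuple):
    """Ownership and SELinux type of one device node (hypothetical helper type)."""
    mode: str
    owner: str
    group: str
    selinux_type: str
    path: str


def parse_ls_z(line: str) -> DevLabel:
    """Parse one line of RHEL 7 style `ls -Z` output.

    Assumed layout: mode owner group context path
    """
    mode, owner, group, context, path = line.split(None, 4)
    # An SELinux context is user:role:type:level; keep only the type field.
    selinux_type = context.split(":")[2]
    return DevLabel(mode, owner, group, selinux_type, path)


def relabelled_only_in_namespace(host_line: str, ns_line: str) -> bool:
    """Return True if the host node kept its default labels while the
    node seen inside the qemu mount namespace was relabelled for qemu."""
    host, ns = parse_ls_z(host_line), parse_ls_z(ns_line)
    host_untouched = (host.owner, host.group, host.selinux_type) == (
        "root", "disk", "fixed_disk_device_t")
    ns_relabelled = (ns.owner, ns.group, ns.selinux_type) == (
        "qemu", "qemu", "svirt_image_t")
    return host.path == ns.path and host_untouched and ns_relabelled


if __name__ == "__main__":
    # The two lines below are copied from steps 2 and 3 of the comment.
    host = "brw-rw----. root disk system_u:object_r:fixed_disk_device_t:s0 /dev/sdb"
    ns = "brw-rw----. qemu qemu system_u:object_r:svirt_image_t:s0 /dev/sdb"
    print(relabelled_only_in_namespace(host, ns))
```

The expected owner, group and SELinux type values are hard-coded from this report; on a host with a different DAC configuration or SELinux policy they would need adjusting.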
Verified handling of the devices used by the guest with namespaces enabled, using:
qemu-kvm-rhev-2.9.0-12.el7.x86_64
libvirt-3.2.0-14.el7.x86_64
systemd-219-41.el7.x86_64

Test steps:
1. Start a guest with an iSCSI disk:
   # virsh dumpxml rhel7.3-min
   ...
   <disk type='block' device='disk'>
     <driver name='qemu' type='raw'/>
     <source dev='/dev/sdb'/>
     <backingStore/>
     <target dev='vdb' bus='virtio'/>
     <alias name='virtio-disk1'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
   </disk>
   ...
2. Check the DAC/SELinux context in both the host OS and the qemu namespace:
   (1) In the host OS:
   # ll -Z /dev/sdb
   brw-rw----. root disk system_u:object_r:fixed_disk_device_t:s0 /dev/sdb
   (2) In the qemu process namespace:
   # nsenter -t 15678 -m
   # ll -Z /dev/sdb
   brw-rw----. qemu qemu system_u:object_r:svirt_image_t:s0 /dev/sdb
3. Hot-unplug the iSCSI disk:
   # virsh detach-disk rhel7.3 vdb
   Disk detached successfully
4. Check the DAC/SELinux context in both the host OS and the qemu namespace:
   (1) In the host OS:
   # ll -Z /dev/sdb
   brw-rw----. root disk system_u:object_r:fixed_disk_device_t:s0 /dev/sdb
   (2) In the qemu process namespace:
   # nsenter -t 15678 -m
   # ll -Z /dev/sdb
   brw-rw----. root root system_u:object_r:fixed_disk_device_t:s0 /dev/sdb
5. Hotplug the iSCSI disk:
   # virsh attach-disk rhel7.3-min /dev/sdb vdb
   Disk attached successfully
6. Check the DAC/SELinux context in both the host OS and the qemu namespace:
   (1) In the host OS:
   # ll -Z /dev/sdb
   brw-rw----. root disk system_u:object_r:fixed_disk_device_t:s0 /dev/sdb
   (2) In the qemu process namespace:
   # nsenter -t 15678 -m
   # ll -Z /dev/sdb
   brw-rw----. qemu qemu system_u:object_r:svirt_image_t:s0 /dev/sdb
7. Check the DAC/SELinux context in the host OS after destroying the guest:
   # virsh destroy rhel7.3-min
   # ll -Z /dev/sdb
   brw-rw----. root root system_u:object_r:fixed_disk_device_t:s0 /dev/sdb

Also tested with rng/chardev/hostdev/input devices; libvirt manages the source file/dev in the qemu process's namespace correctly.
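Each check above repeats the same two manual actions: find the qemu PID in the `lsns` output, then run a command inside that PID's mount namespace with `nsenter -t PID -m`. A small sketch of that bookkeeping follows. The function names `qemu_mount_ns_pids` and `nsenter_argv` are hypothetical, and the parser assumes the default `lsns` column order seen in this report (NS, TYPE, NPROCS, PID, USER, COMMAND). The code only builds the argument list; actually running it requires root.

```python
from typing import List, Sequence


def qemu_mount_ns_pids(lsns_output: str) -> List[int]:
    """Return PIDs owning a mount namespace whose command mentions qemu.

    Assumed column order: NS TYPE NPROCS PID USER COMMAND
    """
    pids = []
    for line in lsns_output.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # blank or truncated line
        _ns, ns_type, _nprocs, pid, _user, command = fields
        # The header row has TYPE == "TYPE", so it is skipped here too.
        if ns_type == "mnt" and "qemu" in command.lower():
            pids.append(int(pid))
    return pids


def nsenter_argv(pid: int, command: Sequence[str]) -> List[str]:
    """Build an argv that runs `command` in the mount namespace of `pid`."""
    return ["nsenter", "-t", str(pid), "-m", "--", *command]


if __name__ == "__main__":
    # Sample line copied from the verification comment above.
    sample = ("4026532392 mnt 1 15678 qemu /usr/libexec/qemu-kvm "
              "-name guest=rhel7.3-min,debug-threads=on ...")
    for pid in qemu_mount_ns_pids(sample):
        print(" ".join(nsenter_argv(pid, ["ls", "-lZ", "/dev/sdb"])))
```

Passing the command directly to `nsenter` instead of opening an interactive shell keeps each check to one invocation, which is easier to script than the shell session used in the steps above.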
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1846