Bug 1740024
| Summary: | Refcount should not accumulate when the second vm using same disk fails to start | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Fangge Jin <fjin> | |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> | |
| Status: | CLOSED ERRATA | QA Contact: | yafu <yafu> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 8.1 | CC: | chhu, david.abdurachmanov, hhan, jdenemar, jsuchane, lmen, mprivozn, mtessun, toneata, xuzhang, yafu | |
| Target Milestone: | rc | Keywords: | Regression, Upstream, ZStream | |
| Target Release: | 8.0 | Flags: | yafu:
needinfo-
knoel: mirror+ |
|
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | libvirt-5.6.0-9.el8 | Doc Type: | Bug Fix | |
| Doc Text: |
Cause:
When starting up a guest libvirt needs to prepare all the files that qemu is going to touch. This involves setting up security labels, i.e. DAC labels (aka user:group owners), and SELinux labels so that qemu can access desired files. At the same time, when 'remember_owner' is set in qemu.conf (or if commented out it's on by default), libvirt uses extended attributes (aka XATTRs) to record original labels, so that they can be restored back when qemu shuts down. This comes handy when the file is in user's $HOME (e.g. installation ISO) - because once qemu no longer uses it, the file has original owner/SELinux label and thus ordinary user doesn't lose access to it.
However, the XATTRs are a bit tricky - each security driver (= piece of code that handles only DAC, or only SELinux) uses its own namespace (so that they are independent) and also has their own reference counter (the reference counter reflects how many times given file is in use). This means that each file has two reference counters - one for DAC and one for SELinux. Ideally, these are in sync - the file has both labels after all.
Anyway, libvirt does perform a rollback, that is - a list of [path; desired DAC label] pairs is gathered and then set at once. The same applies for SELinux labels (this means we have two independent lists). If something goes wrong while processing the list, a rollback is performed => the original owner is restored. However, libvirt did not perform rollback on DAC list when applying SELinux list failed.
Another way to look at this (leaving out 'remember_owner' feature completely) is: when setting up labels DAC labels are set first (i.e. chown) and only after that SELinux labels are set (i.e. chcon). But, if setting up DAC labels succeeded and setting up SELinux failed, the files were left with DAC labels changed which creates a vulnerability.
Consequence:
If setting up SELinux labels failed at some point, the XATTR refcounter for DAC would increase (if remember_owner is enabled, which is default), and/or all files configured for the domain (disks, sockets, ...) are left with the owner that domain was configured to run under (if remember_owner is disabled).
Fix:
The fix consists of performing a rollback on DAC driver when SELinux driver fails.
Result:
The XATTR refcounter for DAC no longer increases and files keep their original owner.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1771500 (view as bug list) | Environment: | ||
| Last Closed: | 2020-02-04 18:28:48 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1652078, 1771500, 1780154 | |||
Also meet the same issue when trying to attach the same image to two guests. Reproduce steps: 1.Prepare two running guest; 2.Prepare a image: #qemu-img create -f qcow2 test.qcow2 100M 3.Attach the disk to one guest: #virsh attach-disk iommu2 /var/lib/libvirt/images/test.qcow2 sdb --subdriver qcow2 Disk attached successfully 4.Attach the disk to another guest: # virsh attach-disk vm2 /var/lib/libvirt/images/test.qcow2 sdc --subdriver qcow2 error: Failed to attach disk error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/test.qcow2 which is already in use 5.Detach the disk from guest one: #virsh detach-disk iommu2 sdb Disk detached successfully 6.Attach the disk to guest again: # virsh attach-disk vm2 /var/lib/libvirt/images/test.qcow2 sdc --subdriver qcow2 error: Failed to attach disk error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/test.qcow2 which is already in use Hi, Michal, Would you help to check this bug please? I guess it is caused by the patches for Bug 1652078 - RFE: the security drivers must remember original permissions/labels and restore them after. And this bug will cause the guest can not start any more easily. Thanks a lot. Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2019-August/msg01035.html Fixed upstream as: 6a2806fd54 security: Don't increase XATTRs refcounter on failure v5.6.0-298-g6a2806fd54 Fail to verify this bug with libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64, the dac ref still increased when start a guest failed: # virsh start vm1 Domain vm1 started # getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 getfattr: Removing leading '/' from absolute path names # file: var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 trusted.libvirt.security.dac="+0:+0" trusted.libvirt.security.ref_dac="1" trusted.libvirt.security.ref_selinux="1" trusted.libvirt.security.selinux="system_u:object_r:virt_image_t:s0" trusted.libvirt.security.timestamp_dac="1567565944" trusted.libvirt.security.timestamp_selinux="1567565944" # virsh start vm2 error: Failed to start domain vm2 error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 which is already in use # getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 getfattr: Removing leading '/' from absolute path names # file: var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 trusted.libvirt.security.dac="+0:+0" trusted.libvirt.security.ref_dac="2" trusted.libvirt.security.ref_selinux="1" trusted.libvirt.security.selinux="system_u:object_r:virt_image_t:s0" trusted.libvirt.security.timestamp_dac="1567565944" trusted.libvirt.security.timestamp_selinux="1567565944" # virsh destroy vm1 Domain vm1 destroyed # getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 getfattr: Removing leading '/' from absolute path names # file: var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 trusted.libvirt.security.dac="+0:+0" trusted.libvirt.security.ref_dac="1" trusted.libvirt.security.timestamp_dac="1567565944" # ll -Z /var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 -rw-r--r--. 1 qemu qemu system_u:object_r:virt_image_t:s0 769589248 Sep 5 03:47 /var/lib/libvirt/images/RHEL-8.1-x86_64-latest2.qcow2 Patches posted upstream: https://www.redhat.com/archives/libvir-list/2019-September/msg00610.html Reproduced on libvirt-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64 qemu-kvm-4.1.0-10.module+el8.1.0+4234+33aa4f57.x86_64. My issue is the same scenario as this. Patches pushed upstream: 9d03e9adf1 security_stack: Perform rollback if one of stacked drivers fails cd355a526f security_stack: Turn list of nested drivers into a doubly linked list 3f968a8706 security: Introduce virSecurityManagerGetDriver() 81dbceea65 security: Rename virSecurityManagerGetDriver() to virSecurityManagerGetVirtDriver() 458d0a8c52 security: Pass @migrated to virSecurityManagerSetAllLabel v5.8.0-134-g9d03e9adf1 Verified with libvirt-5.6.0-7.module+el8.1.1+4483+2f45aaa2.x86_64. Test steps: # virsh start vm1 Domain vm1 started # getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/78.qcow2 getfattr: Removing leading '/' from absolute path names # file: var/lib/libvirt/images/78.qcow2 trusted.libvirt.security.dac="+0:+0" trusted.libvirt.security.ref_dac="1" trusted.libvirt.security.ref_selinux="1" trusted.libvirt.security.selinux="unconfined_u:object_r:virt_image_t:s0" trusted.libvirt.security.timestamp_dac="1573809643" trusted.libvirt.security.timestamp_selinux="1573809643" # virsh start vm2 error: Failed to start domain vm2 error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/78.qcow2 which is already in use #getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/78.qcow2 getfattr: Removing leading '/' from absolute path names # file: var/lib/libvirt/images/78.qcow2 trusted.libvirt.security.dac="+0:+0" trusted.libvirt.security.ref_dac="1" trusted.libvirt.security.ref_selinux="1" trusted.libvirt.security.selinux="unconfined_u:object_r:virt_image_t:s0" trusted.libvirt.security.timestamp_dac="1573809643" trusted.libvirt.security.timestamp_selinux="1573809643" # virsh destroy vm1 Domain vm1 destroyed #getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/78.qcow2 no output #virsh start vm1 Domain vm1 started Also test disk hotplug, and other devices: such as: memory, console/serial device, input device, etc. Hi Michal, The xattr was cleared when trying to start the second vm using same disk with libvirt-5.6.0-9.x86_64. And it works well with libvirt-5.6.0-7.x86_64. Could you help to check the issue please? Thanks. Test steps: 1.# virsh start vm1 Domain vm1 started 2.# getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/test.qcow2 getfattr: Removing leading '/' from absolute path names # file: var/lib/libvirt/images/test.qcow2 trusted.libvirt.security.dac="+0:+0" trusted.libvirt.security.ref_dac="1" trusted.libvirt.security.ref_selinux="1" trusted.libvirt.security.selinux="system_u:object_r:virt_image_t:s0" trusted.libvirt.security.timestamp_dac="1574640796" trusted.libvirt.security.timestamp_selinux="1574640796" 3.Start another guest using /var/lib/libvirt/images/test.qcow2: # virsh start vm2 error: Failed to start domain vm2 error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/test.qcow2 which is already in use 4.The xattr was cleared after step3: # getfattr -m trusted.libvirt.security -d /var/lib/libvirt/images/test.qcow2 no output Yes, this is a bug in the patch I've pushed. Proposing the fix here: https://www.redhat.com/archives/libvir-list/2019-December/msg00246.html Actually, I've backported the patches needed as a part of bug 1741456. Moving back to ON_QA. Track the issue in the comment 22 in bug: https://bugzilla.redhat.com/show_bug.cgi?id=1780154 And according to comment 21-24, move this bug to VERIFIED. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0404 |
Description of problem: Refcount should not accumulate when the second vm using same disk fails to start Version-Release number of selected component (if applicable): libvirt-5.6.0-1.virtcov.el8.x86_64 How reproducible: 100% Steps to Reproduce: 1. Prepare 2 vm using same disk image: /var/lib/libvirt/images/RHEL-8.1.0-20190806.2-x86_64.qcow2 2. Start the first vm: # virsh start rhel7.6 Domain rhel7.6 started 3. Start the second vm(it fails to start, which is expected): # virsh start q35 error: Failed to start domain q35 error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/RHEL-8.1.0-20190806.2-x86_64.qcow2 which is already in use 4. Destroy first vm: # virsh destroy rhel7.6 Domain rhel7.6 destroyed 5. Check the selinux context of the disk image, it is not restored(which is not expected) # ll /var/lib/libvirt/images/RHEL-8.1.0-20190806.2-x86_64.qcow2 -Z -rw-r--r--. 1 qemu qemu system_u:object_r:svirt_image_t:s0:c140,c631 1314979840 Aug 12 02:06 /var/lib/libvirt/images/RHEL-8.1.0-20190806.2-x86_64.qcow2 6. Try to start the first vm, it fails to start(which is not expected) # virsh start rhel7.6 error: Failed to start domain rhel7.6 error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/images/RHEL-8.1.0-20190806.2-x86_64.qcow2 which is already in use Actual results: As step 5&&6 Expected results: Refcount of the disk image should not accumulate when the second vm fails to start. In step5, disk image context should be restored and vm can start successfully in step6. Additional info: After line 1396, refcount should be minus 1. 1326 static int 1327 virSecuritySELinuxSetFileconHelper(virSecurityManagerPtr mgr, 1328 const char *path, 1329 const char *tcon, 1330 bool optional, 1331 bool remember) 1332 { 1333 bool privileged = virSecurityManagerGetPrivileged(mgr); 1334 security_context_t econ = NULL; 1335 int refcount; 1336 int rc; 1337 int ret = -1; 1338 1339 if ((rc = virSecuritySELinuxTransactionAppend(path, tcon, 1340 optional, remember, false)) < 0) 1341 return -1; 1342 else if (rc > 0) 1343 return 0; 1344 1345 if (remember) { 1346 if (getfilecon_raw(path, &econ) < 0 && 1347 errno != ENOTSUP && errno != ENODATA) { 1348 virReportSystemError(errno, 1349 _("unable to get SELinux context of %s"), 1350 path); 1351 goto cleanup; 1352 } 1353 1354 if (econ) { 1355 refcount = virSecuritySELinuxRememberLabel(path, econ); 1356 if (refcount == -2) { 1357 /* Not supported. Don't error though. */ 1358 } else if (refcount < 0) { 1359 goto cleanup; 1360 } else if (refcount > 1) { 1361 /* Refcount is greater than 1 which means that there 1362 * is @refcount domains using the @path. Do not 1363 * change the label (as it would almost certainly 1364 * cause the other domains to lose access to the 1365 * @path). */ 1366 if (STRNEQ(econ, tcon)) { 1367 virReportError(VIR_ERR_OPERATION_INVALID, 1368 _("Setting different SELinux label on %s " 1369 "which is already in use"), path); 1370 goto cleanup; 1371 } 1372 } 1373 } 1374 }