Description of problem: Hotplugging a disk fails due to -EPERM. Version-Release number of selected component (if applicable): systemd-219-42.el7_4.4.x86_64 libvirt-daemon-kvm-3.2.0-14.el7_4.5.x86_64 vdsm-4.19.42-1.el7ev.x86_64 kernel-3.10.0-693.11.1.el7.x86_64 qemu-kvm-rhev-2.9.0-16.el7_4.11.x86_64 Steps to Reproduce: 1. Start guest with 4 Disks (A,B,C,D block storage) 2. Hotunplug B, C and D 3. Hotplug B. Actual results: Hotplug fails, qemu-kvm open(2) gets EPERM. Expected results: Hotplug works. Additional info: # cat qemu_open.stp probe kernel.function("do_sys_open").return { if (pid() != 2632) next if ($return != -1) next printf("do_sys_open returned -1, performed by %s, pid %d\n", execname(), pid()) printf(" %s\n", @entry($$parms)) } probe kernel.function("__devcgroup_check_permission").return { if (pid() != 2632) next if ($return != -1) next printf("__devcgroup_check_permission returned -1, performed by %s, pid %d\n", execname(), pid()) printf(" %s\n", @entry($$parms)) } During failure, we have this: # stap qemu_open.stp __devcgroup_check_permission returned -1, performed by qemu-kvm, pid 2632 type=0x1 major=0xfc minor=0x9 access=0x2 do_sys_open returned -1, performed by qemu-kvm, pid 2632 dfd=0xffffffffffffff9c filename=0x56354c7dfa20 flags=0x88800 mode=0x0 __devcgroup_check_permission returned -1, performed by qemu-kvm, pid 2632 type=0x1 major=0xfc minor=0x9 access=0x6 do_sys_open returned -1, performed by qemu-kvm, pid 2632 dfd=0xffffffffffffff9c filename=0x56354c7dfa20 flags=0x8c002 mode=0x0 So __devcgroup_check_permission generates the EPERM, which propagates all the way back to sys_do_open, and qemu-kvm gets this: # strace -p 2632 -e trace=open strace: Process 2632 attached open("/rhev/data-center/00000002-0002-0002-0002-000000000010/18579785-4d47-4237-9f53-3a2344e0e475/images/21d315fe-4516-4219-b484-87f716f6c2e2/673f331c-a599-4d43-bd24-3355042b41b8", O_RDONLY|O_NONBLOCK|O_CLOEXEC) = -1 EPERM (Operation not permitted) open("/rhev/data-center/00000002-0002-0002-0002-000000000010/18579785-4d47-4237-9f53-3a2344e0e475/images/21d315fe-4516-4219-b484-87f716f6c2e2/673f331c-a599-4d43-bd24-3355042b41b8", O_RDWR|O_DIRECT|O_CLOEXEC) = -1 EPERM (Operation not permitted) Here an idea on what happens within the kernel at that point: qemu-kvm-2632 [000] 816.223260: funcgraph_entry: | may_open() { qemu-kvm-2632 [000] 816.223260: funcgraph_entry: | inode_permission() { qemu-kvm-2632 [000] 816.223261: funcgraph_entry: | __inode_permission() { qemu-kvm-2632 [000] 816.223261: funcgraph_entry: | generic_permission() { qemu-kvm-2632 [000] 816.223261: funcgraph_entry: 0.038 us | get_acl(); qemu-kvm-2632 [000] 816.223261: funcgraph_entry: 0.043 us | in_group_p(); qemu-kvm-2632 [000] 816.223261: funcgraph_exit: 0.557 us | } qemu-kvm-2632 [000] 816.223262: funcgraph_entry: | __devcgroup_inode_permission() { qemu-kvm-2632 [000] 816.223262: funcgraph_entry: | __devcgroup_check_permission() { qemu-kvm-2632 [000] 816.223262: funcgraph_entry: 0.317 us | match_exception(); qemu-kvm-2632 [000] 816.223262: funcgraph_exit: 0.597 us | } qemu-kvm-2632 [000] 816.223263: funcgraph_exit: 0.856 us | } qemu-kvm-2632 [000] 816.223263: funcgraph_exit: 2.018 us | } qemu-kvm-2632 [000] 816.223263: funcgraph_exit: 2.339 us | } qemu-kvm-2632 [000] 816.223263: funcgraph_exit: 2.636 us | } qemu-kvm-2632 [000] 816.223272: sys_exit_open: 0xffffffffffffffff
The following commit should fix the issue: commit db98e7f67ea0d7699410f514f01947cef5128a6c Author: Michal Privoznik <mprivozn> AuthorDate: Thu Jan 4 11:11:53 2018 +0100 Commit: Michal Privoznik <mprivozn> CommitDate: Mon Jan 8 09:53:48 2018 +0100 qemuDomainAttachDeviceMknodHelper: Remove symlink before creating it https://bugzilla.redhat.com/show_bug.cgi?id=1528502 So imagine you have /dev/blah symlink which points to /dev/sda. You attach /dev/blah as disk to your domain. Libvirt correctly creates the /dev/blah -> /dev/sda symlink in the qemu namespace. However, then you detach the disk, change the symlink so that it points to /dev/sdb and tries to attach the disk again. This time, however, the attach fails (well, qemu attaches wrong disk) because the code assumes that symlinks don't change. Well they do. This is inspired by test fix written by Eduardo Habkost. Signed-off-by: Michal Privoznik <mprivozn> Reviewed-by: Andrea Bolognani <abologna> v3.10.0-132-gdb98e7f67
Michal, should this be MERGED?
(In reply to Nir Soffer from comment #2) > Michal, should this be MERGED? I guess Nir meant MODIFIED :) Also, which version of libvirt-python should we require now? For libvirt-daemon-kvm it's >= 3.10.0-132, right?
I think this bug is only waiting for new Fedora build. Cole?
libvirt-3.7.0-4.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-b22d46eabb
libvirt-3.7.0-4.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-b22d46eabb
Build pushed now, sorry for the delay
Michal, is this fix available in libvirt-3.7.0-4.fc27?
(In reply to Nir Soffer from comment #8) > Michal, is this fix available in libvirt-3.7.0-4.fc27? Yes, it is.
libvirt-3.7.0-4.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.