Description of problem:
managedsave/save/dump of a guest fails after the security driver is changed from selinux to none.

Version-Release number:
libvirt-1.2.8-13.el7.x86_64
kernel-3.10.0-222.el7.x86_64
qemu-kvm-rhev-2.1.2-20.el7.x86_64
selinux-policy-3.13.1-16.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Enable SELinux both system-wide and in qemu.conf
# getenforce
Enforcing
# cat /etc/libvirt/qemu.conf
security_driver='selinux'

2. Start a normal guest
# virsh start rhel7f
# ps -efZ|grep qemu
system_u:system_r:svirt_t:s0:c243,c794 qemu 10499 1 1 14:41 ? 00:00:17 /usr/libexec/qemu-kvm -name rhel7f

3. Disable SELinux in qemu.conf and restart libvirtd
# vim /etc/libvirt/qemu.conf
security_driver='none'
# systemctl restart libvirtd

4. Do managedsave/save/dump with the guest; each operation fails with an unclear error
# virsh managedsave rhel7f
error: Failed to save domain rhel7f state
error: internal error: unable to execute QEMU command 'getfd': No file descriptor supplied via SCM_RIGHTS

# virsh save rhel7f /tmp/rhel7f.save
error: Failed to save domain rhel7f to /tmp/rhel7f.save
error: internal error: unable to execute QEMU command 'getfd': No file descriptor supplied via SCM_RIGHTS

# virsh dump rhel7f /tmp/rhel7f.dump
error: Failed to core dump domain rhel7f to /tmp/rhel7f.dump
error: internal error: unable to execute QEMU command 'getfd': No file descriptor supplied via SCM_RIGHTS

5. Check the error info in the libvirtd log
2015-01-19 07:02:25.911+0000: 10778: debug : virJSONValueToString:1303 : result={"id":"libvirt-72","error":{"class":"GenericError","desc":"No file descriptor supplied via SCM_RIGHTS"}}
2015-01-19 07:02:25.911+0000: 10776: debug : virEventPollCalculateTimeout:340 : Calculate expiry of 2 timers
2015-01-19 07:02:25.911+0000: 10778: debug : qemuMonitorJSONCheckError:370 : unable to execute QEMU command {"execute":"getfd","arguments":{"fdname":"migrate"},"id":"libvirt-72"}: {"id":"libvirt-72","error":{"class":"GenericError","desc":"No file descriptor supplied via SCM_RIGHTS"}}
2015-01-19 07:02:25.911+0000: 10776: debug : virEventPollCalculateTimeout:348 : Got a timeout scheduled for 1421650950905
2015-01-19 07:02:25.911+0000: 10776: debug : virEventPollCalculateTimeout:361 : Schedule timeout then=1421650950905 now=1421650945911
2015-01-19 07:02:25.911+0000: 10776: debug : virEventPollCalculateTimeout:370 : Timeout at 1421650950905 due in 4994 ms
2015-01-19 07:02:25.911+0000: 10778: error : qemuMonitorJSONCheckError:381 : internal error: unable to execute QEMU command 'getfd': No file descriptor supplied via SCM_RIGHTS

6. managedsave/save/dump work well after changing security_driver in the other direction, from none to selinux.

7. On a RHEL6.7 host, the guest can be managedsaved successfully after changing the security driver from selinux to none.

Actual results:
managedsave/save/dump fail and report an unclear error.

Expected results:
If managedsave is supported in this scenario, it should complete successfully; if such operations are unsupported, a clear error should be reported.
Fixed upstream:

commit fb0b9a2cc50fb7a52eb4afea8ea48e39db74e45b
Author: Erik Skultety <eskultet>
Date:   Tue May 5 13:24:41 2015 +0200

    qemu: Log error if domain uses security driver which is not loaded

    When starting a domain, if a domain specifies security drivers we do not
    have loaded, we fail. However we don't check for this during reconnect,
    so any operation relying on security driver functionality would fail.
    If someone e.g. starts a domain with selinux driver loaded, then they
    change the security driver to 'none' in config, restart the daemon and
    call dump/save/.., QEMU will return an error. As we shouldn't kill the
    domain, we should at least log an error to let the user know that domain
    reconnect wasn't completely clean.

v1.2.15-120-gfb0b9a2
Hi Erik,
I tried to verify this bug with libvirt-1.2.17-2.el7.x86_64 and found that I can still hit the issue from comment 0. Can you help check it? Thanks.
What kind of issue? The error returned from QEMU hasn't changed. What the patch in https://bugzilla.redhat.com/show_bug.cgi?id=1183893#c1 does is log an error to the daemon log (you might want to check whether I'm correct; there should be an entry "Unable to find security driver for model selinux"). It doesn't change the behaviour in any way, because from the user's point of view the machine is functional (user experience is not affected). From libvirt's point of view it's crippled, but destroying the domain forcefully isn't a great solution, as some users may still be using it.

I don't really think this is a bug, actually. Switching security drivers while having active domains is wrong and should be avoided, and anyone who does that should expect some consequences (like libvirt being unable to properly manage the domain).
Hi Erik,
Sorry for missing your comments earlier. Yes, forcefully destroying the domain when the security driver is changed from selinux to none is not a great solution. However, I think we could improve the error, since it is unclear. An error like the following might be better than the one in comment 0. What do you think?

"Unable to find security driver for model selinux"
Well, if it was a libvirt error, then we could think of providing some logic to handle this case with more specific error message. But it isn't, the thing here is, if QEMU indicates an error to libvirt, we take the error message QEMU provided through monitor (NOT mangling it in any way) propagating it upwards, so that virsh (in this case) outputs exactly what you see right now. Parsing strings just to squeeze out some information would be irrational in my opinion, one way to do it would be to read the "error class" from JSON string. However QEMU classified this error under GenericError which could be almost anything, so there probably isn't any efficient way to improve the error message. Moreover, as comment4 states, changing security drivers with active domains isn't without consequences. Comment4 also describes what the patch really does.
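To make the point about the error class concrete, here is a minimal Python sketch that parses the monitor reply quoted in the libvirtd log from the description. The helper name `classify_qmp_error` is hypothetical and is not libvirt or QEMU code; it only illustrates that the structured part of the reply says "GenericError", while the actual cause lives solely in the free-text "desc" field.

```python
import json

# Reply copied from the libvirtd debug log in the description.
REPLY = (
    '{"id":"libvirt-72","error":{"class":"GenericError",'
    '"desc":"No file descriptor supplied via SCM_RIGHTS"}}'
)


def classify_qmp_error(reply_text):
    """Return (error_class, description) for an error reply, or None on success.

    Hypothetical helper for illustration only; not part of libvirt or QEMU.
    """
    reply = json.loads(reply_text)
    error = reply.get("error")
    if error is None:
        # No "error" member: the command did not fail.
        return None
    return error.get("class"), error.get("desc")


if __name__ == "__main__":
    print(classify_qmp_error(REPLY))
```

Since the class carries no information about a missing security driver, any more specific message would have to be derived from the description string, which is the string parsing argued against above.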
Hi Erik,
Thanks for your explanation. OK, the error info is fine for me, but I still have a doubt. In comment 4 you said that what the patch in https://bugzilla.redhat.com/show_bug.cgi?id=1183893#c1 does is log an error to the daemon log. In fact, that error was already in the libvirtd log when this bug was reported. So I checked the patch in comment 1 and found that the following code was added:

+    /* if domain requests security driver we haven't loaded, report error, but
+     * do not kill the domain
+     */
+    ignore_value(virSecurityManagerCheckAllLabel(driver->securityManager,
+                                                 obj->def));
+

If I haven't misunderstood it, the checkpoints for verifying this bug with the reproduction steps are that the guest does not shut down or disappear automatically, and that the error above is reported. Is that right? Can you check whether anything additional needs to be done? Thanks.
Created attachment 1077001 [details] 1.2.15 daemon log
Created attachment 1077013 [details] 1.2.17 daemon log
(In reply to zhenfeng wang from comment #7)
> Hi Erik
> Thanks for your explanation, ok, the error info is ok for me, but still have
> a doubt that in your comment4 said that what the patch in
> https://bugzilla.redhat.com/show_bug.cgi?id=1183893#c1 does, is that it logs
> an error to the daemon log, in fact, the error have already been included in
> libvirtd log while reported this bug. So check the patch in comment1, found
> following code added

Okay, so I created 2 attachments containing complete logs from the 1.2.15 and 1.2.17 daemons respectively. To demonstrate the difference the patch proposed above (comment 1) makes, let's have a look at the most important part:

1.2.15:
2015-09-25 08:25:55.790+0000: 1322: debug : virDomainPCIAddressReserveAddr:317 : Reserving PCI slot 0000:00:03.0 (multifunction='off')
2015-09-25 08:25:55.790+0000: 1322: debug : virDomainPCIAddressReserveAddr:317 : Reserving PCI slot 0000:00:01.0 (multifunction='off')
2015-09-25 08:25:55.790+0000: 1322: debug : qemuDomainObjEnterMonitorInternal:1598 : Entering monitor (mon=0x7fa7f8000bd0 vm=0x7fa800125c60 name=f20live2)
2015-09-25 08:25:55.790+0000: 1322: info : virObjectRef:296 : OBJECT_REF: obj=0x7fa7f8000bd0

1.2.17:
2015-09-25 08:46:31.462+0000: 3246: debug : virDomainPCIAddressReserveAddr:327 : Reserving PCI slot 0000:00:03.0 (multifunction='off')
2015-09-25 08:46:31.462+0000: 3246: debug : virDomainPCIAddressReserveAddr:327 : Reserving PCI slot 0000:00:01.0 (multifunction='off')
!!!2015-09-25 08:46:31.462+0000: 3246: error : virSecurityManagerCheckModel:725 : unsupported configuration: Unable to find security driver for model selinux!!!
2015-09-25 08:46:31.462+0000: 3246: debug : qemuDomainObjEnterMonitorInternal:1752 : Entering monitor (mon=0x7f9c10001060 vm=0x7f9c24245fd0 name=f20live2)
2015-09-25 08:46:31.462+0000: 3246: info : virObjectRef:296 : OBJECT_REF: obj=0x7f9c10001060

Notice the part emphasized with '!!!'. When QEMU returns this error:

2015-09-25 08:26:40.079+0000: 1271: debug : virJSONValueFromString:1531 : string={"id": "libvirt-12", "error": {"class": "GenericError", "desc": "No file descriptor supplied via SCM_RIGHTS"}}

one can now search the log for all potential causes, and a failed attempt to load a security driver can imho be a good hint.
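The log search described above could be sketched roughly as follows in Python. This is an illustration only: the function name `find_security_driver_hints` is made up, and the patterns are assumptions derived from the log excerpts quoted in this report, so they may need adjusting for other libvirt versions.

```python
import re

# Patterns derived from the log excerpts quoted in this report (assumptions,
# not a stable log format guaranteed by libvirt).
SECURITY_DRIVER_ERROR = re.compile(
    r"error : virSecurityManagerCheckModel:\d+ : .*"
    r"Unable to find security driver for model (\w+)"
)
GETFD_FAILURE = "No file descriptor supplied via SCM_RIGHTS"


def find_security_driver_hints(log_lines):
    """Return the missing security models if the log also shows the getfd failure.

    Hypothetical helper: an empty list means either no getfd failure was
    logged or no missing security driver was reported.
    """
    missing_models = []
    saw_getfd_failure = False
    for line in log_lines:
        match = SECURITY_DRIVER_ERROR.search(line)
        if match:
            missing_models.append(match.group(1))
        if GETFD_FAILURE in line:
            saw_getfd_failure = True
    return missing_models if saw_getfd_failure else []


SAMPLE_LOG = [
    "2015-09-25 08:46:31.462+0000: 3246: error : virSecurityManagerCheckModel:725 : "
    "unsupported configuration: Unable to find security driver for model selinux",
    "2015-01-19 07:02:25.911+0000: 10778: error : qemuMonitorJSONCheckError:381 : "
    "internal error: unable to execute QEMU command 'getfd': "
    "No file descriptor supplied via SCM_RIGHTS",
]

if __name__ == "__main__":
    print(find_security_driver_hints(SAMPLE_LOG))
```

Feeding the lines of a 1.2.17 daemon log to this function would point at the selinux model; a 1.2.15 log contains no such entry, so the same search returns nothing there.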
Thanks for Erik's detailed explanation, and sorry for misunderstanding it earlier. I can now get the expected error info when a guest is started in selinux mode and security_driver is then switched from selinux to none. The guest also does not disappear or shut off when operations such as managedsave/save/dump are attempted. I also tried another scenario: start a guest with security_driver=none, then switch security_driver from "none" to "selinux"; the guest works well. Based on the results above, I am marking this bug verified with libvirt-1.2.17-11.el7.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2202.html