Description of problem:
Hosted-engine --deploy fails, reporting "Failed to execute stage 'Environment setup': Command '/bin/systemctl' failed to execute".

Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.0-20140807.0.iso
ovirt-node-3.1.0-0.6.20140731git2c8e71f.el7.noarch
ovirt-node-plugin-hosted-engine-0.1.0-0.0.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install rhev-hypervisor7-7.0-20140807.0.iso
2. Run #hosted-engine --deploy in shell
3. Run #ovirt-hosted-engine-setup in shell

Actual results:
1. After step 2, hosted-engine --deploy fails with the "Failed to execute stage 'Environment setup': Command '/bin/systemctl' failed to execute" error.
2. After step 3, it reports the same error as step 2.

Expected results:
1. Hosted-engine --deploy succeeds.

Additional info:
Adding keyword "test_blocker" because this bug blocks our testing of the hosted-engine feature.
Hey Hui, Could you help to provide the ovirt-hosted-engine-setup version here? Thanks Ying
Providing more version info:
ovirt-hosted-engine-ha-1.2.1-0.2.ovirtbeta2.el7.noarch
ovirt-hosted-engine-setup-1.2.0-0.1.ovirtbeta2.el7.noarch
Fabian, it looks like hosted engine and node don't work too well together... wanghui, we need the hosted-engine, vdsm, supervdsm, sanlock and libvirt logs covering the time interval of the error you got. Can you attach them?
Can you reproduce on clean CentOS / RHEL 7 environment (not node)?
(In reply to Sandro Bonazzola from comment #3) > Fabian looks like hosted engine and node doesn't work too well together... Any idea what might be going wrong?
No, waiting for logs in order to try to figure out.
Created attachment 925612 [details] Provide more log files according to comment#3
Weird, you have vdsm sources in the vdsm log directory...

BTW, the setup log shows:

2014-08-11 05:07:03 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 142, in _executeMethod
  File "/usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/system/vdsmenv.py", line 155, in _late_setup
  File "/usr/share/otopi/plugins/otopi/services/systemd.py", line 138, in state
  File "/usr/share/otopi/plugins/otopi/services/systemd.py", line 77, in _executeServiceCommand
  File "/usr/lib/python2.7/site-packages/otopi/plugin.py", line 871, in execute
RuntimeError: Command '/bin/systemctl' failed to execute

vdsm.log is empty. supervdsm.log only shows:

MainThread::DEBUG::2014-08-11 05:07:03,872::netconfpersistence::134::root::(_getConfigs) Non-existing config set.

libvirt logs look ok.

Fabian, is it possible that on rhev-hypervisor7, vdsm.log is not writable? This may cause vdsmd to fail to start.
(In reply to Sandro Bonazzola from comment #8)
…
> Fabian, is it possible that on rhev-hypervisor7, vdsm.log is not writable?
> This may cause vdsmd to fail start.

After booting a RHEVH7 image I see that /var/log/ is mounted and writable.

But this could be a race. Systemd boots services in parallel if no dependency is set. That can lead to a situation where vdsm is started before /var/log is writable, and thus can't write its logfiles and subsequently fails to start.

Could you provide the output of

$ systemd-analyze plot > boot.svg

Any service which is persisting or relying on Node-specific features should have a dependency (Requires= and After=) on ovirt-early.service.

Would it help if Node introduced a node-ready.target for easier consumption?
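For reference, the dependency described above could be expressed as a systemd drop-in. This is only a sketch: the drop-in path and file name are illustrative, and it assumes vdsmd.service is the unit that needs /var/log to be ready.

```ini
# /etc/systemd/system/vdsmd.service.d/10-node.conf (illustrative path)
[Unit]
# Do not start vdsmd until Node's early setup (which makes /var/log
# writable) has completed.
Requires=ovirt-early.service
After=ovirt-early.service
```

A `systemctl daemon-reload` would be needed for systemd to pick up the drop-in.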
Moving needinfo to reporter.
Fabian, hosted-engine --deploy is called once the system is up. So it shouldn't be a race condition on systemd.
(In reply to Sandro Bonazzola from comment #11) > Fabian, hosted-engine --deploy is called once the system is up. > So it shouldn't be a race condition on systemd. You are right. I also wonder about this snippet now: > MainThread::DEBUG::2014-08-11 05:07:03,872::netconfpersistence::134::root::(_getConfigs) Non-existing config set. Toni, can you maybe say something about this log snippet?
Hey Leonid, As you are QA owner of this component, could you please help to reproduce this issue on RHEL 7 environment, and provide more useful info and status for this bug. Thanks.
Created attachment 927752 [details] the output of #systemd-analyze plot > boot.svg according to comment#9
Setting the needinfo flags back; they were removed by mistake.
The log message might be because of a bug that was clearing the network upgrade file. That should not happen with the latest 3.5. If the message still shows up, please let me know.
wanghui, we're going to rebuild downstream packages for QE next week; please try to reproduce with the new packages once they are available.
(In reply to Sandro Bonazzola from comment #18)
> wanghui we're going to rebuild downstream packages for QE next week, please
> try to reproduce with the new packages once they'll be available.

Checking brew, ovirt-hosted-engine-setup-1.2.0-0.2.master.el7 is the latest in brewweb, so we need a rebuilt downstream rhevh 7.0 for the rhev 3.5 build to reproduce this bug again.
https://brewweb.devel.redhat.com/buildinfo?buildID=379706
As per comments 18 and 19, we got the new rhevh build today, rhev-hypervisor7-7.0-20140827.0.iso, but we encountered a new test-blocker bug 1134873 and need to check further.
(In reply to Sandro Bonazzola from comment #18)
> wanghui we're going to rebuild downstream packages for QE next week, please
> try to reproduce with the new packages once they'll be available.

Hi Sandro,

I have tested rhev-hypervisor7-7.0-20140827.0.iso and the issue in this bug is fixed. But I encountered another bug, 1134873, as comment #20 said. Please help to check it.

Thanks
Hui Wang
Moving to ON_QA as per comment #21. Will follow up bug #1134873 in its own bug report.
vt2.2 is the first downstream build but doesn't contain rhevh7. Please move to ON_QA when the official rhevh build is done, thanks.
Fabian, wasn't rhevh7 built for vt2.2?
Official build in brew: https://brewweb.devel.redhat.com/buildinfo?buildID=381633

Installed rhevh with enforcing=0 on the cmdline to work around some SELinux issues. The issue in the description is gone.

Test version:
rhev-hypervisor7-7.0-20140904.0.iso
ovirt-node-3.1.0-0.10.20140904gitb828c37.el7.noarch
ovirt-hosted-engine-setup-1.2.0-0.2.master.el7.noarch
ovirt-hosted-engine-ha-1.2.1-0.3.master.el7.noarch
ovirt-node-plugin-hosted-engine-0.1.0-0.0.x86_64
ovirt-host-deploy-1.3.0-0.0.1.master.el7.noarch
vt2.2 still doesn't have hypervisor.
Fabian, please move back to QA once a new rhev-h iso is available.
Tested version:
rhev-hypervisor7-7.0-20140926.0.iso
ovirt-node-3.1.0-0.17.20140925git29c3403.el7.noarch
ovirt-host-deploy-1.3.0-0.0.4.master.el7.noarch
ovirt-host-deploy-offline-1.3.0-0.0.2.master.el7.x86_64
ovirt-hosted-engine-setup-1.2.0-1.el7.noarch
ovirt-hosted-engine-ha-1.2.1-1.el7.noarch

Test steps:
1. Clean install rhev-hypervisor7-7.0-20140926.0.iso
2. Configure network with ipv4 dhcp mode
3. Run #hosted-engine --deploy in shell

Test result:
1. After step 3, it reports "Failed to execute stage 'Environment setup': Command '/bin/systemctl' failed to execute".

So this issue is not fixed in rhev-hypervisor7-7.0-20140926.0.iso. Changing the status from ON_QA to Assigned.

Thanks,
Hui Wang
Hui Wang, could you please try to reproduce this with enforcing=0?
(In reply to Fabian Deutsch from comment #32)
> Hui Wang, could you please try to reproduce this with enforcing=0?

Hi Fabian,

It works with enforcing=0; the issue no longer occurs.

Thanks
Hui Wang
Hui Wang, could you please attach the log /var/log/audit.log?
Created attachment 942237 [details]
audit.log as comment #34 asked

Providing audit.log as comment #34 asked, captured with enforcing=0.
From the provided audit.log, it seems that SELinux denies sanlock access to open its log file, and this could be enough to prevent it from starting, producing this bug.
Moving this to ovirt-node. It seems that all files in /var/log are labeled auditd_log_t; this is causing this denial, and others, and prevents hosted engine from functioning correctly.
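For anyone triaging similar reports: the mislabeling described above shows up in audit.log as AVC denial records. As a sketch, a few lines of Python can pull the relevant fields out of such a record. The sample line below is constructed for illustration (a sanlock write denied by the auditd_log_t label, as in this bug) and is not copied from the attached log.

```python
import re

# Illustrative AVC record of the kind described in this bug: sanlock is
# denied write access because the target file carries the auditd_log_t label.
# The timestamp, pid, and serial number are made up.
sample = (
    'type=AVC msg=audit(1412345678.901:123): avc:  denied  { write } '
    'for  pid=1234 comm="sanlock" name="sanlock.log" '
    'scontext=system_u:system_r:sanlock_t:s0 '
    'tcontext=system_u:object_r:auditd_log_t:s0 tclass=file'
)

def parse_avc(line):
    """Extract the interesting fields from an SELinux AVC denial line."""
    # key=value pairs, with values either quoted strings or bare tokens
    fields = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    return {
        'action': re.search(r'denied\s+\{ ([^}]+) \}', line).group(1).strip(),
        'comm': fields.get('comm', '').strip('"'),
        'target': fields.get('name', '').strip('"'),
        'tcontext': fields.get('tcontext', ''),
    }

info = parse_avc(sample)
print(info['comm'], info['action'], info['target'], info['tcontext'])
```

On a real system, `ausearch -m avc` is the usual way to pull these records out of the audit log in the first place.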
Tested version:
rhev-hypervisor7-7.0-20141212.0.iso
ovirt-node-3.1.0-0.34.20141210git0c9c493.el7.noarch

Test steps:
1. Clean install rhev-hypervisor7-7.0-20141212.0.iso
2. Configure network with ipv4 dhcp mode
3. Run #hosted-engine --deploy in shell

Test result:
1. After step 3, it deploys the hosted engine without error.

So this issue is fixed in rhev-hypervisor7-7.0-20141212.0.iso. Changing the status from ON_QA to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0161.html