Bug 1003822 - [vdsm] vdsmd fails to start with "Command '/sbin/service' failed to execute" on host install
Summary: [vdsm] vdsmd fails to start with "Command '/sbin/service' failed to execute" ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.3.0
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assignee: Yaniv Bronhaim
QA Contact: Tareq Alayan
URL:
Whiteboard: infra
Depends On:
Blocks:
 
Reported: 2013-09-03 09:47 UTC by Eyal Edri
Modified: 2016-02-10 19:10 UTC (History)

Fixed In Version: is15 vdsm-4.12.0-138.gitab256be.el6ev
Doc Type: Bug Fix
Doc Text:
Several regressions were introduced as part of adding a daemonAdapter process that starts VDSM respawn with more management control, including not handling crashes of the adapter, and leaving the lock file on the system. This prevented new VDSM instances from starting, which caused host installation to fail. The code has been rewritten to handle these issues, and now VDSM starts correctly.
Clone Of:
Environment:
Last Closed: 2014-01-21 16:14:53 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
vdsm,engine,sanlock,ovirt-host-deploy logs (5.11 MB, application/x-bzip)
2013-09-03 09:55 UTC, Eyal Edri


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:0040 0 normal SHIPPED_LIVE vdsm bug fix and enhancement update 2014-01-21 20:26:21 UTC
oVirt gerrit 18804 0 None None None Never
oVirt gerrit 18806 0 None None None Never
oVirt gerrit 18809 0 None None None Never
oVirt gerrit 18810 0 None None None Never
oVirt gerrit 18875 0 None None None Never

Description Eyal Edri 2013-09-03 09:47:52 UTC
Description of problem:

After patch http://gerrit.ovirt.org/#/c/15578 was merged, host installation fails
because vdsm fails to start due to a leftover lock file.

2013-09-03 10:26:02,416 - MainThread - plmanagement.error_fetcher - ERROR - Errors fetched from VDC(jenkins-automation-rpm-vm10.eng.lab.tlv.redhat.com): 2013-09-03 10:09:57,862 ERROR [org.ovirt.engine.core.bll.InstallerMessages] (VdsDeploy) Installation 10.35.148.66: Failed to execute stage 'Closing up': Command '/sbin/service' failed to execute
2013-09-03 10:09:57,869 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) Correlation ID: 5feed745, Call Stack: null, Custom Event ID: -1, Message: Failed to install Host cinteg14.ci.lab.tlv.redhat.com. Failed to execute stage 'Closing up': Command '/sbin/service' failed to execute.
2013-09-03 10:09:58,335 ERROR [org.ovirt.engine.core.utils.ssh.SSHDialog] (pool-5-thread-4) SSH error running command root.148.66:'umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/setup DIALOG/dialect=str:machine DIALOG/customization=bool:True': java.io.IOException: Command returned failure code 1 during SSH session 'root.148.66'


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Eyal Edri 2013-09-03 09:55:03 UTC
Created attachment 793095
vdsm,engine,sanlock,ovirt-host-deploy logs

Comment 4 Yaniv Bronhaim 2013-09-04 10:28:17 UTC
This regression was introduced when we added the daemonAdapter process, which starts vdsm respawn with more management control. We did not handle crashes of this adapter properly, so the lock file was left on the system. This prevented a new instance of vdsm from starting.

The daemonAdapter crash was caused by importing a package (python-argparse) that was missing because it was not declared as a dependency of vdsm.

* The import dependency is handled in http://gerrit.ovirt.org/18809 (there are still issues with this package on RHEL).
* The patches http://gerrit.ovirt.org/18806 and http://gerrit.ovirt.org/18804 were missed as part of the recent changes to the init scripts.
* For now we have reverted commits f4aab4f, 8ca1e14 and ad15e45 to get back to the previous state.
* Still in progress: fixing the lock (http://gerrit.ovirt.org/18875) and handling daemonAdapter crashes (http://gerrit.ovirt.org/18810).
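
For readers unfamiliar with this failure mode, here is a minimal Python sketch of one common way to stop a crashed process from blocking the next start through a leftover lock file. It is not the vdsm fix from the patches above. The function name acquire_lock and the lock path are hypothetical, and the use of fcntl.flock is an assumption chosen for illustration. The kernel drops an advisory flock when the holding process exits, even on a crash, so a lock file left on disk does not by itself prevent a restart.

```python
import fcntl


def acquire_lock(path):
    """Try to take an exclusive, non-blocking advisory lock on `path`.

    Hypothetical helper for illustration only; not part of vdsm.
    Returns the open file object on success, or None if another open
    handle (normally another live process) already holds the lock.
    """
    handle = open(path, "a+")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    # Keep the handle open for as long as the lock is needed; closing
    # it (or the process dying) releases the lock automatically.
    return handle
```

With this approach the existence of the file is not the lock. Only a held flock counts, so a start script can always try acquire_lock and treat a None result as "another instance is running".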

Comment 6 Tareq Alayan 2013-10-17 10:58:04 UTC
Verified with vdsm-4.13.0-0.2.beta1.el6ev.x86_64.

Comment 7 Charlie 2013-11-28 00:35:35 UTC
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata, please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 8 errata-xmlrpc 2014-01-21 16:14:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html

