Bug 1003822 - [vdsm] vdsmd fails to start with "Command '/sbin/service' failed to execute" on host install
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.3.0
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Yaniv Bronhaim
QA Contact: Tareq Alayan
Whiteboard: infra
Keywords: Regression
Depends On:
Blocks:
Reported: 2013-09-03 05:47 EDT by Eyal Edri
Modified: 2016-02-10 14:10 EST

See Also:
Fixed In Version: is15 vdsm-4.12.0-138.gitab256be.el6ev
Doc Type: Bug Fix
Doc Text:
Several regressions were introduced as part of adding a daemonAdapter process that starts VDSM respawn with more management control, including not handling crashes of the adapter, and leaving the lock file on the system. This prevented new VDSM instances from starting, which caused host installation to fail. The code has been rewritten to handle these issues, and now VDSM starts correctly.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-21 11:14:53 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
vdsm,engine,sanlock,ovirt-host-deploy logs (5.11 MB, application/x-bzip)
2013-09-03 05:55 EDT, Eyal Edri


External Trackers
Tracker       ID     Priority  Status  Summary  Last Updated
oVirt gerrit  18804  None      None    None     Never
oVirt gerrit  18806  None      None    None     Never
oVirt gerrit  18809  None      None    None     Never
oVirt gerrit  18810  None      None    None     Never
oVirt gerrit  18875  None      None    None     Never

Description Eyal Edri 2013-09-03 05:47:52 EDT
Description of problem:

After patch http://gerrit.ovirt.org/#/c/15578 was merged, host installation fails
because vdsm fails to start due to a lock file.

2013-09-03 10:26:02,416 - MainThread - plmanagement.error_fetcher - ERROR - Errors fetched from VDC(jenkins-automation-rpm-vm10.eng.lab.tlv.redhat.com): 2013-09-03 10:09:57,862 ERROR [org.ovirt.engine.core.bll.InstallerMessages] (VdsDeploy) Installation 10.35.148.66: Failed to execute stage 'Closing up': Command '/sbin/service' failed to execute
2013-09-03 10:09:57,869 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) Correlation ID: 5feed745, Call Stack: null, Custom Event ID: -1, Message: Failed to install Host cinteg14.ci.lab.tlv.redhat.com. Failed to execute stage 'Closing up': Command '/sbin/service' failed to execute.
2013-09-03 10:09:58,335 ERROR [org.ovirt.engine.core.utils.ssh.SSHDialog] (pool-5-thread-4) SSH error running command root@10.35.148.66:'umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/setup DIALOG/dialect=str:machine DIALOG/customization=bool:True': java.io.IOException: Command returned failure code 1 during SSH session 'root@10.35.148.66'


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Comment 1 Eyal Edri 2013-09-03 05:55:03 EDT
Created attachment 793095
vdsm,engine,sanlock,ovirt-host-deploy logs
Comment 4 Yaniv Bronhaim 2013-09-04 06:28:17 EDT
The regression was introduced as part of adding the daemonAdapter process, which starts vdsm respawn with more management control. We did not handle crashes of this adapter as needed, so the lock file was left on the system. This prevented us from starting a new instance of vdsm.

The crash in daemonAdapter was caused by importing a missing package (python-argparse) that was not declared as a dependency of vdsm.
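
As an illustration of how a missing import like this could be caught before the daemon is launched, here is a minimal sketch in modern Python. It is not the vdsm fix (that was handled through packaging, see the gerrit links below); the helper name missing_modules is hypothetical, and it only checks top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    Hypothetical helper for illustration only; it looks the modules up
    without importing them.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Report undeclared or missing dependencies up front instead of
    # crashing later inside the adapter process.
    missing = missing_modules(["argparse"])
    if missing:
        raise SystemExit("missing required modules: " + ", ".join(missing))
```

A check like this only reports the problem earlier; declaring the package as a dependency is still what prevents it.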

* The import dependency is handled in http://gerrit.ovirt.org/18809 (we still have issues with this package on RHEL).
* The patches http://gerrit.ovirt.org/18806 and http://gerrit.ovirt.org/18804 were missed among all the changes recently introduced in the init scripts.
* For now we have reverted commits f4aab4f, 8ca1e14 and ad15e45 to get back to the previous state.
* Still in progress: fixing the lock (http://gerrit.ovirt.org/18875) and handling daemonAdapter crashes (http://gerrit.ovirt.org/18810).
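
For readers unfamiliar with the stale-lock failure mode described above, the following is a minimal sketch of one common way to avoid it. It is not the code from the gerrit patches. It assumes a Linux host and uses an advisory flock() lock, which the kernel releases when the holding process exits or crashes, so a lock file left on disk does not block the next start. The function name acquire_lock and the lock path in the usage part are hypothetical.

```python
import errno
import fcntl
import os


def acquire_lock(path):
    """Try to take an exclusive, non-blocking advisory lock on `path`.

    Returns an open file descriptor on success, or None if another live
    process currently holds the lock. Hypothetical helper for illustration.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as e:
        os.close(fd)
        if e.errno in (errno.EAGAIN, errno.EACCES):
            return None  # lock is held by a running process
        raise
    # Record our PID for diagnostics; the lock itself is the flock, not
    # the file's existence or contents.
    os.ftruncate(fd, 0)
    os.write(fd, str(os.getpid()).encode())
    return fd


if __name__ == "__main__":
    import tempfile

    lock_path = os.path.join(tempfile.gettempdir(), "example-daemon.lock")
    fd = acquire_lock(lock_path)
    if fd is None:
        raise SystemExit("another instance is already running")
    print("lock acquired; it is released automatically if this process dies")
```

With this approach the file may remain on disk after a crash, but only a live holder of the lock can prevent a new instance from starting.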
Comment 6 Tareq Alayan 2013-10-17 06:58:04 EDT
Verified with vdsm-4.13.0-0.2.beta1.el6ev.x86_64
Comment 7 Charlie 2013-11-27 19:35:35 EST
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.
Comment 8 errata-xmlrpc 2014-01-21 11:14:53 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html
