Bug 1003822

Summary: [vdsm] vdsmd fails to start with "Command '/sbin/service' failed to execute" on host install
Product: Red Hat Enterprise Virtualization Manager
Reporter: Eyal Edri <eedri>
Component: vdsm
Assignee: Yaniv Bronhaim <ybronhei>
Status: CLOSED ERRATA
QA Contact: Tareq Alayan <talayan>
Severity: high
Docs Contact:
Priority: high
Version: 3.3.0
CC: aberezin, bazulay, iheim, lpeer, pstehlik, ybronhei, yeylon
Target Milestone: ---
Keywords: Regression
Target Release: 3.3.0
Hardware: Unspecified
OS: Linux
Whiteboard: infra
Fixed In Version: is15 vdsm-4.12.0-138.gitab256be.el6ev
Doc Type: Bug Fix
Doc Text:
Several regressions were introduced when a daemonAdapter process was added to start the VDSM respawn with more management control. In particular, crashes of the adapter were not handled, and the lock file was left on the system. This prevented new VDSM instances from starting, which caused host installation to fail. The code has been rewritten to handle these issues, and VDSM now starts correctly.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-21 16:14:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description Flags
vdsm,engine,sanlock,ovirt-host-deploy logs none

Description Eyal Edri 2013-09-03 09:47:52 UTC
Description of problem:

After patch http://gerrit.ovirt.org/#/c/15578 was merged, host installation fails
because vdsm cannot start due to a leftover lock file.

2013-09-03 10:26:02,416 - MainThread - plmanagement.error_fetcher - ERROR - Errors fetched from VDC(jenkins-automation-rpm-vm10.eng.lab.tlv.redhat.com): 2013-09-03 10:09:57,862 ERROR [org.ovirt.engine.core.bll.InstallerMessages] (VdsDeploy) Installation 10.35.148.66: Failed to execute stage 'Closing up': Command '/sbin/service' failed to execute
2013-09-03 10:09:57,869 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) Correlation ID: 5feed745, Call Stack: null, Custom Event ID: -1, Message: Failed to install Host cinteg14.ci.lab.tlv.redhat.com. Failed to execute stage 'Closing up': Command '/sbin/service' failed to execute.
2013-09-03 10:09:58,335 ERROR [org.ovirt.engine.core.utils.ssh.SSHDialog] (pool-5-thread-4) SSH error running command root.148.66:'umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/setup DIALOG/dialect=str:machine DIALOG/customization=bool:True': java.io.IOException: Command returned failure code 1 during SSH session 'root.148.66'


Comment 1 Eyal Edri 2013-09-03 09:55:03 UTC
Created attachment 793095
vdsm,engine,sanlock,ovirt-host-deploy logs

Comment 4 Yaniv Bronhaim 2013-09-04 10:28:17 UTC
The regression was introduced as part of adding the daemonAdapter process, which starts the vdsm respawn with more management control. We did not handle crashes of this adapter properly, so the lock file was left on the system. This prevented us from starting a new instance of vdsm.

The crash in daemonAdapter was caused by an import of a package (python-argparse) that was missing because it was not declared as a dependency of vdsm.
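
To illustrate the failure mode only (this is not the daemonAdapter code, and the helper name missing_modules is hypothetical): a start-up wrapper could check that the modules it needs can be located before it takes any lock, so that a missing package produces a clear error instead of a crash. The sketch assumes Python 3 and top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import.

    Hypothetical helper for illustration; it only looks the modules up
    and does not import them.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    absent = missing_modules(["argparse"])
    if absent:
        # Fail early with a readable message, before any lock is taken.
        raise SystemExit("missing required modules: " + ", ".join(absent))
```

Declaring the package as a dependency remains the real fix; a check like this only makes the failure easier to diagnose.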

* The import dependency is handled in http://gerrit.ovirt.org/18809 (there are still issues with this package on RHEL).
* The patches http://gerrit.ovirt.org/18806 and http://gerrit.ovirt.org/18804 were missed among all the changes recently introduced in the init scripts.
* For now, we reverted commits f4aab4f, 8ca1e14 and ad15e45 to get back to the previous state.
* Still in progress: fixing the lock (http://gerrit.ovirt.org/18875) and handling daemonAdapter crashes (http://gerrit.ovirt.org/18810).
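
As a rough sketch of why a leftover lock file blocks later starts, and of one common way to avoid that (not the approach taken in the patches above, which is not shown in this report): an advisory flock() lock is dropped by the kernel when the holding process exits, even after a crash, so a file left on disk does not by itself block the next instance. The names acquire_lock, release_lock and LockHeldError are hypothetical, and the sketch assumes Python 3 on Linux.

```python
import fcntl
import os


class LockHeldError(Exception):
    """Raised when another holder already owns the lock."""


def acquire_lock(path):
    """Take an exclusive, non-blocking advisory lock on path.

    Returns the open file descriptor. The lock is held for as long as
    the descriptor stays open, and the kernel releases it when the
    process exits, including after a crash.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        raise LockHeldError(path)
    return fd


def release_lock(fd):
    """Release the lock and close the descriptor; the file itself may stay."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

With this pattern the existence of the file carries no meaning; only a live holder of the lock prevents a second instance from starting.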

Comment 6 Tareq Alayan 2013-10-17 10:58:04 UTC
Verified with vdsm-4.13.0-0.2.beta1.el6ev.x86_64.

Comment 7 Charlie 2013-11-28 00:35:35 UTC
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 8 errata-xmlrpc 2014-01-21 16:14:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html