Bug 1275682

Summary: sasldblistusers2 throws exception on vdsmd startup, preventing HE installation from finishing
Product: [oVirt] vdsm
Reporter: Simone Tiraboschi <stirabos>
Component: Tools
Assignee: Yaniv Bronhaim <ybronhei>
Status: CLOSED NOTABUG
QA Contact: Aharon Canan <acanan>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.17.10
CC: alonbl, bugs, msivak, rgolan, sbonazzo, stirabos
Target Milestone: ovirt-3.6.1
Flags: ybronhei: ovirt-3.6.z?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-12 09:57:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: ---
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Simone Tiraboschi 2015-10-27 13:43:18 UTC
Description of problem:
'vdsm-tool configure --force' exits with error code 0 even when it fails,
so host-deploy or any other calling script cannot detect the failure.

[root@bear ~]# vdsm-tool configure  --force

Checking configuration status...

Current revision of multipath.conf detected, preserving
libvirt is already configured for vdsm
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.

Running configure...
Reconfiguration of libvirt is done.

Done configuring modules to VDSM.
[root@bear ~]# echo $?
0




Version-Release number of selected component (if applicable):
 4.17.10-0.el7.centos

How reproducible:
100%

Steps to Reproduce:
1. use an unsupported libvirt conf file
2. call vdsm-tool configure  --force
3. check its exit code
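The reproduction above can be scripted. The sketch below is a hypothetical wrapper (the `run_configure` name is mine, not part of vdsm or host-deploy) that treats a `FAILED:` line in the output as an error even when the exit code is 0, working around the behavior reported here:

```shell
# Hypothetical helper for host-deploy-like scripts: run a command,
# echo its output, and fail if either the exit code is non-zero or
# the output contains a line starting with "FAILED:".
run_configure() {
    out=$("$@" 2>&1)
    rc=$?
    printf '%s\n' "$out"
    if [ "$rc" -ne 0 ] || printf '%s\n' "$out" | grep -q '^FAILED:'; then
        return 1
    fi
    return 0
}
```

Usage: `run_configure vdsm-tool configure --force || echo "configure failed"`. Once the tool reports a correct exit code, the `grep` workaround becomes unnecessary.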

Actual results:
It reports "FAILED: conflicting vdsm and libvirt-qemu tls configuration."
but the exit code is 0.
VDSM then fails to restart:
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: Current revision of multipath.conf detected, preserving
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: libvirt is already configured for vdsm
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: vdsm: Running validate_configuration
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: Error:  Config is not valid. Check conf files
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: FAILED: conflicting vdsm and libvirt-qemu tls configuration.
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: vdsm.conf with ssl=True requires the following changes:
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: qemu.conf: spice_tls=1.
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: Modules libvirt contains invalid configuration
ott 27 12:03:52 bear.usersys.redhat.com vdsmd_init_common.sh[6745]: vdsm: stopped during execute validate_configuration task (task returned with error code 1).
ott 27 12:03:52 bear.usersys.redhat.com systemd[1]: Received SIGCHLD from PID 6745 (vdsmd_init_comm).
ott 27 12:03:52 bear.usersys.redhat.com systemd[1]: Child 6745 (vdsmd_init_comm) died (code=exited, status=1/FAILURE)
ott 27 12:03:52 bear.usersys.redhat.com systemd[1]: Child 6745 belongs to vdsmd.service
ott 27 12:03:52 bear.usersys.redhat.com systemd[1]: vdsmd.service: control process exited, code=exited status=1


Expected results:
It configures successfully, since the --force option was given.
It reports an exit code that reflects the real status.

Additional info:

Comment 1 Red Hat Bugzilla Rules Engine 2015-10-27 16:30:11 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 2 Simone Tiraboschi 2015-11-03 15:59:51 UTC
It happened again here, in a hosted-engine CI job on a fresh VM:
http://jenkins-ci.eng.lab.tlv.redhat.com/job/hosted_engine_3.6_el7_dynamicip_nfs3_install_on_latest/111/artifact/logs/ovirt-hosted-engine-setup-20151103065635-kqqjo7.log


2015-11-03 06:56:42 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:936 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:

Checking configuration status...

multipath requires configuration
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.

Running configure...
Reconfiguration of sanlock is done.
Reconfiguration of sebool is done.
Reconfiguration of multipath is done.
Reconfiguration of passwd is done.
Reconfiguration of certificates is done.
Reconfiguration of libvirt is done.

Done configuring modules to VDSM.

2015-11-03 06:56:42 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:941 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:


2015-11-03 06:56:42 DEBUG otopi.plugins.otopi.services.systemd systemd.state:145 starting service vdsmd
2015-11-03 06:56:42 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:828 execute: ('/bin/systemctl', 'start', 'vdsmd.service'), executable='None', cwd='None', env=None
2015-11-03 06:56:42 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:878 execute-result: ('/bin/systemctl', 'start', 'vdsmd.service'), rc=1
2015-11-03 06:56:42 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:936 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stdout:


2015-11-03 06:56:42 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stderr:
Job for vdsmd.service failed because start of the service was attempted too often. See "systemctl status vdsmd.service" and "journalctl -xe" for details.
To force a start use "systemctl reset-failed vdsmd.service" followed by "systemctl start vdsmd.service" again.

2015-11-03 06:56:42 DEBUG otopi.context context._executeMethod:156 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/system/vdsmenv.py", line 138, in _late_setup
    state=True
  File "/usr/share/otopi/plugins/otopi/services/systemd.py", line 156, in state
    service=name,
RuntimeError: Failed to start service 'vdsmd'

Comment 3 Yaniv Bronhaim 2015-11-04 13:11:36 UTC
It's not related to the tool - the tool exited with 0 because it successfully configured all modules. But vdsm cannot start because the configuration check throws the exception "Nov  3 06:56:33 hehost sasldblistusers2: _sasldb_getkeyhandle has failed" - can you verify whether the file /usr/sbin/sasldblistusers2 exists?
Can you try running "sasldblistusers2 -f vdsm@ovirt" yourself? This checks that vdsm configured a sasl password for libvirt, but the command raises an exception there for some reason that we need to understand.
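The check suggested in this comment can be wrapped in a small script. This is a hypothetical sketch (the `check_sasldb` helper and the `LISTER` override are my own names, and the actual sasldb path vdsm uses is not confirmed here) that verifies the lister binary is available and that it reports at least one user:

```shell
# Hypothetical sanity check: confirm the sasldb lister command exists
# and that listing users from the given database produces output.
# LISTER allows overriding the command (defaults to sasldblistusers2).
check_sasldb() {
    db="$1"
    lister="${LISTER:-sasldblistusers2}"
    if ! command -v "$lister" >/dev/null 2>&1; then
        echo "missing $lister" >&2
        return 1
    fi
    # A healthy database lists at least one user (e.g. vdsm@ovirt).
    "$lister" -f "$db" | grep -q . || { echo "no users listed for $db" >&2; return 1; }
}
```

Run against whatever sasldb file vdsm configured; the empty "listusers failed" result seen in comment 4 would make this return 1.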

Comment 4 Simone Tiraboschi 2015-11-12 09:44:04 UTC
OK, I saw it again:

[root@bear ~]# sasldblistusers2 -f vdsm@ovirt
listusers failed
[root@bear ~]# ls -l /usr/sbin/sasldblistusers2
-rwxr-xr-x. 1 root root 19712 16 lug 15.31 /usr/sbin/sasldblistusers2

Comment 5 Yaniv Bronhaim 2015-11-12 09:57:54 UTC
After checking the host, the issue was determined to be NOTABUG. If such an issue arises again during HE installation, please reopen.