Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1078309

Summary: Failed to reconfigure libvirt for VDSM
Product: [Retired] oVirt
Reporter: Martin Sivák <msivak>
Component: ovirt-hosted-engine-setup
Assignee: Sandro Bonazzola <sbonazzo>
Status: CLOSED CURRENTRELEASE
QA Contact: Jiri Belka <jbelka>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.4
CC: bugs, danken, didi, fsimonce, gklein, lveyde, msivak, rbalakri, stirabos, yeylon, ylavi
Target Milestone: ---
Keywords: TestOnly
Target Release: 3.5.2
Hardware: Unspecified
OS: Unspecified
Whiteboard: integration
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-04-29 06:18:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1193058
Attachments:
ovirt-hosted-engine-setup log (flags: none)

Description Martin Sivák 2014-03-19 15:06:55 UTC
Description of problem:

The setup fails with:

[ ERROR ] Failed to execute stage 'Environment setup': Failed to reconfigure libvirt for VDSM

when trying ovirt-hosted-engine-setup on a host that had VDSM running.

Version-Release number of selected component (if applicable):

ovirt-hosted-engine-setup-1.2.0-0.0.master.20140319143545 with my git change to plugins/sanlock only. 

How reproducible:

Always. Kill all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup.

Actual results:

[ ERROR ] Failed to execute stage 'Environment setup': Failed to reconfigure libvirt for VDSM

Expected results:

Setup understands that the environment is already prepared for VDSM and continues.

Comment 1 Sandro Bonazzola 2014-03-19 15:37:02 UTC
Can you please attach vdsm, libvirt and hosted-engine-setup logs?
Thanks

Comment 2 Martin Sivák 2014-03-20 10:42:13 UTC
Created attachment 876773 [details]
ovirt-hosted-engine-setup log

Comment 3 Sandro Bonazzola 2014-04-17 13:06:26 UTC
2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.executeRaw:383 execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:441 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:

Checking configuration status...

SUCCESS: ssl configured to true. No conflicts

2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:446 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
Traceback (most recent call last):
  File "/bin/vdsm-tool", line 153, in <module>
    sys.exit(main())
  File "/bin/vdsm-tool", line 150, in main
    return tool_command[cmd]["command"](*args[1:])
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/configurator.py", line 221, in configure
    service.service_stop(s)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/service.py", line 369, in service_stop
    return _runAlts(_srvStopAlts, srvName)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/service.py", line 350, in _runAlts
    "%s failed" % alt.func_name, out, err)
vdsm.tool.service.ServiceOperationError: ServiceOperationError: _systemctlStop failed
Job for sanlock.service canceled.


Looks like an issue between sanlock and vdsm-tool.
danken, federico, can you help figure out what should be done to avoid the above condition?

Martin can you attach sanlock logs?
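For reference, the failure above is recognizable mechanically from vdsm-tool's stderr. A minimal sketch, assuming one only wants to classify this specific case; the function name and the strings it matches on are taken from the log excerpt above, not from any hosted-engine-setup API:

```python
# Hypothetical helper: classify a failed `vdsm-tool configure --force`
# run by inspecting its stderr, as captured in the log excerpt above.
def sanlock_stop_failed(rc, stderr):
    """Return True when vdsm-tool failed because systemd could not
    stop sanlock (the 'Job for sanlock.service canceled.' case)."""
    if rc == 0:
        return False
    return ("ServiceOperationError" in stderr
            and "sanlock.service" in stderr)

# Tail of the stderr shown in the log above.
stderr = (
    "vdsm.tool.service.ServiceOperationError: "
    "ServiceOperationError: _systemctlStop failed\n"
    "Job for sanlock.service canceled.\n"
)
print(sanlock_stop_failed(1, stderr))  # → True
```

Such a check would let the caller distinguish "sanlock refused to stop" from other configure failures instead of surfacing a generic "Failed to reconfigure libvirt for VDSM".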

Comment 4 Dan Kenigsberg 2014-04-17 13:39:11 UTC
It is intentionally impossible to restart sanlock when it is holding a lock. Does the issue persist if the host is moved to maintenance mode before killing vdsm?

Comment 5 Sandro Bonazzola 2014-04-17 14:52:40 UTC
(In reply to Dan Kenigsberg from comment #4)
> It is intentionally impossible to restart sanlock when it is holding a lock.
> Does the issue persists if the host is moved to maintenance mode before
> killing vdsm?

Since the user is installing hosted-engine here, I assume there's no engine around that could move the host to maintenance.

How can this situation be detected, so that setup can abort or wait for the right moment to restart it?

Comment 6 Dan Kenigsberg 2014-04-17 15:45:32 UTC
hosted-engine should call spmStop and later disconnectStoragePool in order to shut down its storage usage cleanly.

If you cannot do this, but you are sure that nothing on this host uses the protected resource currently held by sanlock or would ever need it, you can follow the script in https://bugzilla.redhat.com/show_bug.cgi?id=1035847#c23 .
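The teardown order Dan describes (stop the SPM first, then disconnect from the pool, and only then stop VDSM) can be sketched as data; a hedged sketch using the classic vdsClient verbs, where the UUID, host ID, and SCSI key arguments are placeholders and the exact argument shapes should be checked against the installed VDSM version:

```python
# Sketch of the clean teardown order: spmStop, then
# disconnectStoragePool, and only afterwards stop vdsmd (which is
# what lets sanlock release its lockspaces).  Argument values are
# placeholders, not taken from this bug's environment.
def teardown_commands(sp_uuid, host_id, scsi_key="scsi_key"):
    return [
        ["vdsClient", "-s", "0", "spmStop", sp_uuid],
        ["vdsClient", "-s", "0", "disconnectStoragePool",
         sp_uuid, str(host_id), scsi_key],
        ["systemctl", "stop", "vdsmd"],
    ]

for cmd in teardown_commands("5849b030-626e-47cb-ad90-3ce782d831b3", 1):
    print(" ".join(cmd))
```

The point of encoding the order explicitly is that reversing the first two steps, or stopping vdsmd first, reproduces exactly the "sanlock still holds a lock" condition from comment 3.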

Comment 7 Sandro Bonazzola 2014-04-18 06:13:11 UTC
(In reply to Dan Kenigsberg from comment #6)
> hosted-engine should call spmStop and later disconnectStoragePool in order
> to shutdown its storage usage cleanly.
> 
> If you cannot do this, but you are sure that nothing on this host uses the
> protected resource currently held by sanlock or would ever need it, you can
> follow the script in https://bugzilla.redhat.com/show_bug.cgi?id=1035847#c23
> .

No storage pool can exist at that stage; if a storage pool is detected, hosted-engine aborts the setup. So sanlock is holding locks without a storage pool around.

Comment 8 Dan Kenigsberg 2014-04-18 07:30:39 UTC
How are you sure that no storage pool exists? The reproduction steps ("Kill all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup") suggest that there has been a storage pool, and that it was never torn down properly.

Anyway, what is your `sanlock client status`? Let's be certain that locks are being held.
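Checking `sanlock client status` can also be scripted. A minimal sketch, assuming sanlock's usual status report layout (lines beginning with "s " list joined lockspaces and lines beginning with "r " list acquired resources; "daemon" and "p " lines describe the daemon and client processes):

```python
# Hedged sketch: decide whether sanlock is holding anything by
# parsing the output of `sanlock client status`.  A host with no
# "s " (lockspace) and no "r " (resource) lines should be safe to
# stop; on a real host one would feed live output, e.g. from
# subprocess.check_output(["sanlock", "client", "status"]).
def sanlock_is_idle(status_output):
    held = [line for line in status_output.splitlines()
            if line.startswith(("s ", "r "))]
    return not held

# Illustrative sample, not output from this bug's host.
sample = ("daemon 0db5f105\n"
          "p -1 helper\n"
          "p -1 listener\n"
          "s hosted-engine:1:/rhev/data-center/mnt/ids:0\n")
print(sanlock_is_idle(sample))  # → False
```

If this returned False on Martin's host, it would confirm that a lockspace from the previous installation was never torn down.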

Comment 9 Martin Sivák 2014-04-18 07:45:22 UTC
Uh... I do not have the setup anymore. There might have been a storage pool from the previous installation.

Is there an official way to clean up a host so when VDSM starts again it knows nothing and behaves as if it was a brand new host (except reinstalling the OS)?

Comment 10 Sandro Bonazzola 2014-04-18 07:47:00 UTC
(In reply to Dan Kenigsberg from comment #8)
> How are you sure that no storage pool exists? The reproduction steps ("Kill
> all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup")
> suggest that there has been a storage pool, and that it was never torn down
> properly.

Mmm, now that I look more closely at the logs, the error occurs at the late_setup stage, while the check on existing storage pools is done by calling getConnectedStoragePoolsList in the customization stage, so you may be right.


> Anyway, what is your `sanlock client status`? Let's be certain that locks
> are being held.

Comment 11 Dan Kenigsberg 2014-04-18 08:43:59 UTC
(In reply to Martin Sivák from comment #9)
> Is there an official way to clean up a host so when VDSM starts again it
> knows nothing and behaves as if it was a brand new host (except reinstalling
> the OS)?

I believe that killing all VMs, stopping the SPM, and disconnecting from the pool is all that's needed - that's what happens when Engine moves a host to maintenance. In any case, a reboot is as good as a re-install.

Comment 12 Sandro Bonazzola 2014-06-11 06:50:59 UTC
This is an automated message:
This bug has been re-targeted from 3.4.2 to 3.5.0 since neither priority nor severity were high or urgent. Please re-target to 3.4.3 if relevant.

Comment 13 Sandro Bonazzola 2015-02-20 13:08:17 UTC
I'm not able to reproduce anymore with 
 ovirt-hosted-engine-setup-1.2.3-0.0.master.20150213134326.gitbd6a4ea.fc20.noarch
 vdsm-4.16.11-11.git39f1c15.fc20.x86_64

Moving to QA for further testing.

Comment 14 Jiri Belka 2015-03-19 19:13:59 UTC
ok, vt14.1

ovirt-hosted-engine-setup-1.2.2-2.el7ev.noarch
vdsm-4.16.12.1-3.el7ev.x86_64

The host was part of rhevm35 as SPM, was moved to maintenance, and then hosted-engine --deploy went OK.

Comment 15 Eyal Edri 2015-04-29 06:18:50 UTC
oVirt 3.5.2 was GA'd; closing current release.