Description of problem:

The setup fails with:
[ ERROR ] Failed to execute stage 'Environment setup': Failed to reconfigure libvirt for VDSM
when trying ovirt-hosted-engine-setup on a host that had VDSM running.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.2.0-0.0.master.20140319143545 with my git change to plugins/sanlock only.

How reproducible:
Always.

Steps to Reproduce:
Kill all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup.

Actual results:
[ ERROR ] Failed to execute stage 'Environment setup': Failed to reconfigure libvirt for VDSM

Expected results:
Setup understands that the environment is already prepared for VDSM and continues.
Can you please attach vdsm, libvirt and hosted-engine-setup logs? Thanks
Created attachment 876773 [details]
ovirt-hosted-engine-setup log
2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.executeRaw:383 execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:441 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...

SUCCESS: ssl configured to true. No conflicts

2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:446 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
Traceback (most recent call last):
  File "/bin/vdsm-tool", line 153, in <module>
    sys.exit(main())
  File "/bin/vdsm-tool", line 150, in main
    return tool_command[cmd]["command"](*args[1:])
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/configurator.py", line 221, in configure
    service.service_stop(s)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/service.py", line 369, in service_stop
    return _runAlts(_srvStopAlts, srvName)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/service.py", line 350, in _runAlts
    "%s failed" % alt.func_name, out, err)
vdsm.tool.service.ServiceOperationError: ServiceOperationError: _systemctlStop failed
Job for sanlock.service canceled.

This looks like an issue between sanlock and vdsm-tool. danken, federico, can you help figure out what should be done to avoid the above condition? Martin, can you attach the sanlock logs?
It is intentionally impossible to restart sanlock when it is holding a lock. Does the issue persist if the host is moved to maintenance mode before killing vdsm?
(In reply to Dan Kenigsberg from comment #4)
> It is intentionally impossible to restart sanlock when it is holding a lock.
> Does the issue persist if the host is moved to maintenance mode before
> killing vdsm?

Since the user is installing hosted-engine here, I assume there is no engine around that could move the host to maintenance. How can this situation be detected, so that setup can abort or wait for the right moment to restart it?
hosted-engine should call spmStop and later disconnectStoragePool in order to shut down its storage usage cleanly.

If you cannot do this, but you are sure that nothing on this host uses the protected resource currently held by sanlock or would ever need it, you can follow the script in https://bugzilla.redhat.com/show_bug.cgi?id=1035847#c23 .
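For illustration, a minimal sketch of that clean-shutdown sequence, assuming the vdsm.vdscli xmlrpc client shipped with vdsm 4.x. The pool UUID, host ID and scsiKey values below are placeholders, not taken from this bug, and the exact argument lists should be verified against the installed VDSM API:

# Sketch only: release the SPM role and disconnect from the pool so that
# sanlock no longer holds leases on this host before stopping vdsmd.
# Assumes the vdsm.vdscli xmlrpc client from vdsm 4.x.
from vdsm import vdscli

POOL_UUID = '00000000-0000-0000-0000-000000000000'  # hypothetical pool UUID
HOST_ID = 1                                          # hypothetical host ID
SCSI_KEY = POOL_UUID                                 # legacy parameter, commonly the pool UUID

server = vdscli.connect()  # connect to the local vdsmd

# Give up the SPM role so sanlock can release the SPM lease.
res = server.spmStop(POOL_UUID)
print(res['status'])

# Then disconnect from the pool so no leases remain held on this host.
res = server.disconnectStoragePool(POOL_UUID, HOST_ID, SCSI_KEY)
print(res['status'])

If sanlock still reports held resources after such a sequence, the manual script referenced above may be needed instead.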
(In reply to Dan Kenigsberg from comment #6)
> hosted-engine should call spmStop and later disconnectStoragePool in order
> to shut down its storage usage cleanly.
>
> If you cannot do this, but you are sure that nothing on this host uses the
> protected resource currently held by sanlock or would ever need it, you can
> follow the script in https://bugzilla.redhat.com/show_bug.cgi?id=1035847#c23 .

No storage pool can exist at that stage; if a storage pool is detected, hosted-engine aborts the setup. So sanlock is holding locks without a storage pool around.
How are you sure that no storage pool exists? The reproduction steps ("Kill all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup") suggest that there has been a storage pool, and that it was never torn down properly.

Anyway, what is your `sanlock client status`? Let's be certain that locks are actually being held.
Uh.. I do not have the setup anymore. There might have been a storage pool from the previous installation. Is there an official way to clean up a host so when VDSM starts again it knows nothing and behaves as if it was a brand new host (except reinstalling the OS)?
(In reply to Dan Kenigsberg from comment #8)
> How are you sure that no storage pool exists? The reproduction steps ("Kill
> all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup")
> suggest that there has been a storage pool, and that it was never torn down
> properly.

Mmm, now that I look more closely at the logs, the error occurs at the late_setup stage, while the check for existing storage pools is done by calling getConnectedStoragePoolsList at the customization stage, so you may be right.

> Anyway, what is your `sanlock client status`? Let's be certain that locks
> are being held.
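For reference, a small sketch of the kind of check the customization stage performs, again assuming the vdsm.vdscli client; the response key ('poollist') is from memory of the vdsm 4.x xmlrpc API and the actual hosted-engine-setup code may differ:

# Sketch: ask the local vdsmd which storage pools it is still connected to,
# similar in spirit to the getConnectedStoragePoolsList check mentioned above.
from vdsm import vdscli

server = vdscli.connect()
res = server.getConnectedStoragePoolsList()
if res['status']['code'] != 0:
    raise RuntimeError(res['status']['message'])
pools = res.get('poollist', [])
if pools:
    print('Connected storage pools still present: %s' % pools)
else:
    print('No connected storage pools')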
(In reply to Martin Sivák from comment #9)
> Is there an official way to clean up a host so when VDSM starts again it
> knows nothing and behaves as if it was a brand new host (except reinstalling
> the OS)?

I believe that killing all VMs, stopping the SPM and disconnecting from the pool is all that's needed - that's what happens when Engine moves a host to maintenance. In any case, a reboot is as good as a re-install.
This is an automated message: This bug has been re-targeted from 3.4.2 to 3.5.0 since neither priority nor severity were high or urgent. Please re-target to 3.4.3 if relevant.
I'm not able to reproduce anymore with:
ovirt-hosted-engine-setup-1.2.3-0.0.master.20150213134326.gitbd6a4ea.fc20.noarch
vdsm-4.16.11-11.git39f1c15.fc20.x86_64

Moving to QA for further testing.
ok, vt14.1:
ovirt-hosted-engine-setup-1.2.2-2.el7ev.noarch
vdsm-4.16.12.1-3.el7ev.x86_64

The host was part of rhevm35 as SPM, was moved to maintenance, and then hosted-engine --deploy went ok.
oVirt 3.5.2 was GA'd. Closing current release.