Bug 1078309
| Summary: | Failed to reconfigure libvirt for VDSM | | |
|---|---|---|---|
| Product: | [Retired] oVirt | Reporter: | Martin Sivák <msivak> |
| Component: | ovirt-hosted-engine-setup | Assignee: | Sandro Bonazzola <sbonazzo> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Jiri Belka <jbelka> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.4 | CC: | bugs, danken, didi, fsimonce, gklein, lveyde, msivak, rbalakri, stirabos, yeylon, ylavi |
| Target Milestone: | --- | Keywords: | TestOnly |
| Target Release: | 3.5.2 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | integration | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-04-29 06:18:50 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1193058 | | |
| Attachments: | ovirt-hosted-engine-setup log (attachment 876773) | | |
Description
Martin Sivák, 2014-03-19 15:06:55 UTC

Can you please attach vdsm, libvirt and hosted-engine-setup logs? Thanks

Created attachment 876773 [details]: ovirt-hosted-engine-setup log
```
2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.executeRaw:383 execute-result: ('/bin/vdsm-tool', 'configure', '--force'), rc=1
2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:441 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...
SUCCESS: ssl configured to true. No conflicts
2014-03-19 16:23:17 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv plugin.execute:446 execute-output: ('/bin/vdsm-tool', 'configure', '--force') stderr:
Traceback (most recent call last):
  File "/bin/vdsm-tool", line 153, in <module>
    sys.exit(main())
  File "/bin/vdsm-tool", line 150, in main
    return tool_command[cmd]["command"](*args[1:])
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/configurator.py", line 221, in configure
    service.service_stop(s)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/service.py", line 369, in service_stop
    return _runAlts(_srvStopAlts, srvName)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/service.py", line 350, in _runAlts
    "%s failed" % alt.func_name, out, err)
vdsm.tool.service.ServiceOperationError: ServiceOperationError: _systemctlStop failed
Job for sanlock.service canceled.
```
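For readers unfamiliar with the vdsm-tool internals, the failure above boils down to `systemctl stop sanlock.service` exiting non-zero ("Job for sanlock.service canceled."), which vdsm's service helper turns into a ServiceOperationError. The following is a minimal illustrative sketch of that wrapper pattern only, not the actual vdsm.tool.service code:

```python
# Illustrative sketch only -- not the real vdsm.tool.service module.
# It shows how a failed `systemctl stop` (for example, when the stop job is
# canceled because sanlock still holds a lock, as discussed below) can be
# surfaced as a ServiceOperationError like the one in the traceback above.
import subprocess


class ServiceOperationError(Exception):
    pass


def systemctl_stop(unit):
    proc = subprocess.Popen(['systemctl', 'stop', unit],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    out, err = proc.communicate()
    if proc.returncode != 0:
        # err may contain something like "Job for sanlock.service canceled."
        raise ServiceOperationError('stopping %s failed: %s %s' % (unit, out, err))


if __name__ == '__main__':
    systemctl_stop('sanlock.service')
```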
This looks like an issue between sanlock and vdsm-tool.
danken, federico, can you help figure out what should be done to avoid the above condition?
Martin, can you attach the sanlock logs?
It is intentionally impossible to restart sanlock when it is holding a lock. Does the issue persist if the host is moved to maintenance mode before killing vdsm?

(In reply to Dan Kenigsberg from comment #4)
> It is intentionally impossible to restart sanlock when it is holding a lock.
> Does the issue persist if the host is moved to maintenance mode before
> killing vdsm?

Since the user is installing hosted-engine here, I assume there is no engine around that could move the host to maintenance. How can this situation be detected, so that we can abort or wait for the right moment to restart it?

hosted-engine should call spmStop and later disconnectStoragePool in order to shut down its storage usage cleanly.

If you cannot do this, but you are sure that nothing on this host uses the protected resource currently held by sanlock or would ever need it, you can follow the script in https://bugzilla.redhat.com/show_bug.cgi?id=1035847#c23 .

(In reply to Dan Kenigsberg from comment #6)
> hosted-engine should call spmStop and later disconnectStoragePool in order
> to shut down its storage usage cleanly.
>
> If you cannot do this, but you are sure that nothing on this host uses the
> protected resource currently held by sanlock or would ever need it, you can
> follow the script in https://bugzilla.redhat.com/show_bug.cgi?id=1035847#c23 .

No storage pool can exist at that stage; if a storage pool is detected, hosted-engine aborts the setup. So sanlock is holding locks without a storage pool around.

How are you sure that no storage pool exists? The reproduction steps ("Kill all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup") suggest that there has been a storage pool, and that it was never torn down properly.
Anyway, what is your `sanlock client status`? Let's be certain that locks are being held.
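Since the discussion hinges on whether sanlock is actually holding anything, a small check like the hedged sketch below could be run before retrying `vdsm-tool configure --force`. The parsing is an assumption, not taken from this bug: it treats `sanlock client status` lines starting with `s ` as lockspaces and `r ` as resources, which should be verified against the sanlock version in use.

```python
# Hedged sketch: decide whether it looks safe to retry
# `vdsm-tool configure --force` by checking if sanlock currently reports
# any held lockspaces/resources.
# Assumption: in `sanlock client status` output, lockspace lines start with
# "s " and resource lines with "r " -- adjust if your sanlock prints otherwise.
import subprocess


def sanlock_holds_locks():
    out = subprocess.check_output(['sanlock', 'client', 'status'],
                                  universal_newlines=True)
    return any(line.startswith(('s ', 'r ')) for line in out.splitlines())


if __name__ == '__main__':
    if sanlock_holds_locks():
        print('sanlock still holds locks; tear down the storage pool first')
    else:
        subprocess.check_call(['vdsm-tool', 'configure', '--force'])
```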
Uh... I do not have the setup anymore. There might have been a storage pool from the previous installation. Is there an official way to clean up a host so that when VDSM starts again it knows nothing and behaves as if it were a brand new host (except for reinstalling the OS)?

(In reply to Dan Kenigsberg from comment #8)
> How are you sure that no storage pool exists? The reproduction steps ("Kill
> all existing VMs, stop VDSM and try to run ovirt-hosted-engine-setup")
> suggest that there has been a storage pool, and that it was never torn down
> properly.

Mmm, now that I look more closely at the logs, the error occurs at the late_setup stage, while the check for existing storage pools is done by calling getConnectedStoragePoolsList in the customization stage, so you may be right.

> Anyway, what is your `sanlock client status`? Let's be certain that locks
> are being held.

(In reply to Martin Sivák from comment #9)
> Is there an official way to clean up a host so that when VDSM starts again
> it knows nothing and behaves as if it were a brand new host (except for
> reinstalling the OS)?

I believe that killing all VMs, stopping SPM and disconnecting from the pool is all that's needed - that's what happens when the Engine moves a host to maintenance (a rough sketch of that sequence is included at the end of this report). In any case, a reboot is as good as a reinstall.

This is an automated message: this bug has been re-targeted from 3.4.2 to 3.5.0 since neither priority nor severity were high or urgent. Please re-target to 3.4.3 if relevant.

I'm not able to reproduce this anymore with ovirt-hosted-engine-setup-1.2.3-0.0.master.20150213134326.gitbd6a4ea.fc20.noarch and vdsm-4.16.11-11.git39f1c15.fc20.x86_64. Moving to QA for further testing.

ok, vt14.1, ovirt-hosted-engine-setup-1.2.2-2.el7ev.noarch, vdsm-4.16.12.1-3.el7ev.x86_64. The host was part of rhevm35 as SPM, it was moved to maintenance, and then hosted-engine --deploy went ok.

oVirt 3.5.2 was GA'd. Closing as current release.
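For completeness, here is a rough sketch of the manual "move to maintenance" cleanup discussed in the comments (stop SPM, disconnect the storage pool) before re-running the setup, driven through the vdsClient CLI that shipped with vdsm at the time. The verbs (getConnectedStoragePoolsList, spmStop, disconnectStoragePool) are the ones named above; the `-s 0` flag, the argument order, the output parsing, and the hostID/scsi-key values are assumptions to check against `vdsClient --help` on the host.

```python
# Rough sketch, under stated assumptions, of the cleanup sequence described
# in the comments: stop SPM and disconnect any connected storage pool before
# re-running ovirt-hosted-engine-setup. Argument order and the hostID /
# scsi-key values are placeholders, not values taken from this bug report.
import subprocess


def vds(*args):
    cmd = ['vdsClient', '-s', '0'] + list(args)  # -s 0: assumes SSL, localhost
    print('+ ' + ' '.join(cmd))
    return subprocess.check_output(cmd, universal_newlines=True)


# Assumes the verb prints one pool UUID per line when pools are connected.
pools = vds('getConnectedStoragePoolsList').split()
for sp_uuid in pools:
    vds('spmStop', sp_uuid)
    # hostID=1 and scsi-key=sp_uuid are illustrative placeholders.
    vds('disconnectStoragePool', sp_uuid, '1', sp_uuid)
```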