Bug 1291731
Summary: | HE agent failed to start on RHEV-H after upgrade from 3.5 to 3.6 | |
---|---|---|---
Product: | [oVirt] ovirt-hosted-engine-ha | Reporter: | Artyom <alukiano>
Component: | Agent | Assignee: | Martin Sivák <msivak>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Artyom <alukiano>
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | 1.3.3.3 | CC: | bugs, cshao, dfediuck, fdeutsch, gklein, huiwa, huzhao, mavital, sbonazzo, ycui, ylavi
Target Milestone: | ovirt-3.6.1 | Keywords: | Triaged
Target Release: | 1.3.3.4 | Flags: | rule-engine: ovirt-3.6.z+, rule-engine: blocker+, ylavi: planning_ack+, dfediuck: devel_ack+, mavital: testing_ack+
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | sla | |
Fixed In Version: | ovirt-hosted-engine-ha-1.3.3.4 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2016-01-13 14:38:32 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1280310, 1284954, 1285700 | |
Attachments: | | |

Looking into the logs, this seems to be a RHEV-H-only issue:

    ...
    MainThread::DEBUG::2015-12-15 14:20:42,327::upgrade::88::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_execute) executing: 'sudo -n unpersist /etc/ovirt-hosted-engine/hosted-engine.conf'
    MainThread::DEBUG::2015-12-15 14:20:42,373::upgrade::97::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_execute) rc: 1
    ...

Verified.

Upgrade from 3.5 - Red Hat Enterprise Virtualization Hypervisor release 7.1 (20151015.0.el7ev)
==============================================================================================

    ovirt-hosted-engine-ha-1.2.7.2-1.el7ev.noarch
    ovirt-hosted-engine-setup-1.2.6.1-1.el7ev.noarch
    vdsm-4.16.27-1.el7ev.x86_64

to 3.6 - Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151221.1.el7ev)
==============================================================================================

    ovirt-hosted-engine-setup-1.3.1.3-1.el7ev.noarch
    ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch
    vdsm-4.17.13-1.el7ev.noarch

1) Install RHEV-H 3.5
2) Deploy hosted-engine on one host with NFS storage
3) Enable global maintenance (on the host: hosted-engine --set-maintenance --mode=global)
4) Upgrade the engine to 3.6
5) Power off the engine VM (hosted-engine --vm-poweroff)
6) Upgrade the host to RHEV-H 3.6 via USB key

The upgrade succeeded, and after the host upgrade ovirt-ha-agent and ovirt-ha-broker are up.

oVirt 3.6.1 has been released, closing current release.

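When triaging a similar report, it can help to pull every failing command out of the agent log instead of scanning by eye. The following is a minimal sketch, not part of ovirt-hosted-engine-ha: the line layout (`thread::LEVEL::timestamp::module::line::logger::(function) message`) and the `executing:` / `rc:` message prefixes are assumptions inferred only from the two log lines quoted above, and `failed_commands` is a hypothetical helper name.

```python
import re

# Assumed layout, inferred from the two agent log lines quoted in this bug:
#   thread::LEVEL::timestamp::module::lineno::logger::(function) message
LOG_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)"
    r"::(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r"::(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)"
    r"::\((?P<func>[^)]+)\) (?P<msg>.*)$"
)


def failed_commands(lines):
    """Pair each 'executing:' message with the next 'rc:' message.

    Returns a list of (timestamp, command, rc) tuples for non-zero
    return codes. Lines that do not match the assumed layout are skipped.
    """
    failures = []
    pending = None  # last command seen without a matching rc yet
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match is None:
            continue
        msg = match.group("msg")
        if msg.startswith("executing: "):
            pending = msg[len("executing: "):].strip("'")
        elif msg.startswith("rc: ") and pending is not None:
            rc = int(msg[len("rc: "):])
            if rc != 0:
                failures.append((match.group("ts"), pending, rc))
            pending = None
    return failures


if __name__ == "__main__":
    sample = [
        "MainThread::DEBUG::2015-12-15 14:20:42,327::upgrade::88::"
        "ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_execute) "
        "executing: 'sudo -n unpersist /etc/ovirt-hosted-engine/hosted-engine.conf'",
        "MainThread::DEBUG::2015-12-15 14:20:42,373::upgrade::97::"
        "ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_execute) rc: 1",
    ]
    for ts, command, rc in failed_commands(sample):
        print(ts, rc, command)
```

Run against the two quoted lines, this reports the `sudo -n unpersist` call with return code 1.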
Created attachment 1106037: sosreport

Description of problem:
In a one-host HE environment, ha-agent failed to start after the upgrade from 3.5 to 3.6.

Version-Release number of selected component (if applicable):

3.5 - Red Hat Enterprise Virtualization Hypervisor release 7.1 (20151015.0.el7ev)
=================================

    vdsm-4.16.27-1.el7ev.x86_64
    ovirt-hosted-engine-setup-1.2.6.1-1.el7ev.noarch
    ovirt-hosted-engine-ha-1.2.7.2-1.el7ev.noarch

3.6 - Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151210.1.el7ev)
=================================

    vdsm-4.17.13-1.el7ev.noarch
    ovirt-hosted-engine-setup-1.3.1.2-1.el7ev.noarch
    ovirt-hosted-engine-ha-1.3.3.3-1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy a 3.5 HE environment on one host
2. Put the host into global maintenance and stop the engine VM (hosted-engine --set-maintenance --mode=global && hosted-engine --vm-poweroff)
3. Reboot the host and upgrade it to 3.6
4. Check the hosted-engine services

Actual results:
ovirt-ha-agent starts, but after some time it stops because of:

    MainThread::ERROR::2015-12-15 13:28:16,662::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Error executing: 1 - stdout: - stderr:sudo: a password is required ' - trying to restart agent

Expected results:
The services are up and hosted-engine passes the upgrade without any errors.

Additional info:
I also saw errors in the vdsm log:

    Traceback (most recent call last):
      File "/usr/share/vdsm/storage/task.py", line 873, in _run
        return fn(*args, **kargs)
      File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
        res = f(*args, **kwargs)
      File "/usr/share/vdsm/storage/hsm.py", line 2201, in getAllTasksStatuses
        raise se.SpmStatusError()
    SpmStatusError: Not SPM: ()

    Traceback (most recent call last):
      File "/usr/share/vdsm/storage/task.py", line 873, in _run
        return fn(*args, **kargs)
      File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
        res = f(*args, **kwargs)
      File "/usr/share/vdsm/storage/hsm.py", line 3203, in prepareImage
        self.getPool(spUUID)
      File "/usr/share/vdsm/storage/hsm.py", line 314, in getPool
        raise se.StoragePoolUnknown(spUUID)
    StoragePoolUnknown: Unknown pool id, pool not connected: ('330fc772-0e67-4b2c-80a0-ba76f9b8f348',)
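
The agent error above has the shape of a wrapper that runs an external command and raises when the return code is non-zero. With `sudo -n` (non-interactive), sudo does not prompt; if a password would be needed it prints an error and exits with a failure code, which matches the `rc: 1` and `sudo: a password is required` seen in this report. The sketch below only illustrates that pattern and is not the ovirt-hosted-engine-ha implementation: `run_checked` is a hypothetical helper, the message format merely imitates the one quoted in the log, and `FAKE_SUDO` is a stand-in command that simulates the failure so the example runs without sudo.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run cmd without a shell and return its stdout.

    Raises RuntimeError when the command exits non-zero. The message
    format imitates the agent log line in this report; it is an
    assumption, not the real ovirt-hosted-engine-ha code.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(
            "Error executing: %d - stdout: %s - stderr: %s"
            % (proc.returncode, proc.stdout, proc.stderr)
        )
    return proc.stdout


# Stand-in for "sudo -n <command>" on a host where no passwordless
# sudo rule applies: write sudo's error text to stderr and exit with 1.
FAKE_SUDO = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('sudo: a password is required\\n'); sys.exit(1)",
]

if __name__ == "__main__":
    try:
        run_checked(FAKE_SUDO)
    except RuntimeError as exc:
        # A caller such as a service main loop could log this and retry,
        # which is what "trying to restart agent" in the log suggests.
        print("command failed:", exc)
```

On a real host the same check can be made by hand by running the logged command with `sudo -n` as the user the agent runs as and inspecting the exit code.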