Bug 1291731

Summary: HE agent failed to start on RHEV-H after upgrade from 3.5 to 3.6
Product: [oVirt] ovirt-hosted-engine-ha
Reporter: Artyom <alukiano>
Component: Agent
Assignee: Martin Sivák <msivak>
Status: CLOSED CURRENTRELEASE
QA Contact: Artyom <alukiano>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 1.3.3.3
CC: bugs, cshao, dfediuck, fdeutsch, gklein, huiwa, huzhao, mavital, sbonazzo, ycui, ylavi
Target Milestone: ovirt-3.6.1
Keywords: Triaged
Target Release: 1.3.3.4
Flags: rule-engine: ovirt-3.6.z+, rule-engine: blocker+, ylavi: planning_ack+, dfediuck: devel_ack+, mavital: testing_ack+
Hardware: x86_64
OS: Linux
Whiteboard: sla
Fixed In Version: ovirt-hosted-engine-ha-1.3.3.4
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-01-13 14:38:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1280310, 1284954, 1285700
Attachments: sosreport (flags: none)

Description Artyom 2015-12-15 14:02:14 UTC
Created attachment 1106037: sosreport

Description of problem:
In a single-host hosted-engine (HE) environment, ovirt-ha-agent failed to start after the host was upgraded from 3.5 to 3.6.

Version-Release number of selected component (if applicable):
3.5 - Red Hat Enterprise Virtualization Hypervisor release 7.1 (20151015.0.el7ev)
=================================
vdsm-4.16.27-1.el7ev.x86_64
ovirt-hosted-engine-setup-1.2.6.1-1.el7ev.noarch
ovirt-hosted-engine-ha-1.2.7.2-1.el7ev.noarch

3.6 - Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151210.1.el7ev)
=================================
vdsm-4.17.13-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.1.2-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.3-1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy a 3.5 HE environment on one host
2. Put the host into global maintenance and stop the engine VM (hosted-engine --set-maintenance --mode=global && hosted-engine --vm-poweroff)
3. Reboot the host and upgrade it to 3.6
4. Check hosted-engine services
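
For step 4, a minimal Python sketch of one way to check the two HA services named in this report. It assumes the host runs systemd and simply wraps `systemctl is-active`; the helper name `service_states` and the injectable `run` parameter are illustrative and not part of ovirt-hosted-engine-ha.

```python
import subprocess

# Service names as they appear in this report.
HE_SERVICES = ("ovirt-ha-agent", "ovirt-ha-broker")


def service_states(services=HE_SERVICES, run=subprocess.run):
    """Return {service: state} as reported by `systemctl is-active`.

    `run` is injectable so the helper can be exercised without systemd.
    """
    states = {}
    for name in services:
        proc = run(
            ["systemctl", "is-active", name],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        # is-active prints a single word such as "active" or "failed".
        states[name] = proc.stdout.strip() or "unknown"
    return states
```

On an affected host one would expect ovirt-ha-agent to report something other than "active" once it has stopped, but the exact state string depends on how the unit exited.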

Actual results:
ovirt-ha-agent starts, but after some time it stops with the following error:
MainThread::ERROR::2015-12-15 13:28:16,662::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Error executing: 1 - stdout: - stderr:sudo: a password is required
' - trying to restart agent
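
The agent log line above follows a `::`-delimited layout (thread, level, timestamp, module, line number, logger, then the function name in parentheses followed by the message). A minimal sketch for splitting such lines when triaging logs, assuming that layout holds for every line of interest; `AgentLogRecord` and `parse_agent_log_line` are hypothetical names, not part of the agent.

```python
from datetime import datetime
from typing import NamedTuple


class AgentLogRecord(NamedTuple):
    thread: str
    level: str
    timestamp: datetime
    module: str
    lineno: int
    logger: str
    function: str
    message: str


def parse_agent_log_line(line: str) -> AgentLogRecord:
    """Split one '::'-delimited agent log line into its fields."""
    # maxsplit=6 keeps any '::' inside the message text intact.
    thread, level, stamp, module, lineno, logger, rest = line.split("::", 6)
    # The last field looks like "(function_name) free-form message".
    function, _, message = rest.partition(") ")
    return AgentLogRecord(
        thread=thread,
        level=level,
        timestamp=datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f"),
        module=module,
        lineno=int(lineno),
        logger=logger,
        function=function.lstrip("("),
        message=message,
    )
```

Filtering parsed records on `level == "ERROR"` and a message containing "a password is required" would pick out the failure reported here.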


Expected results:
The hosted-engine services are up and the upgrade completes without any errors.

Additional info:
I also saw errors in the vdsm log:
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2201, in getAllTasksStatuses
    raise se.SpmStatusError()
SpmStatusError: Not SPM: ()

Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3203, in prepareImage
    self.getPool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 314, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: ('330fc772-0e67-4b2c-80a0-ba76f9b8f348',)

Comment 1 Doron Fediuck 2015-12-15 14:35:28 UTC
Looking at the logs, this seems to be a RHEV-H-only issue:
...
MainThread::DEBUG::2015-12-15 14:20:42,327::upgrade::88::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_execute) executing: 'sudo -n unpersist /etc/ovirt-hosted-engine/hosted-engine.conf'
MainThread::DEBUG::2015-12-15 14:20:42,373::upgrade::97::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_execute) rc: 1
...
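
The excerpt shows `sudo -n` returning rc 1: with `-n`, sudo never prompts, so when a password would be needed it prints "sudo: a password is required" and exits with an error. The sketch below illustrates that failure mode from the caller's side. It is not the agent's actual `_execute` implementation; `run_noninteractive` and `sudo_needs_password` are hypothetical helpers, and the demo uses a stand-in child process so it runs without sudo.

```python
import subprocess
import sys


def run_noninteractive(cmd):
    """Run cmd with no usable stdin; return (rc, stdout, stderr)."""
    proc = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,  # nothing can answer a password prompt
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def sudo_needs_password(rc, stderr):
    """True if a failed call carries sudo's non-interactive error text."""
    return rc != 0 and "a password is required" in stderr


# On RHEV-H the command from the log would be:
#   ["sudo", "-n", "unpersist", "/etc/ovirt-hosted-engine/hosted-engine.conf"]
# Here a stand-in child process imitates its stderr and exit code.
fake_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('sudo: a password is required'); sys.exit(1)",
]
rc, out, err = run_noninteractive(fake_cmd)
if sudo_needs_password(rc, err):
    print("command needs a sudoers rule that allows it without a password")
```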

Comment 2 Artyom 2015-12-22 15:54:16 UTC
Verified
Upgrade from 3.5 - Red Hat Enterprise Virtualization Hypervisor release 7.1 (20151015.0.el7ev)
==============================================================================================
ovirt-hosted-engine-ha-1.2.7.2-1.el7ev.noarch
ovirt-hosted-engine-setup-1.2.6.1-1.el7ev.noarch
vdsm-4.16.27-1.el7ev.x86_64

to 3.6 - Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151221.1.el7ev)
==============================================================================================
ovirt-hosted-engine-setup-1.3.1.3-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch
vdsm-4.17.13-1.el7ev.noarch

1) Install RHEV-H 3.5
2) Deploy hosted-engine on one host with NFS storage
3) Enable global maintenance (on the host: hosted-engine --set-maintenance --mode=global)
4) Upgrade the engine to 3.6
5) Power off the engine VM (hosted-engine --vm-poweroff)
6) Upgrade the host to RHEV-H 3.6 via USB key

The upgrade succeeded, and after the host upgrade ovirt-ha-agent and ovirt-ha-broker are up.

Comment 3 Sandro Bonazzola 2016-01-13 14:38:32 UTC
oVirt 3.6.1 has been released, closing current release