Bug 1397572 - The upgrade procedure fails if a 4.0 is added to a HE 3.5 cluster just partially upgraded to 3.6
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: General
Version: 2.0.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ovirt-4.0.6
Target Release: 2.0.6
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks: 1396997
 
Reported: 2016-11-22 20:44 UTC by Simone Tiraboschi
Modified: 2017-01-18 07:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-18 07:26:04 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.0.z+


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 67189 0 master MERGED upgrade: fix the upgrade procedure for jsonrpc 2016-11-23 12:47:16 UTC
oVirt gerrit 67190 0 v2.0.z MERGED upgrade: fix the upgrade procedure for jsonrpc 2016-11-23 12:47:44 UTC

Description Simone Tiraboschi 2016-11-22 20:44:04 UTC
Description of problem:
The upgrade procedure fails if a 4.0 host is added to an HE 3.5 cluster that has been only partially upgraded to 3.6.
ovirt-ha-agent from 4.0 still attempts the 3.5 -> 3.6 upgrade, but it fails with:
MainThread::INFO::2016-11-22 19:57:42,391::upgrade::1021::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36) Upgrading to current version
MainThread::INFO::2016-11-22 19:57:42,398::upgrade::743::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_stopMonitoringDomain) Stop monitoring domain
MainThread::ERROR::2016-11-22 19:57:42,401::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Failed stopping monitoring domain: Attempt to call function: <bound method Global.stopMonitoringDomain of <API.Global object at 0x4059410>> with arguments: () error: stopMonitoringDomain() takes exactly 2 arguments (1 given)' - trying to restart agent
MainThread::WARNING::2016-11-22 19:57:47,407::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '0'

This happens because the jsonRPC APIs used in 4.0 are slightly different.
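
The error in the log is an ordinary Python arity mismatch: the method is called with no arguments, but it requires one besides self. The snippet below is only a minimal sketch of that failure mode, not the actual ovirt-hosted-engine-ha or VDSM code. The Global class here is a hypothetical stand-in, and its one-argument signature (sdUUID) is an assumption inferred from the "takes exactly 2 arguments (1 given)" message. Note that Python 3 words the same TypeError differently.

```python
class Global(object):
    """Hypothetical stand-in for the API object named in the log."""

    def stopMonitoringDomain(self, sdUUID):
        # Assumed signature: self plus one required argument, which is what
        # "takes exactly 2 arguments (1 given)" implies.
        return sdUUID


def call_without_args(api):
    """Call the method with no arguments, as the failing upgrade path did."""
    try:
        api.stopMonitoringDomain()
    except TypeError as exc:
        # Mirrors the shape of the message reported by the agent.
        return "Failed stopping monitoring domain: {0}".format(exc)
    return None


def call_with_domain(api, sd_uuid):
    """Call the method with the storage domain UUID it expects."""
    return api.stopMonitoringDomain(sd_uuid)


if __name__ == "__main__":
    api = Global()
    print(call_without_args(api))
    print(call_with_domain(api, "hypothetical-sd-uuid"))
```

Calling the method with the storage domain identifier avoids the TypeError in this sketch; what the merged patches actually change is described in the gerrit entries listed under Links.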


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:
MainThread::ERROR::2016-11-22 19:57:42,401::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Failed stopping monitoring domain: Attempt to call function: <bound method Global.stopMonitoringDomain of <API.Global object at 0x4059410>> with arguments: () error: stopMonitoringDomain() takes exactly 2 arguments (1 given)' - trying to restart agent

Expected results:
MainThread::INFO::2016-11-22 20:33:25,761::upgrade::1056::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36) Successfully upgraded

Additional info:

Comment 1 Nikolai Sednev 2016-12-13 16:02:14 UTC
Please provide reproduction steps for this bug.

Comment 2 Simone Tiraboschi 2016-12-13 16:48:07 UTC
The simplest reproduction path (upstream only) is a direct upgrade of the hosted-engine host from 3.5 on el7 to 4.0 on el7.

Comment 3 Nikolai Sednev 2016-12-14 07:52:31 UTC
This is too generic and not detailed enough. The title says "The upgrade procedure fails if a 4.0 is added to a HE 3.5 cluster just partially upgraded to 3.6", and I need to understand it in more detail:
1) What 4.0 component is added? A host?
2) Assuming a 4.0 host is being added to the host cluster, is that cluster still in 3.5 compatibility mode?
3) Which part of 3.5 is upgraded to 3.6?

There are too many open questions here and no detailed description of what was tested and what failed.

Comment 4 Nikolai Sednev 2017-01-05 18:10:34 UTC
-------------------------3.5->3.6-----------------------------------------------
1-Deployed the environment on two 3.5 RHEVH 6.8 hosts over NFS with two NFS data storage domains.
2-Upgraded the engine and one host from 3.5 to 3.6.10.
3-Upgraded the second host to 3.6 RHEVH 7.2.
4-Added a 4.0 NGN host as an additional hosted-engine host to the host cluster while the cluster was still at 3.5.
5-Bumped up the host cluster to 3.6 and saw that the hosted_storage auto import started and finished successfully.
6-Bumped up data center to 3.6.
-------------------------3.6->4.0.6----------------------------------------------
7-Migrated HE-VM to 4.0 NGN.
8-Installed appliance on NGN.
9-Set global maintenance.
10-Upgraded the engine using hosted-engine --upgrade-appliance.
11-Activated back from global maintenance.
12-Upgraded all remaining hosts to 4.0.6.
13-Bumped up host cluster to 4.0.6.
14-Bumped up data center to 4.0.5.5-0.1 as received from rhevm-appliance-20161116.0-1.el7ev.noarch.

Moving to verified.

