Description of problem:
The upgrade procedure fails if a 4.0 is added to a HE 3.5 cluster just partially upgraded to 3.6.
ovirt-ha-agent from 4.0 will still try the upgrade but it will fail with:

MainThread::INFO::2016-11-22 19:57:42,391::upgrade::1021::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36) Upgrading to current version
MainThread::INFO::2016-11-22 19:57:42,398::upgrade::743::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_stopMonitoringDomain) Stop monitoring domain
MainThread::ERROR::2016-11-22 19:57:42,401::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Failed stopping monitoring domain: Attempt to call function: <bound method Global.stopMonitoringDomain of <API.Global object at 0x4059410>> with arguments: () error: stopMonitoringDomain() takes exactly 2 arguments (1 given)' - trying to restart agent
MainThread::WARNING::2016-11-22 19:57:47,407::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '0'

since the jsonRPC APIs used in 4.0 are slightly different.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:
19:57:42,401::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Failed stopping monitoring domain: Attempt to call function: <bound method Global.stopMonitoringDomain of <API.Global object at 0x4059410>> with arguments: () error: stopMonitoringDomain() takes exactly 2 arguments (1 given)' - trying to restart agent

Expected results:
MainThread::INFO::2016-11-22 20:33:25,761::upgrade::1056::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36) Successfully upgraded

Additional info:
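For anyone triaging a similar agent log, here is a minimal Python sketch that splits ovirt-ha-agent log lines into fields so that the ERROR entries can be filtered out. The field layout (thread, level, timestamp, module, line number, logger, function, message, separated by "::") is inferred only from the lines quoted in this report; the names AgentLogRecord, parse_agent_log_line and filter_by_level are hypothetical helpers, not part of ovirt-hosted-engine-ha.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class AgentLogRecord(NamedTuple):
    """One parsed agent log line (hypothetical helper type)."""
    thread: str
    level: str
    timestamp: str
    module: str
    lineno: int
    logger: str
    function: str
    message: str


# The last field looks like "(function_name) free-form message".
_FUNC_RE = re.compile(r"^\((?P<func>[^)]*)\)\s*(?P<msg>.*)$", re.DOTALL)


def parse_agent_log_line(line: str) -> Optional[AgentLogRecord]:
    """Parse one line; return None if it does not match the assumed layout."""
    # Split on "::" at most 6 times so the message itself stays intact.
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, timestamp, module, lineno, logger, rest = parts
    match = _FUNC_RE.match(rest)
    if match is None or not lineno.isdigit():
        return None
    return AgentLogRecord(
        thread, level, timestamp, module, int(lineno), logger,
        match.group("func"), match.group("msg"),
    )


def filter_by_level(lines: Iterable[str], level: str = "ERROR") -> List[AgentLogRecord]:
    """Return the parsed records whose level equals `level`."""
    records = (parse_agent_log_line(line) for line in lines)
    return [rec for rec in records if rec is not None and rec.level == level]
```

Passing the four log lines from the description to filter_by_level should return only the (_run_agent) entry from agent line 205. Lines that lack the leading thread and level fields, such as the truncated one under "Actual results", are skipped rather than guessed at.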
Please provide reproduction steps for this bug.
The simplest reproduction path (upstream only) is a direct upgrade of the hosted-engine host from 3.5 on el7 to 4.0 on el7.
This is too generic and not detailed enough. From the title, "The upgrade procedure fails if a 4.0 is added to a HE 3.5 cluster just partially upgraded to 3.6", I need to understand in more detail:
1) What is the 4.0 that is added, a host?
2) Assuming a 4.0 host is being added to the host cluster, is that cluster still in 3.5 compatibility mode?
3) Which part of 3.5 is upgraded to 3.6?
There are too many open questions here, and no detailed description of what was tested and what failed.
-------------------------3.5->3.6-----------------------------------------------
1-Deployed environment on 2 3.5 RHEVH6.8 over NFS with 2 NFS data storage domains.
2-Upgraded engine and one host from 3.5 to 3.6.10.
3-Upgraded second host to 3.6 RHEVH7.2.
4-Added 4.0 NGN as additional hosted engine host to host cluster, while still in 3.5.
5-Bumped up host cluster to 3.6 and saw that hosted_storage auto import started and finished successfully.
6-Bumped up data center to 3.6.
-------------------------3.6->4.0.6----------------------------------------------
7-Migrated HE-VM to 4.0 NGN.
8-Installed appliance on NGN.
9-Set global maintenance.
10-Upgraded the engine using hosted-engine --upgrade-appliance.
11-Activated back from global maintenance.
12-Upgraded all remaining hosts to 4.0.6.
13-Bumped up host cluster to 4.0.6.
14-Bumped up data center to 4.0.5.5-0.1 as received from rhevm-appliance-20161116.0-1.el7ev.noarch.

Moving to verified.