Description of problem:
The HE host is not taken out of local maintenance after Reinstall. This is incorrect: the host was put into HE local maintenance when regular maintenance was enabled for it in the UI, so once it is activated again, HE local maintenance should be canceled as well.

Version-Release number of selected component (if applicable):
4.1.4

Steps to Reproduce:
1. Put the host into maintenance.
2. Select the Reinstall option in the UI and wait until the reinstall finishes and the host is active again in the UI.

Actual results:
The host is still in local HE maintenance and requires manual intervention from the command line to disable HE maintenance.

Expected results:
The host should be fully operational once it is activated. Or, if that is impossible, we should at least provide a UI option to disable local HE maintenance.
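For reference, the manual command-line intervention referred to under "Actual results" is presumably something like the following, run on the affected host (a sketch using the standard hosted-engine CLI; the exact command is confirmed later in this bug):

  # check the HE/HA state of the host (look for "Local maintenance : True")
  hosted-engine --vm-status

  # clear HE local maintenance on the affected host
  hosted-engine --set-maintenance --mode=none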
I think it should actually be high priority. The end user would expect the host to be out of HE maintenance; if it does not come back out of maintenance automatically, and the user is not informed, that is not the right flow.
*** Bug 1501016 has been marked as a duplicate of this bug. ***
I am unable to reproduce this on master and 4.2; trying 4.1.
Works in ovirt-engine-backend-4.1.9.1-1.el7.centos.noarch too.

Tried with both node-ng and vdsm-4.19.45-1.el7.centos.x86_64 on CentOS 7. The host is activated after reinstall.

Please check with the latest 4.1 build.
Created attachment 1388412 [details] Activate hosts python script
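(The attachment itself is not inlined here. For readers without access to it, below is a minimal sketch of what an "activate hosts" script could look like with ovirt-engine-sdk-python; the engine URL, credentials, and search filter are illustrative assumptions, not taken from the attachment.)

  # Minimal sketch only -- not the actual attached script.
  import ovirtsdk4 as sdk

  # Hypothetical engine URL and credentials.
  connection = sdk.Connection(
      url='https://engine.example.com/ovirt-engine/api',
      username='admin@internal',
      password='secret',
      insecure=True,
  )
  try:
      hosts_service = connection.system_service().hosts_service()
      # Activate every host that is currently in maintenance.
      for host in hosts_service.list(search='status=maintenance'):
          hosts_service.host_service(host.id).activate()
  finally:
      connection.close()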
(In reply to Ravi Nori from comment #13)
> Works in ovirt-engine-backend-4.1.9.1-1.el7.centos.noarch too.
>
> Tried with both node-ng and vdsm-4.19.45-1.el7.centos.x86_64 on CentOS 7.
> The host is activated after reinstall.
>
> Please check with the latest 4.1 build.

Nori, if you tested it on 4.1.9 and it didn't reproduce for you, i.e. after reinstall HE local maintenance was disabled on the host, then let's close it, since it works in 4.1.9.
Nikolai, maybe you can help Nori verify this bug? Thank you!
This is fixed in the latest 4.2 beta.
The operation works just fine on 4.2.1.5-0.1.el7.

rhvm-appliance-4.2-20180202.0.el7.noarch
ovirt-hosted-engine-ha-2.2.4-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.9-1.el7ev.noarch
Linux 3.10.0-693.17.1.el7.x86_64 #1 SMP Sun Jan 14 10:36:03 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
Reopening to backport fix to 4.1.10.
(In reply to Yaniv Lavi from comment #20)
> Reopening to backport fix to 4.1.10.

What do you want to backport? According to comment 13 it works fine in 4.1.9.
Added to 4.1.10 errata and moving to ON_QA. Nikolai, could you please verify that every flow works as expected in 4.1.10 and that we haven't missed anything?
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[No relevant external trackers attached]

For more info please contact: rhv-devops
(In reply to Martin Perina from comment #22)
> Added to 4.1.10 errata and moving to ON_QA. Nikolai, could you please verify
> that every flow works as expected in 4.1.10 and that we haven't missed anything?

Could you please define the required flows?
The original issue is still reproducible on the latest 4.1.10.1-0.1.el7.

Reproduction steps:
1. Deployed rhevm-4.1.9.1-0.1.el7.noarch on a pair of 4.1.9 ha-hosts; the engine was running on RHEL 7.4, the hosts on RHEL 7.5.
2. Set global maintenance via the UI.
3. Ran "yum update -y ovirt-engine-setup" to get rhevm-4.1.10.1-0.1.el7.noarch.
4. Upgraded the engine to rhevm-4.1.10.1-0.1.el7.noarch using "engine-setup".
5. Ran "yum update -y" on the engine to get RHEL 7.4 updated to RHEL 7.5.
6. Rebooted the engine from within the engine VM.
7. Started the engine from a host using "hosted-engine --vm-start".
8. Removed global maintenance from the ha-hosts.
9. Logged in to the engine's UI and set one of the two hosts (alma03, the host that was not hosting the SHE VM and was not the SPM) into maintenance, then reinstalled it. After the reinstall the host recovered and was automatically activated.
10. The reinstalled ha-host remained in local maintenance in the CLI, and in the UI it appeared as "Unavailable due to HA score".

See the result in the CLI:

alma03 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : alma03
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : True
crc32                              : bb19601a
local_conf_timestamp               : 9806
Host timestamp                     : 9804
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9804 (Sun Feb 25 19:03:13 2018)
        host-id=1
        score=0
        vm_conf_refresh_time=9806 (Sun Feb 25 19:03:15 2018)
        conf_on_shared_storage=True
        maintenance=True
        state=LocalMaintenance
        stopped=False

--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : alma04
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 915c08da
local_conf_timestamp               : 9769
Host timestamp                     : 9767
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=9767 (Sun Feb 25 19:03:19 2018)
        host-id=2
        score=3400
        vm_conf_refresh_time=9769 (Sun Feb 25 19:03:21 2018)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False

A screenshot from the UI and sosreports from both hosts and the engine are attached. Moving back to ASSIGNED.
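For convenience, a condensed sketch of the command-line part of steps 3-7 above, assuming steps 3-6 are run on the engine VM and step 7 on one of the ha-hosts:

  # on the engine VM
  yum update -y ovirt-engine-setup
  engine-setup
  yum update -y
  reboot

  # on one of the ha-hosts, once the engine VM is down
  hosted-engine --vm-start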
Created attachment 1400614 [details] Screenshot from 2018-02-25 19-07-06.png
Created attachment 1400615 [details] engine logs
Created attachment 1400616 [details] alma03 in local maintenance
Created attachment 1400617 [details] alma04 logs
To enable alma03, I manually had to run "hosted-engine --set-maintenance --mode=none" from the CLI. See also the attached screencast.
Created attachment 1400618 [details] screencast
*** Bug 1536286 has been marked as a duplicate of this bug. ***
Ravi, are you looking into this?
I was able to reproduce the issue on 4.1.9. The patch https://gerrit.ovirt.org/#/c/86645/ for BZ 1532709 fixes the issue and has not been merged.
This is not going to make it to 4.1.10 - please re-target.
Moving to MODIFIED, as the fix for BZ 1532709 also fixes this issue.
Works for me on these components:

rhvm-appliance-4.2-20180420.0.el7.noarch
ovirt-hosted-engine-setup-2.2.18-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.10-1.el7ev.noarch
ovirt-engine-setup-4.2.3.2-0.1.el7.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488