Description of problem:
Put/remove host to/from local maintenance not working in Cockpit on NGN.

Version-Release number of selected component (if applicable):
sanlock-3.2.4-2.el7_2.x86_64
ovirt-hosted-engine-ha-2.0.1-1.el7ev.noarch
ovirt-imageio-daemon-0.3.0-0.el7ev.noarch
ovirt-host-deploy-1.5.1-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.7.0-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.16.x86_64
mom-0.5.5-1.el7ev.noarch
ovirt-setup-lib-1.0.2-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.5.x86_64
vdsm-4.18.6-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.0.1-1.el7ev.noarch
ovirt-imageio-common-0.3.0-0.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
Linux version 3.10.0-327.22.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Thu Jun 9 10:09:10 EDT 2016
Linux 3.10.0-327.22.2.el7.x86_64 #1 SMP Thu Jun 9 10:09:10 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 7.2

How reproducible:
100%

Steps to Reproduce:
1. Deploy hosted engine on a pair of RHEVH NGN hosts over NFS.
2. Add a data storage domain to get the HE storage domain imported.
3. Try placing one of the hosts into local maintenance via Cockpit (Virtualization->Hosted Engine->Put this host into local maintenance).
4. Put the host into maintenance via the WEBUI of the engine.
5. Try re-activating the host via Cockpit by Virtualization->Hosted Engine->Remove this host from maintenance.

Actual results:
Put/remove host to/from local maintenance not working in Cockpit on NGN.

Expected results:
Both options should work properly.

Additional info:
Screenshots attached.
Created attachment 1183325 [details] Screenshot from 2016-07-24 13:59:02.png
I was upgrading my lab to 4.0 (finally) today when this came in, and I tested it -- I can't reproduce. We're calling hosted-engine directly. It seems that global maintenance triggers almost immediately, but local has some lag time (even using a shell on the host). How long have you waited? 15-30 seconds seems to be about the average.
Also, I can add a flag which shows some kind of visual indicator that a state update has been triggered. The difficulty here is that "hosted-engine --set-maintenance --mode=local" returns immediately, but it takes some time to update. We'd need to figure out a reasonable timeout after which a spinner could be replaced with a warning icon (because it did not change).
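The spinner-with-timeout idea could be sketched roughly as below. This is only an illustration in Python, not the Cockpit plugin's code: `wait_for_state` and its parameters are hypothetical names, and the 60-second timeout and 5-second interval are assumptions loosely based on the 15-30 second lag mentioned in this thread. The `read_state` callable stands in for whatever actually reads the maintenance flag.

```python
import time
from typing import Callable


def wait_for_state(
    read_state: Callable[[], bool],
    expected: bool,
    timeout: float = 60.0,   # assumed value, not taken from any oVirt code
    interval: float = 5.0,   # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll read_state() until it equals `expected` or `timeout` seconds pass.

    Returns True if the state changed in time (the spinner can be removed),
    False otherwise (the spinner could be replaced with a warning icon).
    """
    deadline = clock() + timeout
    while True:
        if read_state() == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The clock and sleep functions are injectable so that the timeout logic can be exercised without real waiting; a caller would normally leave them at their defaults.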
(In reply to Ryan Barry from comment #2)
> I was upgrading my lab to 4.0 (finally) today when this came in, and I
> tested it -- I can't reproduce.
>
> We're calling hosted-engine directly. It seems that global maintenance
> triggers almost immediately, but local has some lag time (even using a shell
> on the host). How long have you waited? 15-30 seconds seems to be about the
> average.

I waited about 1-3 minutes and host alma03 was not set into local maintenance within the WEBUI, but it was in the CLI:

[root@alma04 ~]# hosted-engine --vm-status

--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : alma03.qa.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : True
crc32                              : 945c00aa
Host timestamp                     : 345027
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=345027 (Mon Jul 25 13:00:44 2016)
    host-id=1
    score=0
    maintenance=True
    state=LocalMaintenance
    stopped=False

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : alma04.qa.lab.tlv.redhat.com
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : cdf2fe5d
Host timestamp                     : 80794
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=80794 (Mon Jul 25 13:00:38 2016)
    host-id=2
    score=3400
    maintenance=False
    state=EngineUp
    stopped=False

When I tried to remove it from local maintenance using Cockpit, host alma03 was removed from it successfully.

So the issue here is that the engine's WEBUI is not getting any changes from the hosts at all when using Cockpit Virtualization->Hosted Engine->Put/Remove this host into/from local maintenance, although when doing the same via Cockpit Virtualization->Virtual Machines->Host to Maintenance, it works in both the CLI and the WEBUI of the engine. I suspect that this is because I'm logged in from Cockpit to the engine's WEBUI and the latter flow also updates the engine's DB, whereas the first one does not.

(In reply to Ryan Barry from comment #3)
> Also, I can add a flag which shows some kind of visual indicator that a
> stage update has been triggered.
>
> The difficulty here is that "hosted-engine --set-maintenance --mode=local"
> returns immediately, but it takes some time to update. We'd need to figure
> out a reasonable timeout after which a spinner could be replaced with a
> warning icon (because it did not change)

I see that command being transferred in the browser's developer console (ctrl+shift+i), but I'm not sure it has any effect on my environment; for example, moving to maintenance via Virtualization->Virtual Machines->Host to Maintenance takes effect almost immediately.

Speaking of the latter, IMHO we don't need to have the same option duplicated in more than one place. If possible, I'd like to manage hosts from one place.
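For comparing what the CLI reports against Cockpit and the WEBUI, the status text above can be read programmatically. The following is a minimal Python sketch that assumes the plain-text layout shown in this comment ("Key : value" lines under "--== Host N status ==--" headers); `parse_vm_status` and `local_maintenance_by_host` are hypothetical helper names, and the sample is a trimmed copy of the output pasted above.

```python
from typing import Dict

# Trimmed copy of the `hosted-engine --vm-status` output quoted in this bug.
SAMPLE = """
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : alma03.qa.lab.tlv.redhat.com
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Local maintenance                  : True

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : alma04.qa.lab.tlv.redhat.com
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Local maintenance                  : False
"""

HEADER_PREFIX = "--== Host "
HEADER_SUFFIX = " status ==--"


def parse_vm_status(text: str) -> Dict[int, Dict[str, str]]:
    """Return {host_number: {field: value}} from the plain-text status output."""
    hosts: Dict[int, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(HEADER_PREFIX) and line.endswith(HEADER_SUFFIX):
            number = int(line[len(HEADER_PREFIX):-len(HEADER_SUFFIX)])
            current = hosts.setdefault(number, {})
        elif current is not None and " : " in line:
            # Split on the first " : " only, so JSON values stay intact.
            key, _, value = line.partition(" : ")
            current[key.strip()] = value.strip()
    return hosts


def local_maintenance_by_host(text: str) -> Dict[str, bool]:
    """Map each hostname to its "Local maintenance" flag."""
    return {
        fields["Hostname"]: fields.get("Local maintenance") == "True"
        for fields in parse_vm_status(text).values()
        if "Hostname" in fields
    }


if __name__ == "__main__":
    print(local_maintenance_by_host(SAMPLE))
```

Lines without a " : " separator, such as the "Extra metadata" entries, are simply skipped by this sketch.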
Created attachment 1183672 [details] screenshot from the engine's WEBUI of alma03 not in maintenance, while it is in CLI.
(In reply to Nikolai Sednev from comment #4)
> Waited about 1-3 minutes and host alma03 was not set in to local maintenance
> within the WEBUI, but it was in CLI:

Ok, so this is unclear, I suppose. Two questions:

First -- Is the correct value reflected in cockpit?

Second -- I haven't actually done much with Engine/WEBUI in 4.0. hosted-engine maintenance does not set vdsm maintenance. I'm not sure if there's an indicator in the webui about hosted engine maintenance. Probably yes.

> When I tried to remove it from local maintenance using Cockpit, it host
> alma03 was removed from it successfully. So issue here is that engine's
> WEBUI not getting any changes from hosts at all when using Cockpit
> Virtualization->Hosted Engine->Put/Remove this host into/from local
> maintenance, although if doing the same via Cockpit Virtualization->Virtual
> Machines->Host to Maintenance, then it's working in both CLI and WEBUI of
> the engine. I suspect that this is due to the fact, that I'm logged in from
> Cockpit to the engine's WEBUI and in latest flow it also updates the
> engine's DB, whereas if first example it's not doing so.

Is that status updated when using "hosted-engine --set-maintenance --mode=local" from the CLI?

> I see that command being transferred in ctrl+shift+i in console of the WEB,
> but not sure it has any affect on my environment as for example moving to
> maintenance via Virtualization->Virtual Machines->Host to Maintenance takes
> affect almost immediately.

I believe that this sets VDSM maintenance.

>
> Speaking of latest, IMHO we don't need to have the same option duplicated in
> more than one place. If possible, I'd like to manage hosts from one place.

That's an ongoing discussion, I think. At present, the goal of the cockpit plugin is to provide a way to manage the functionality which could normally be reached over the shell/TUI of a single host, which has some overlap with engine, but the scope is limited -- engine manages clusters/datacenters. Cockpit manages one host.
I see that these functions are working now, with a bit of delay. I had probably looked at them expecting the changes to happen on the fly, but they're a bit delayed and also don't behave the same way at all.

With Cockpit "Virtualization->Hosted Engine->Put this host into local maintenance", after some time (less than a minute) the host's status changes to "Local maintenance: True" in the CLI and "Local Maintenance: true" in Cockpit, but its active symbol in the engine's WEBUI does not change to the wrench symbol, although its status is shown as "Hosted Engine HA: Local Maintenance Enabled".

When setting the host into local maintenance via Cockpit "Virtualization->Virtual Machines->Host to Maintenance", its status changes properly everywhere: "Local Maintenance: true" in Cockpit, "Local maintenance: True" in the CLI, and "Hosted Engine HA: Local Maintenance Enabled" in the engine's WEBUI, also with a wrench symbol.

If I then try to activate the host again via Cockpit "Virtualization->Hosted Engine->Remove this host from maintenance", after some time (less than a minute) the host's status returns to active in the CLI and to "Local Maintenance: false" in Cockpit, but this is not synchronized with the engine's WEBUI, where it stays in "Hosted Engine HA: Local Maintenance Enabled" with a wrench symbol.

When setting the host into local maintenance from the CLI, e.g. "hosted-engine --set-maintenance --mode=local", the host's status is shown correctly as "Local maintenance: True" in the CLI and as "Local Maintenance: true" in Cockpit, but in the engine's WEBUI its status is only partially correct: it appears without the wrench symbol, although with the correct status of "Hosted Engine HA: Local Maintenance Enabled".

I see an inconsistency in how the host's status is shown in the engine's WEBUI between:
1) From Cockpit "Virtualization->Virtual Machines->Host to Maintenance" ====> Shown in the engine's WEBUI with a wrench symbol and in local maintenance.
2) From Cockpit "Virtualization->Hosted Engine->Put this host into local maintenance" ====> Shown in the engine's WEBUI without a wrench symbol, but in local maintenance.

If step 1 was done and Cockpit "Virtualization->Hosted Engine->Remove this host from maintenance" is then used, the engine's WEBUI is not synchronized with these changes at all, and the host appears in the engine's WEBUI in local maintenance with a wrench symbol and "Hosted Engine HA: Local Maintenance Enabled" status.

CLI's "hosted-engine --set-maintenance --mode=local" is equivalent to Cockpit's "Virtualization->Hosted Engine->Put this host into local maintenance" ====> Shown in the engine's WEBUI without a wrench symbol, but in local maintenance.

CLI's "hosted-engine --set-maintenance --mode=local" is not the same as Cockpit's "Virtualization->Virtual Machines->Host to Maintenance" ====> Shown in the engine's WEBUI with a wrench symbol and in local maintenance.
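The observations in this comment can be summarized as a small state model. The Python sketch below only mirrors what was seen in this thread, under the assumption that there are two separate flags (host maintenance as the engine/VDSM sees it, and hosted-engine HA local maintenance) and that "Host to Maintenance" sets both; it says nothing about how the engine is actually implemented. `HostState` and its method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HostState:
    """Toy model of the two maintenance flags discussed in this bug."""

    vdsm_maintenance: bool = False      # "Host to Maintenance" (engine/VDSM)
    he_local_maintenance: bool = False  # hosted-engine HA local maintenance

    def host_to_maintenance(self) -> None:
        # Assumption from the observations: this path sets both flags.
        self.vdsm_maintenance = True
        self.he_local_maintenance = True

    def he_set_maintenance(self, mode: str) -> None:
        # Stands in for: hosted-engine --set-maintenance --mode=local|none
        if mode not in ("local", "none"):
            raise ValueError("mode must be 'local' or 'none' in this sketch")
        self.he_local_maintenance = mode == "local"

    def cli_local_maintenance(self) -> bool:
        # What "hosted-engine --vm-status" and Cockpit reported.
        return self.he_local_maintenance

    def webui_view(self) -> Dict[str, bool]:
        # Observed: the wrench follows host maintenance only, and the HA
        # label stayed "Local Maintenance Enabled" while the wrench was shown.
        return {
            "wrench": self.vdsm_maintenance,
            "ha_local_maintenance": self.he_local_maintenance
            or self.vdsm_maintenance,
        }


if __name__ == "__main__":
    host = HostState()
    host.host_to_maintenance()
    host.he_set_maintenance("none")
    print(host.cli_local_maintenance(), host.webui_view())
```

Running the example reproduces the last scenario above: the CLI flag is cleared while the WEBUI view still shows the wrench and the local maintenance label.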
(In reply to Nikolai Sednev from comment #7)
> I see these functionalities are working with a bit of delay now, I've
> probably looked at them and thought they would make changes on the fly, but
> they're a bit delayed and also not functioning the same way at all.
>
>
> In Cockpit "Virtualization->Hosted Engine->Put this host into local
> maintenance", then after some time (less than a minute) host's status
> returns to "Local maintenance: True" in CLI, "Local Maintenance: true" in
> Cockpit, but not changes it's symbol active symbol in engine's WEBUI to
> wrench symbol, although it's status shown as "Hosted Engine HA:Local
> Maintenance Enabled".
>
> If setting host via Cockpit into local maintenance via
> "Virtualization->Virtual Machines->Host to Maintenance", then it changing
> it's status everywhere properly, in Cockpit "Local Maintenance: true", in
> CLI "Local maintenance: True" and in engine's WEBUI "Hosted Engine HA: Local
> Maintenance Enabled" and also with a symbol of a wrench.

This is expected -- VDSM maintenance also sets hosted-engine maintenance.

>
> If then trying to activate the host back via Cockpit by
> "Virtualization->Hosted Engine->Remove this host from maintenance", then
> after some time (less than a minute) host's status returns to active in CLI,
> Cockpit "Local Maintenance:false", but not being synchronized with engine's
> WEBUI in which it stays in "Hosted Engine HA:Local Maintenance Enabled" with
> a symbol of a wrench.

I think the question here is how engine polls/communicates with hosted-engine. I imagine that it connects to ovirt-ha-agent, but I don't know on what intervals.

Simone?

>
>
> If setting host in to local maintenance from CLI, e.g. "hosted-engine
> --set-maintenance --mode=local", then in CLI host's status shown correctly
> as "Local maintenance: True", in Cockpit it's status also shown correctly as
> "Local Maintenance: true", but in engine's WEBUI it's status partially
> correct as it appears without wrench symbol, but in correct status of
> "Hosted Engine HA:Local Maintenance Enabled".

Is this partially correct? This seems entirely correct. If we expect the engine's WEBUI to show a wrench for hosted-engine maintenance (which does not set VDSM maintenance), a separate bug should be filed.

> I see inconsistency of how host's status being shown in engnine's WEBUI
> between:
> 1)From Cockpit "Virtualization->Virtual Machines->Host to
> Maintenance"====>Shown in engine's WEBUI with symbol of a wrench and in
> local maintenance.
> 2)From Cockpit "Virtualization->Hosted Engine->Put this host into local
> maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> but in local maintenance.

See above -- VDSM maintenance and hosted-engine maintenance are not the same.

>
> If step 1 was done, then in Cockpit "Virtualization->Hosted Engine->Remove
> this host from maintenance", then engine's WEBUI not being synchronized with
> changes these changes at all and host appears in engine's WEBUI in local
> maintenance with a wrench symbol and "Hosted Engine HA: Local Maintenance
> Enabled" status.

Is it possible to remove a host from VDSM maintenance through hosted-engine? I don't think so... "Hosted Engine->Remove this host from maintenance" calls "hosted-engine --set-maintenance --mode=none".

> CLI's "hosted-engine --set-maintenance --mode=local" equals to Cockpit's
> "Virtualization->Hosted Engine->Put this host into local
> maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> but in local maintenance.
>
> CLI's "hosted-engine --set-maintenance --mode=local" is not the same as
> Cockpit's "Virtualization->Virtual Machines->Host to Maintenance"====>Shown
> in engine's WEBUI with symbol of a wrench and in local maintenance.

So, it's clear that the terminology here is confusing, since hosted-engine and VDSM mean different things when they refer to "Maintenance". Any suggestions here?
(In reply to Ryan Barry from comment #8)
> (In reply to Nikolai Sednev from comment #7)
> > I see these functionalities are working with a bit of delay now, I've
> > probably looked at them and thought they would make changes on the fly, but
> > they're a bit delayed and also not functioning the same way at all.
> >
> >
> > In Cockpit "Virtualization->Hosted Engine->Put this host into local
> > maintenance", then after some time (less than a minute) host's status
> > returns to "Local maintenance: True" in CLI, "Local Maintenance: true" in
> > Cockpit, but not changes it's symbol active symbol in engine's WEBUI to
> > wrench symbol, although it's status shown as "Hosted Engine HA:Local
> > Maintenance Enabled".
> >
> > If setting host via Cockpit into local maintenance via
> > "Virtualization->Virtual Machines->Host to Maintenance", then it changing
> > it's status everywhere properly, in Cockpit "Local Maintenance: true", in
> > CLI "Local maintenance: True" and in engine's WEBUI "Hosted Engine HA: Local
> > Maintenance Enabled" and also with a symbol of a wrench.
>
> This is expected -- VDSM maintenance also sets hosted-engine maintenance.

We have an open bug about the same flow in the opposite direction:
https://bugzilla.redhat.com/show_bug.cgi?id=1353600

> > If then trying to activate the host back via Cockpit by
> > "Virtualization->Hosted Engine->Remove this host from maintenance", then
> > after some time (less than a minute) host's status returns to active in CLI,
> > Cockpit "Local Maintenance:false", but not being synchronized with engine's
> > WEBUI in which it stays in "Hosted Engine HA:Local Maintenance Enabled" with
> > a symbol of a wrench.
>
> I think the question here is how engine polls/communicates with
> hosted-engine. I imagine that it connects to ovirt-ha-agent, but I don't
> know on what intervals.
>
> Simone?

The engine simply talks with VDSM as usual; on HE hosts, VDSM also knows the hosted-engine HA status from the HA agent.

> > If setting host in to local maintenance from CLI, e.g. "hosted-engine
> > --set-maintenance --mode=local", then in CLI host's status shown correctly
> > as "Local maintenance: True", in Cockpit it's status also shown correctly as
> > "Local Maintenance: true", but in engine's WEBUI it's status partially
> > correct as it appears without wrench symbol, but in correct status of
> > "Hosted Engine HA:Local Maintenance Enabled".
>
> Is this partially correct? This seems entirely correct. If we expect engines
> WEBUI to show a wrench for hosted-engine maintenance (which does not set
> VDSM maintenance), a separate bug should be filed.
>
>
> > I see inconsistency of how host's status being shown in engnine's WEBUI
> > between:
> > 1)From Cockpit "Virtualization->Virtual Machines->Host to
> > Maintenance"====>Shown in engine's WEBUI with symbol of a wrench and in
> > local maintenance.
> > 2)From Cockpit "Virtualization->Hosted Engine->Put this host into local
> > maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> > but in local maintenance.
>
> See above -- VDSM maintenance and hosted-engine maintenance are not the same.
>
> >
> > If step 1 was done, then in Cockpit "Virtualization->Hosted Engine->Remove
> > this host from maintenance", then engine's WEBUI not being synchronized with
> > changes these changes at all and host appears in engine's WEBUI in local
> > maintenance with a wrench symbol and "Hosted Engine HA: Local Maintenance
> > Enabled" status.
>
> Is it possible to remove a host from VDSM maintenance through hosted-engine?
> I don't think so... "Hosted Engine->Remove this host from maintenance" calls
> "hosted-engine --set-maintenance --mode=none".
>
> > CLI's "hosted-engine --set-maintenance --mode=local" equals to Cockpit's
> > "Virtualization->Hosted Engine->Put this host into local
> > maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> > but in local maintenance.
> >
> > CLI's "hosted-engine --set-maintenance --mode=local" is not the same as
> > Cockpit's "Virtualization->Virtual Machines->Host to Maintenance"====>Shown
> > in engine's WEBUI with symbol of a wrench and in local maintenance.
>
> So, it's clear that the terminology here is confusing, since hosted-engine
> and VDSM both mean different things when they refer to "Maintenance". Any
> suggestions here?

I'd suggest calling it hosted-engine local maintenance; then we also have the hosted-engine global maintenance mode.
Moving this to hosted-engine setup, as this is more about the right names for the maintenance modes. The cockpit UI will follow the names that he-setup suggests, so once those names are updated, the cockpit UI will follow.
(In reply to Fabian Deutsch from comment #10)
> Moving this to hosted-engine setup, as this is more about the right names
> for the maintenance modes.
>
> The cockpit UI will follow the names teh he-setup suggests, thus once those
> names are updated, the cockpit UI will follow.

Can you please split this bug in 2, one on HE and one on oVirt Cockpit?
Moving to Roy/SLA: maintenance modes are SLA domain.
This is about semantics of maintenance. Currently working as designed.
What is the action item on docs?
(In reply to Yaniv Dary from comment #14)
> What is the action item on docs?

Properly describe each maintenance mode and the interactions between them. The initial explanation is available in [1] and we should add standard host maintenance (unrelated to HE).

[1] https://www.ovirt.org/documentation/how-to/hosted-engine/#maintaining-the-setup
This bug had requires_doc_text flag, yet no documentation text was provided. Please add the documentation text and only then set this flag.
Hi Nikolai, To which versions of the product does this request apply? Can you please set the correct version in 'Version' field?
AFAIK it was 4.0.
Please also add how to switch back to None mode from Local and Global modes in the GUI.
Now published for beta due to the delayed GA: https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3-beta/html/administration_guide/chap-administering_the_self-hosted_engine#Maintaining_the_Self-Hosted_Engine