Created attachment 1239018 [details]
info_cockpit.png

Description of problem:
After adding an additional host via the engine web UI, Cockpit shows no information about that host, so the number of hosts in the cluster cannot be determined from Cockpit.

Version-Release number of selected component (if applicable):
redhat-virtualization-host-4.1-0.20170104.0
cockpit-ovirt-dashboard-0.10.7-0.0.3.el7ev.noarch
cockpit-ws-126-1.el7.x86_64
imgbased-0.9.2-0.1.el7ev.noarch
20170103.0-1.el7ev.4.0.rpm (ovirt-engine-appliance rpm)
ovirt-hosted-engine-ha-2.1.0-0.0.master.git118000f.el7ev.noarch
ovirt-host-deploy-1.6.0-0.2.master.gitb76ad50.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0-0.0.master.git46cacd3.el7ev.noarch

How reproducible:
100%

Not a regression.

Keywords: UI

Steps to Reproduce:
1. Install RHVH 4.1.
2. Log in to the Cockpit web UI at hostIP:9090 with the root account.
3. Ensure the engine appliance is pre-installed.
4. Deploy HE via Cockpit step by step.
5. Add the additional host via the engine web UI.

Actual results:
After step 5, Cockpit shows no information about the additional host.

Expected results:
After step 5, Cockpit should show information about the additional host.

Additional info:
Created attachment 1239019 [details] engine_ui.png
Created attachment 1239020 [details] additional_host_cockpit_ui.png
Created attachment 1239023 [details] sosreport,/var/log/*
Created attachment 1239024 [details] HE_VM_engine.log
What is the output of hosted-engine --vm-status?
(In reply to Ryan Barry from comment #5)
> What is the output of hosted-engine --vm-status?

Just "Host 1 status". Info:

[root@dell-per730-34 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : dell-per730-34.lab.eng.pek2.redhat.com
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : a6075095
local_conf_timestamp               : 1835
Host timestamp                     : 1824
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=1824 (Wed Jan 11 12:49:21 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=1835 (Wed Jan 11 12:49:32 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False

[root@dell-per730-34 ~]#
Simone, any ideas on how this is supposed to work?
From the log it seems that the autoimport process for the hosted-engine storage domain wasn't completed when Yihui started adding the second host.

2017-01-09 06:47:01,998 WARN  [org.ovirt.engine.core.bll.storage.domain.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-6-thread-15) [728a7e3d] Validation of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_MASTER_STORAGE_DOMAIN_NOT_ACTIVE
2017-01-09 06:47:01,998 INFO  [org.ovirt.engine.core.bll.storage.domain.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-6-thread-15) [728a7e3d] Lock freed to object 'EngineLock:{exclusiveLocks='[d0a0306c-276f-480b-88a2-782131ffda5f=<STORAGE, ACTION_TYPE_FAILED_STORAGE_DEVICE_LOCKED>]', sharedLocks='null'}'
2017-01-09 06:47:02,187 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [26ea599a] Correlation ID: 26ea599a, Call Stack: null, Custom Event ID: -1, Message: Installing Host yzhao. Enrolling certificate.
2017-01-09 06:47:03,279 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [26ea599a] Correlation ID: 26ea599a, Call Stack: null, Custom Event ID: -1, Message: Installing Host yzhao. Enrolling

The flow is:
1. deploy hosted-engine with hosted-engine-setup on the first host
2. add the first regular storage domain and wait for the datacenter to go up
3. wait for the hosted-engine storage domain and for the engine VM to be successfully imported
4. deploy the additional host, explicitly choosing to deploy it as a hosted-engine host

hosted-engine --vm-status will then also show the status of that host.
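As a rough way to script step 3 (waiting for the hosted-engine storage domain import), one could poll the engine REST API until the domain shows up. This is a minimal sketch only, not part of the product tooling; it assumes the default hosted_storage domain name for the auto-imported domain, this setup's engine FQDN, and PASSWORD as a placeholder for the admin password:

# Minimal sketch: wait until the hosted-engine storage domain has been
# auto-imported before adding the second host.
ENGINE_FQDN=a.redhat.com
until curl -s -k -u admin@internal:PASSWORD \
        "https://${ENGINE_FQDN}/ovirt-engine/api/storagedomains" \
        | grep -q '<name>hosted_storage</name>'; do
    echo "hosted_storage not imported yet, retrying..."
    sleep 10
done
echo "hosted_storage domain is present; safe to add the additional host"

Note that this only checks that the domain exists; the engine UI (or the events list) remains the authoritative place to confirm the engine VM import completed.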
Not really related, but also this looks a bit strange:

2017-01-09 19:38:33 DEBUG otopi.context context.dumpEnvironment:770 ENV OVEHOSTED_ENGINE/appHostName=str:'localhost.localdomain'

Yihui, could you please execute on your first host
python -c 'import socket; print(socket.gethostname())'
and share your output here?
(In reply to Simone Tiraboschi from comment #9)
> Not really related, but also this looks a bit strange:
> 2017-01-09 19:38:33 DEBUG otopi.context context.dumpEnvironment:770 ENV
> OVEHOSTED_ENGINE/appHostName=str:'localhost.localdomain'
>
> Yihui, could you please execute on your first host
> python -c 'import socket; print(socket.gethostname())'
> and share your output here?

The output:

localhost.localdomain
(In reply to Yihui Zhao from comment #10)
> The output:
>
> localhost.localdomain

Pretty bad. Did you set a hostname for it from Cockpit?
Could you please share your /etc/hosts file from that host?
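For context, socket.gethostname() returns the kernel hostname (the same value the hostname command prints); it does not read /etc/hosts directly. A quick way to compare the two views on the host is sketched below, assuming stock python and systemd's hostnamectl, both present on RHVH:

# The kernel hostname, the value picked up as appHostName during deploy:
python -c 'import socket; print(socket.gethostname())'
# The fully-qualified name after resolution (this is where /etc/hosts matters):
python -c 'import socket; print(socket.getfqdn())'
# If both still report localhost.localdomain, set a static hostname first, e.g.:
hostnamectl set-hostname dell-per730-34.lab.eng.pek2.redhat.com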
(In reply to Simone Tiraboschi from comment #11)
> (In reply to Yihui Zhao from comment #10)
> > The output:
> >
> > localhost.localdomain
>
> Pretty bad. Did you set a hostname for it from Cockpit?
> Could you please share your /etc/hosts file from that host?

Hi Simone,

I retried the HE deployment and hit this issue again. Some info, if needed:

1. First host:

[root@dell-per730-34 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : dell-per730-34.lab.eng.pek2.redhat.com
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 3a3adbf2
local_conf_timestamp               : 5164
Host timestamp                     : 5152
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=5152 (Thu Jan 12 11:25:53 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=5164 (Thu Jan 12 11:26:04 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False

[root@dell-per730-34 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.73.131.132 a.redhat.com
10.73.131.63 dell-per730-34.lab.eng.pek2.redhat.com
10.73.130.225 ibm-x3650m5-04.lab.eng.pek2.redhat.com

[root@dell-per730-34 ~]# hostname
dell-per730-34.lab.eng.pek2.redhat.com

[root@dell-per730-34 ~]# cat /etc/hostname
dell-per730-34.lab.eng.pek2.redhat.com

[root@dell-per730-34 ~]# python -c 'import socket; print(socket.gethostname())'
dell-per730-34.lab.eng.pek2.redhat.com

2. Additional host:

[root@ibm-x3650m5-04 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.73.131.132 a.redhat.com
10.73.131.63 dell-per730-34.lab.eng.pek2.redhat.com
10.73.130.225 ibm-x3650m5-04.lab.eng.pek2.redhat.com

[root@ibm-x3650m5-04 ~]# hostname
ibm-x3650m5-04.lab.eng.pek2.redhat.com

[root@ibm-x3650m5-04 ~]# cat /etc/hostname
ibm-x3650m5-04.lab.eng.pek2.redhat.com

[root@ibm-x3650m5-04 ~]# python -c 'import socket; print(socket.gethostname())'
ibm-x3650m5-04.lab.eng.pek2.redhat.com

3. HE VM:

[root@a ~]# hostname
a.redhat.com

[root@a ~]# cat /etc/hostname
a.redhat.com

[root@a ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.73.131.132 a.redhat.com
10.73.131.63 dell-per730-34.lab.eng.pek2.redhat.com
10.73.130.225 ibm-x3650m5-04.lab.eng.pek2.redhat.com
This is not reproducible upstream on CentOS.
Created attachment 1240010 [details] Deploy hosted engine host
Yihui,
did you explicitly choose to deploy your second host as a hosted-engine host, as in this picture?
https://bugzilla.redhat.com/attachment.cgi?id=1240010
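For completeness, the same choice can be made when adding the host through the engine REST API, via the deploy_hosted_engine parameter of the host add operation. A minimal sketch only, using this setup's engine FQDN and host names, with PASSWORD and HOST_ROOT_PASSWORD as placeholders:

# Minimal sketch: add the second host as a hosted-engine host via REST
# instead of the web UI checkbox shown in the attachment.
curl -s -k -u admin@internal:PASSWORD \
    -H 'Content-Type: application/xml' \
    -d '<host>
          <name>ibm-x3650m5-04</name>
          <address>ibm-x3650m5-04.lab.eng.pek2.redhat.com</address>
          <root_password>HOST_ROOT_PASSWORD</root_password>
        </host>' \
    'https://a.redhat.com/ovirt-engine/api/hosts?deploy_hosted_engine=true'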
(In reply to Simone Tiraboschi from comment #15)
> Yihui,
> did you explicitly choose to deploy your second host as a hosted-engine host, as in this picture?
> https://bugzilla.redhat.com/attachment.cgi?id=1240010

Thanks for your help. I tried again following your steps and could no longer reproduce the issue.

Thanks,
Simone
Per comment 16, closing the bug.