Bug 1487915 - Deployment of SHE fails with 'NoneType' object has no attribute 'values'.
Summary: Deployment of SHE fails with 'NoneType' object has no attribute 'values'.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: Plugins.General
Version: ---
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ovirt-4.2.0
: 2.2.0
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1488333
Blocks: 1429855 1434209 1457468 1479776
TreeView+ depends on / blocked
 
Reported: 2017-09-03 09:40 UTC by Nikolai Sednev
Modified: 2017-12-20 11:43 UTC (History)
4 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-12-20 11:43:23 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: blocker+
rule-engine: planning_ack+
sbonazzo: devel_ack+
mavital: testing_ack+


Attachments (Terms of Use)
Sosreport and setup log from alma03 (9.62 MB, application/x-gzip)
2017-09-03 09:40 UTC, Nikolai Sednev
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 81405 0 master POST ha: loop till a timeout connecting the broker 2017-09-04 13:36:38 UTC
oVirt gerrit 81608 0 master POST storage: VdsmBackend should not use Config directly 2017-09-11 08:36:48 UTC
oVirt gerrit 81609 0 master POST agent: Agent should always start domain monitoring. 2017-09-11 08:36:56 UTC

Description Nikolai Sednev 2017-09-03 09:40:20 UTC
Created attachment 1321473 [details]
Sosreport and setup log from alma03

Description of problem:
[ ERROR ] Failed to execute stage 'Closing up': 'NoneType' object has no attribute 'values'
[ ERROR ] Failed to execute stage 'Clean up': 
[ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue,fix and redeploy

20**FILTERED**7-09-03 **FILTERED**2:00:34,635+0300 DEBUG otopi.context context._executeMethod:**FILTERED**42 method ex
ception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line **FILTERED**32, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/storage/storage.py", line **FILTERED**044,
 in _cleanup
    self._stopMonitoringDomain()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/storage/storage.py", line 70**FILTERED**, 
in _stopMonitoringDomain
    self.environment[ohostedcons.StorageEnv.HOST_ID]
  File "/usr/lib/python2.7/site-packages/vdsm/jsonrpcvdscli.py", line **FILTERED**70, in _callMethod
    req, timeout=self._timeouts.get(method_name, timeout))
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 386, in call
    call = self.call_async(*reqs)
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 392, in call_async
    self.call_cb(call.callback, *reqs)
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 408, in call_cb
    self._transport.send(ctx.encode())
  File "/usr/lib/python2.7/site-packages/yajsonrpc/stompreactor.py", line 6**FILTERED**4, in send
    headers,
  File "/usr/lib/python2.7/site-packages/yajsonrpc/stompreactor.py", line 448, in send
    raise stomp.Disconnected()
Disconnected
20**FILTERED**7-09-03 **FILTERED**2:00:34,637+0300 ERROR otopi.context context._executeMethod:**FILTERED**5**FILTERED*
* Failed to execute stage 'Clean up': 


Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.0-0.0.master.20170831115238.git6d9b02e.el7.centos.noarch

How reproducible:
100%
Reproduced on several different clean RHEL7.4 hosts.

Steps to Reproduce:
1.Deploy SHE over NFS on RHEL7.4 over NFS.

Actual results:
Failed to execute stage 'Closing up': 'NoneType' object has no attribute 'values'

Expected results:
Deployment should succeed.

Additional info:
Components and deployment details available from http://pastebin.test.redhat.com/513207.
Sosreport and setup log attached.

Comment 1 Yaniv Kaul 2017-09-03 10:53:46 UTC
Does it work on iSCSI? Did it work in a previous build? Was it caught in tier 1 automation as well?

Comment 2 Nikolai Sednev 2017-09-03 11:25:49 UTC
(In reply to Yaniv Kaul from comment #1)
> Does it work on iSCSI?
I did not tested it on it yet, only on NFS.
> Did it work in a previous build?
Yes.
> Was it caught in tier 1 automation as well?
Kobi, can you please provide us with the info regarding this?

Comment 3 Doron Fediuck 2017-09-03 13:47:36 UTC
Looking at the logs it happens when waiting for engine to come up-
[ INFO  ] Waiting for engine to start...
[ ERROR ] Failed to execute stage 'Closing up': 'NoneType' object has no attribute 'values'

so it may be an issue with the daemon response.

Comment 4 Kobi Hakimi 2017-09-03 14:41:36 UTC
Last time I ran tier1 it was with upgrade flow [1] and it runs as expected.
Today I ran cleaner, reprovision and deploy on hosted engine environment [2] and got the same error


[1] - https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-upgrade-flow-tier1/14/console
[2] - https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2_dev/job/rhv-4.2-ge-flow-infra/212/console

Comment 5 Simone Tiraboschi 2017-09-04 07:50:50 UTC
The root error is hosted-engine-setup logs is here:

2017-09-03 12:00:34,084+0300 INFO otopi.plugins.gr_he_common.core.misc misc.engine_vm_up_check:161 Waiting for engine to start...
Cannot connect to the HA daemon, please check the logs.
[Errno 2] No such file or directory
Cannot connect to the HA daemon, please check the logs.
2017-09-03 12:00:34,089+0300 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/core/misc.py", line 164, in engine_vm_up_check
    if vmstatus.get_status()['engine_vm_up']:
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 170, in get_status
    for host in status['all_host_stats'].values():
AttributeError: 'NoneType' object has no attribute 'values'



So hosted-engine-setup fails getting VM status from ovirt-ha-agent.
If we check ovirt-ha-broker logs indeed we see that's full of: 

Listener::ERROR::2017-09-03 12:13:46,083::storage_broker::147::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /var/run/vdsm/storage/699db583-f5b0-43bd-b769-91ab0627f5d5/a8318a91-f70a-4456-9778-34931a913541/aa45d041-350d-4023-9d70-040f3e4a1395
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 138, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)

Comment 6 Denis Chaplygin 2017-09-04 13:12:44 UTC
It seems to be a race condition between setup and broker:
2017-09-04 14:51:21,925+0200 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/bin/sys
temctl', 'start', 'ovirt-ha-broker.service') stderr:


2017-09-04 14:51:21,927+0200 DEBUG otopi.context context._executeMethod:128 Stage closeup METHOD otopi.plugins.gr_he_
common.core.misc.Plugin._persist_files_start
2017-09-04 14:51:21,928+0200 DEBUG otopi.context context._executeMethod:134 condition False
2017-09-04 14:51:21,929+0200 DEBUG otopi.context context._executeMethod:128 Stage closeup METHOD otopi.plugins.gr_he_
common.core.misc.Plugin.engine_vm_up_check
2017-09-04 14:51:21,930+0200 INFO otopi.plugins.gr_he_common.core.misc misc.engine_vm_up_check:161 Waiting for engine
 to start...
Cannot connect to the HA daemon, please check the logs.
[Errno 2] No such file or directory
Cannot connect to the HA daemon, please check the logs.
2017-09-04 14:51:21,934+0200 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/core/misc.py", line 164, in engine_vm_up_check
    if vmstatus.get_status()['engine_vm_up']:
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 172, in get_status
    for host in status['all_host_stats'].values():
AttributeError: 'NoneType' object has no attribute 'values'
2017-09-04 14:51:21,935+0200 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Closing up': 'NoneType' object has no attribute 'values'



Setup starts broker and immediately makes a request for it. Unfortunately broker validates storage mounts, symlinks and other stuff on start-up, so it make take up to several seconds before broker will be operational. 

I would recommend to make several (3-5 or 10) retries with 5 second delays between them.

Comment 7 Simone Tiraboschi 2017-09-04 16:28:45 UTC
With https://gerrit.ovirt.org/#/c/81405/ hosted-engine-setup doesn't fail and it continue to poll but ovirt-ha-agent still fails connecting the broker:

MainThread::ERROR::2017-09-04 18:24:23,334::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 434, in start_monitoring
    if self._broker and self._broker.is_connected():
AttributeError: 'BrokerLink' object has no attribute 'is_connected'
MainThread::ERROR::2017-09-04 18:24:23,335::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2017-09-04 18:24:23,335::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
MainThread::INFO::2017-09-04 18:24:33,692::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.2.0-0.0.master.20170901100749 started

Comment 8 Denis Chaplygin 2017-09-05 07:33:41 UTC
It actually fails to disconnect. This should not be a problem for setup and normal operation. Anyway, i'm going to fix it today.

Comment 9 Red Hat Bugzilla Rules Engine 2017-09-06 06:16:21 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 10 Red Hat Bugzilla Rules Engine 2017-09-11 08:26:04 UTC
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.

Comment 11 Nikolai Sednev 2017-09-17 07:45:13 UTC
Successfully deployed ovirt-hosted-engine-setup-2.2.0-0.0.master.20170913082146.git120d800.el7.centos.noarch with pre-installed rhvm-appliance-1:4.1.20170811.0-1.el7.

Moving to verified as initial issue not being reproduced anymore and works fine.

Comment 12 Nikolai Sednev 2017-09-17 10:47:30 UTC
Refining my previous comment #11:
Deployment was a success with ovirt-engine-appliance-4.2-20170916.1.el7.centos.noarch installed on host running with RHEL7.4 and ovirt-hosted-engine-setup-2.2.0-0.0.master.20170913082146.git120d800.el7.centos.noarch.

Comment 13 Sandro Bonazzola 2017-12-20 11:43:23 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.