Bug 1646691 - There is no clear indication of the failure root cause when config_download_timeout is reached
Summary: There is no clear indication of the failure root cause when config_download_t...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: 14.0 (Rocky)
Assignee: Thomas Hervé
QA Contact: Artem Hrechanychenko
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-11-05 20:57 UTC by Marius Cornea
Modified: 2019-01-11 11:54 UTC (History)
6 users (show)

Fixed In Version: openstack-tripleo-common-9.4.1-0.20181012010877.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-11 11:54:32 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1801902 0 None None None 2018-11-06 10:19:21 UTC
OpenStack gerrit 615653 0 'None' MERGED Fix config-download timeout 2020-12-30 13:19:16 UTC
Red Hat Product Errata RHEA-2019:0045 0 None None None 2019-01-11 11:54:38 UTC

Description Marius Cornea 2018-11-05 20:57:48 UTC
Description of problem:

When deployment times out because config_download_timeout is reached there is no clear indication why the failure occurred and also the ansible-playbook process keeps running despite the deploy process already failed.

openstack overcloud deploy shows:

[...]

TASK [Run puppet host configuration for step 4] ********************************
Monday 05 November 2018  14:50:08 -0500 (0:00:00.376)       1:02:03.510 ******* 
changed: [ceph-1] => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": true}

Ansible failed, check log at /var/lib/mistral/overcloud/ansible.log.


Checking /var/lib/mistral/overcloud/ansible.log there is no failure:

(undercloud) [stack@undercloud-0 ~]$ tail -20 /var/lib/mistral/overcloud/ansible.log

2018-11-05 14:52:28,268 p=2030 u=mistral |  TASK [Check if /var/lib/docker-container-startup-configs.json already exists] ***
2018-11-05 14:52:28,269 p=2030 u=mistral |  Monday 05 November 2018  14:52:28 -0500 (0:00:00.129)       1:04:23.601 ******* 
2018-11-05 14:52:29,154 p=2030 u=mistral |  ok: [compute-1] => {"changed": false, "stat": {"atime": 1541445893.8447168, "attr_flags": "", "attributes": [], "block_size": 4096, "blocks": 40, "charset": "us-ascii", "checksum": "3912b2dc4f205987f32494f2be48c69fb2f94820", "ctime": 1541444527.103556, "dev": 64514, "device_type": 0, "executable": false, "exists": true, "gid": 0, "gr_name": "root", "inode": 71303346, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mimetype": "text/plain", "mode": "0600", "mtime": 1541444526.733556, "nlink": 1, "path": "/var/lib/docker-container-startup-configs.json", "pw_name": "root", "readable": true, "rgrp": false, "roth": false, "rusr": true, "size": 17498, "uid": 0, "version": "18446744071911736316", "wgrp": false, "woth": false, "writeable": true, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}}
2018-11-05 14:52:29,234 p=2030 u=mistral |  ok: [ceph-1] => {"changed": false, "stat": {"atime": 1541445894.124508, "attr_flags": "", "attributes": [], "block_size": 4096, "blocks": 8, "charset": "us-ascii", "checksum": "4d9644b34d6525683fad19a44eba7d47eabaee2e", "ctime": 1541444526.7821229, "dev": 64514, "device_type": 0, "executable": false, "exists": true, "gid": 0, "gr_name": "root", "inode": 2097924, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mimetype": "text/plain", "mode": "0600", "mtime": 1541444526.4689665, "nlink": 1, "path": "/var/lib/docker-container-startup-configs.json", "pw_name": "root", "readable": true, "rgrp": false, "roth": false, "rusr": true, "size": 1477, "uid": 0, "version": "18446744071703452371", "wgrp": false, "woth": false, "writeable": true, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}}
2018-11-05 14:52:29,286 p=2030 u=mistral |  ok: [controller-2] => {"changed": false, "stat": {"atime": 1541445893.6623106, "attr_flags": "", "attributes": [], "block_size": 4096, "blocks": 280, "charset": "us-ascii", "checksum": "21ebc25e2b943adb4d2aa26cc51ceb9952bc69d0", "ctime": 1541444526.6314545, "dev": 64514, "device_type": 0, "executable": false, "exists": true, "gid": 0, "gr_name": "root", "inode": 56623269, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mimetype": "text/plain", "mode": "0600", "mtime": 1541444526.2672725, "nlink": 1, "path": "/var/lib/docker-container-startup-configs.json", "pw_name": "root", "readable": true, "rgrp": false, "roth": false, "rusr": true, "size": 141598, "uid": 0, "version": "1829233034", "wgrp": false, "woth": false, "writeable": true, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}}
2018-11-05 14:52:29,315 p=2030 u=mistral |  ok: [compute-0] => {"changed": false, "stat": {"atime": 1541445893.9683187, "attr_flags": "", "attributes": [], "block_size": 4096, "blocks": 40, "charset": "us-ascii", "checksum": "3912b2dc4f205987f32494f2be48c69fb2f94820", "ctime": 1541444527.086664, "dev": 64514, "device_type": 0, "executable": false, "exists": true, "gid": 0, "gr_name": "root", "inode": 73400487, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mimetype": "text/plain", "mode": "0600", "mtime": 1541444526.697664, "nlink": 1, "path": "/var/lib/docker-container-startup-configs.json", "pw_name": "root", "readable": true, "rgrp": false, "roth": false, "rusr": true, "size": 17498, "uid": 0, "version": "18446744072302513362", "wgrp": false, "woth": false, "writeable": true, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}}


We can only see the timeout error in /var/log/containers/mistral/executor.log:

 3800 seconds: Timeout: 3800 seconds
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor Traceback (most recent call last):
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/mistral/executors/default_executor.py", line 114, in run_action
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor     result = action.run(action_ctx)
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/tripleo_common/actions/ansible.py", line 558, in run
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor     for line in iter(process.stdout.readline, b''):
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor   File "/usr/lib64/python2.7/socket.py", line 447, in readline
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor     data = self._sock.recv(self._rbufsize)
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/eventlet/greenio/py2.py", line 160, in recv
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor     data = os.read(self._fileno, buflen)
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/eventlet/green/os.py", line 50, in read
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor     hubs.trampoline(fd, read=True)
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py", line 162, in trampoline
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor     return hub.switch()
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor     return self.greenlet.switch()
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor Timeout: 3800 seconds
2018-11-05 14:51:22.533 1 ERROR mistral.executors.default_executor 

Version-Release number of selected component (if applicable):
2018-10-30.6 puddle

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with --timeout option set to give enough time for config download to finish


Actual results:
Deployment fails without clear failure reason. Also ansible-playbook process used for running deploy_steps_playbook.yaml keeps running:

42430     385709  9.2  0.5 415288 166400 ?       Sl   13:48  11:56 /usr/bin/python2 /bin/ansible-playbook -v /var/lib/mistral/overcloud/deploy_steps_playbook.yaml --become --timeout 240 --inventory-file /var/lib/mistral/overcloud/tripleo-ansible-inventory.yaml


Expected results:
Give a clear indication of the failure reason and clear the processes triggered during deploy.

Additional info:

Comment 2 Bob Fournier 2018-11-16 14:52:10 UTC
Patch was included when building pkg for a different patch, setting FIV and moving to MODIFIED.

Comment 20 errata-xmlrpc 2019-01-11 11:54:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045


Note You need to log in before you can comment on or make changes to this bug.