Bug 1783978 - [OSP16] Minor update from beta fails on overcloud update, 'Failed running container for neutron'
Keywords:
Status: CLOSED DUPLICATE of bug 1790467
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-neutron
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Kamil Sambor
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-12-16 11:51 UTC by Roman Safronov
Modified: 2020-01-21 14:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-21 14:40:21 UTC
Target Upstream Version:
Embargoed:



Description Roman Safronov 2019-12-16 11:51:55 UTC
Description of problem:
The minor update CI job failed at the overcloud update stage.

https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-update-16_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/17/


From the job log:

TASK [tripleo-upgrade : was the overcloud minor update successful?] ************
task path: /home/rhos-ci/jenkins/workspace/DFG-network-networking-ovn-update-16_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/infrared/plugins/tripleo-upgrade/infrared_plugin/roles/tripleo-upgrade/tasks/update/overcloud_update_run.yml:16
Monday 16 December 2019  00:23:05 +0000 (0:00:04.630)       0:15:55.722 ******* 
FAILED - RETRYING: was the overcloud minor update successful? (25200 retries left).
FAILED - RETRYING: was the overcloud minor update successful? (25199 retries left).
FAILED - RETRYING: was the overcloud minor update successful? (25198 retries left).
...

FAILED - RETRYING: was the overcloud minor update successful? (25023 retries left).
FAILED - RETRYING: was the overcloud minor update successful? (25022 retries left).
FAILED - RETRYING: was the overcloud minor update successful? (25021 retries left).
failed: [undercloud-0] (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label': u'Controller', u'ansible_job_id': u'639298699136.57241', 'failed': False, u'started': 1, 'changed': True, 'item': u'Controller', u'finished': 0, u'results_file': u'/home/stack/.ansible_async/639298699136.57241', '_ansible_ignore_errors': True, '_ansible_no_log': False}) => {
    "ansible_job_id": "639298699136.57241", 
    "async_result_item": {
        "ansible_job_id": "639298699136.57241", 
        "changed": true, 
        "failed": false, 
        "finished": 0, 
        "item": "Controller", 
        "results_file": "/home/stack/.ansible_async/639298699136.57241", 
        "started": 1
    }, 
    "attempts": 181, 
    "changed": true, 
    "cmd": "set -o pipefail\n bash /home/stack/overcloud_update_run-Controller.sh 2>&1 | awk '{ print strftime(\"%Y-%m-%d %H:%M:%S |\"), $0; fflush(); }' > /home/stack/overcloud_update_run_Controller.log", 
    "delta": "0:16:42.581016", 
    "end": "2019-12-16 00:39:44.525195", 
    "finished": 1, 
    "rc": 1, 
    "start": "2019-12-16 00:23:01.944179"
}

MSG:

non-zero return code




From overcloud_update_run_Controller.log:

2019-12-16 00:39:44 |         "<13>Dec 16 00:39:39 puppet-user: Error: Function lookup() did not find a value for the name 'service_names'",
2019-12-16 00:39:44 |         " attempt(s): 3",
2019-12-16 00:39:44 |         "2019-12-16 00:39:42,560 WARNING: 1011854 -- Retrying running container: neutron",
2019-12-16 00:39:44 |         "2019-12-16 00:39:42,560 ERROR: 1011854 -- Failed running container for neutron",
2019-12-16 00:39:44 |         "2019-12-16 00:39:42,560 INFO: 1011854 -- Finished processing puppet configs for neutron",
2019-12-16 00:39:44 |         "2019-12-16 00:39:42,562 ERROR: 1011837 -- ERROR configuring neutron"
2019-12-16 00:39:44 |     ]
2019-12-16 00:39:44 | }
2019-12-16 00:39:44 | 
2019-12-16 00:39:44 | PLAY RECAP *********************************************************************
2019-12-16 00:39:44 | controller-0               : ok=246  changed=110  unreachable=0    failed=1    skipped=560  rescued=0    ignored=2   
2019-12-16 00:39:44 | controller-1               : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
2019-12-16 00:39:44 | controller-2               : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
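
The PLAY RECAP above is what identifies controller-0 as the failing node. As a rough sketch of how that check could be scripted against overcloud_update_run_Controller.log, the snippet below extracts hosts whose recap line reports failed > 0. The function name `failed_hosts` and the regular expression are illustrative assumptions based only on the line format shown in this report; they are not part of tripleo-upgrade or infrared.

```python
import re

# Matches timestamp-prefixed recap lines such as:
#   2019-12-16 00:39:44 | controller-0 : ok=246 changed=110 ... failed=1 ...
# The pattern is an assumption derived from the excerpt in this report.
RECAP_LINE = re.compile(
    r"\|\s*(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$"
)


def failed_hosts(log_text):
    """Return {host: failed_count} for hosts whose recap shows failed > 0."""
    failed = {}
    for line in log_text.splitlines():
        match = RECAP_LINE.search(line)
        if not match:
            continue
        # Turn "ok=246 changed=110 ..." into {"ok": "246", "changed": "110", ...}
        counters = dict(pair.split("=") for pair in match.group("counters").split())
        count = int(counters.get("failed", 0))
        if count > 0:
            failed[match.group("host")] = count
    return failed


sample = """\
2019-12-16 00:39:44 | PLAY RECAP *********************************************
2019-12-16 00:39:44 | controller-0               : ok=246  changed=110  unreachable=0    failed=1    skipped=560  rescued=0    ignored=2
2019-12-16 00:39:44 | controller-1               : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
"""

print(failed_hosts(sample))  # {'controller-0': 1}
```

Only the recap lines are inspected, so the earlier puppet and container error lines in the same log do not affect the result.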



neutron/server.log on controller-0 contains many different errors, beginning with these:
2019-12-16 00:26:45.497 31 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: no response to inactivity probe after 60 seconds, disconnecting
2019-12-16 00:26:45.498 31 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: connection dropped
2019-12-16 00:26:45.499 29 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: no response to inactivity probe after 60 seconds, disconnecting
2019-12-16 00:26:45.499 29 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: connection dropped
2019-12-16 00:26:45.500 27 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: no response to inactivity probe after 60 seconds, disconnecting
2019-12-16 00:26:45.501 27 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: connection dropped
2019-12-16 00:26:45.501 29 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6641: no response to inactivity probe after 60 seconds, disconnecting
2019-12-16 00:26:45.501 29 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6641: connection dropped
2019-12-16 00:26:45.501 30 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6641: no response to inactivity probe after 60 seconds, disconnecting
2019-12-16 00:26:45.502 30 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6641: connection dropped
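
To see at a glance which OVSDB endpoints dropped, the inactivity probe errors above can be tallied per endpoint. The following is a minimal sketch that assumes the log line format shown in this excerpt; `probe_timeouts` is a hypothetical helper, not an ovsdbapp or neutron API.

```python
import re
from collections import Counter

# Matches the ERROR lines from the excerpt, e.g.
#   ... ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: no response to inactivity probe ...
# The pattern is an assumption based only on the lines quoted in this report.
PROBE_TIMEOUT = re.compile(
    r"ERROR ovsdbapp\.backend\.ovs_idl\.vlog \[-\] "
    r"(?P<endpoint>tcp:[\d.]+:\d+): no response to inactivity probe"
)


def probe_timeouts(log_text):
    """Count inactivity probe timeouts per OVSDB endpoint."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PROBE_TIMEOUT.search(line)
        if match:
            counts[match.group("endpoint")] += 1
    return counts


sample = """\
2019-12-16 00:26:45.497 31 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: no response to inactivity probe after 60 seconds, disconnecting
2019-12-16 00:26:45.498 31 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: connection dropped
2019-12-16 00:26:45.499 29 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6642: no response to inactivity probe after 60 seconds, disconnecting
2019-12-16 00:26:45.501 29 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:172.17.1.128:6641: no response to inactivity probe after 60 seconds, disconnecting
"""

for endpoint, count in sorted(probe_timeouts(sample).items()):
    print(endpoint, count)
```

The INFO "connection dropped" lines are deliberately skipped so that each disconnect is counted once.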




Version-Release number of selected component (if applicable):
I ran a minor update from passed_phase2 (which is also Beta-1.0, i.e. RHOS_TRUNK-16.0-RHEL-8-20191206.n.1) to passed_phase1 (RHOS_TRUNK-16.0-RHEL-8-20191213.n.5).



How reproducible:
Unknown. This was the first attempt to update between these versions, and the issue occurred on that attempt.


Steps to Reproduce:
1. Run the following CI job: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-update-16_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/


Actual results:
The CI job fails at the overcloud update stage: the neutron container fails to run while controller-0 is being updated.

Expected results:
The minor update completes successfully.

Additional info:

