Bug 1966476

Summary: Undercloud FFU 13->16.2 fails during 'Check Controllers availability'
Product: Red Hat OpenStack
Reporter: Luca Miccini <lmiccini>
Component: openstack-tripleo-heat-templates
Assignee: Jose Luis Franco <jfrancoa>
Status: CLOSED ERRATA
QA Contact: Jason Grosso <jgrosso>
Severity: urgent
Docs Contact: Vlada Grosu <vgrosu>
Priority: urgent
Version: 16.2 (Train)
CC: aschultz, bdobreli, gchamoul, gregraka, jamsmith, jfrancoa, jjoyce, jpretori, jschluet, kecarter, lbezdick, ltoscano, mburns, michele, omcgonag, sgolovat, shrjoshi, slinaber, spower, tvignaud, vgrosu
Target Milestone: beta
Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: x86_64
OS: Linux
Fixed In Version: openstack-tripleo-heat-templates-11.5.1-2.20210603174816.el8ost.4
Doc Type: If docs needed, set a value
Last Closed: 2021-09-15 07:15:41 UTC
Type: Bug
Bug Blocks: 1921112    

Description Luca Miccini 2021-06-01 09:04:22 UTC
Description of problem:

During a fast forward upgrade (FFU) from 13 to 16.2, the undercloud upgrade step fails with:

2021-06-01 08:47:21 | PLAY [Server deployments] ******************************************************
2021-06-01 08:47:21 | 2021-06-01 08:47:21.083944 | 5254002d-d3be-786f-775c-000000000067 |       TASK | Basic Network Validation
2021-06-01 08:47:21 | 2021-06-01 08:47:21.137364 | 5254002d-d3be-786f-775c-000000000067 |     TIMING | Basic Network Validation | undercloud-0 | 0:00:37.215935 | 0.05s
2021-06-01 08:47:21 | 2021-06-01 08:47:21.208819 | 5254002d-d3be-786f-775c-000000000427 |       TASK | Collect default network fact
2021-06-01 08:47:21 | 2021-06-01 08:47:21.565184 | 5254002d-d3be-786f-775c-000000000427 |         OK | Collect default network fact | undercloud-0
2021-06-01 08:47:21 | 2021-06-01 08:47:21.565915 | 5254002d-d3be-786f-775c-000000000427 |     TIMING | tripleo_nodes_validation : Collect default network fact | undercloud-0 | 0:00:37.644485 | 0.36s
2021-06-01 08:47:21 | 2021-06-01 08:47:21.623284 | 5254002d-d3be-786f-775c-000000000428 |       TASK | Check Default IPv4 Gateway availability
2021-06-01 08:47:21 | 2021-06-01 08:47:21.827982 | 5254002d-d3be-786f-775c-000000000428 |         OK | Check Default IPv4 Gateway availability | undercloud-0
2021-06-01 08:47:21 | 2021-06-01 08:47:21.828751 | 5254002d-d3be-786f-775c-000000000428 |     TIMING | tripleo_nodes_validation : Check Default IPv4 Gateway availability | undercloud-0 | 0:00:37.907321 | 0.21s
2021-06-01 08:47:21 | 2021-06-01 08:47:21.886261 | 5254002d-d3be-786f-775c-000000000429 |       TASK | Check Controllers availability
2021-06-01 08:47:22 | 2021-06-01 08:47:22.090857 | 5254002d-d3be-786f-775c-000000000429 |      FATAL | Check Controllers availability | undercloud-0 | item=192.168.24.1 | error={"ansible_loop_var": "controller", "changed": false, "cmd": ["ping", "-w", "10", "-c", "1", "192.168.24.1"], "controller": "192.168.24.1", "delta": "0:00:00.003011", "end": "2021-06-01 08:47:22.045013", "msg": "non-zero return code", "rc": 1, "start": "2021-06-01 08:47:22.042002", "stderr": "", "stderr_lines": [], "stdout": "PING 192.168.24.1 (192.168.24.1) 56(84) bytes of data.\nFrom 172.16.0.1 icmp_seq=1 Destination Port Unreachable\n\n--- 192.168.24.1 ping statistics ---\n1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms", "stdout_lines": ["PING 192.168.24.1 (192.168.24.1) 56(84) bytes of data.", "From 172.16.0.1 icmp_seq=1 Destination Port Unreachable", "", "--- 192.168.24.1 ping statistics ---", "1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms"]}
2021-06-01 08:47:22 | 2021-06-01 08:47:22.091645 | 5254002d-d3be-786f-775c-000000000429 |     TIMING | tripleo_nodes_validation : Check Controllers availability | undercloud-0 | 0:00:38.170219 | 0.21s
2021-06-01 08:47:22 | 2021-06-01 08:47:22.092998 | 5254002d-d3be-786f-775c-000000000429 |     TIMING | tripleo_nodes_validation : Check Controllers availability | undercloud-0 | 0:00:38.171579 | 0.21s
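For triage, the error payload on the FATAL line above can be decoded as JSON. The following is a minimal sketch, not part of the original report: it assumes the payload is everything after "error=" on the line, and the helper names parse_fatal_error and summarize are hypothetical. The sample line is the FATAL entry from this log with the error dictionary trimmed to three keys.

```python
import json


def parse_fatal_error(line):
    """Return the decoded error dict from a FATAL log line, or None.

    Assumption: the JSON payload is everything after 'error=' on the line.
    """
    marker = "error="
    idx = line.find(marker)
    if "FATAL" not in line or idx == -1:
        return None
    return json.loads(line[idx + len(marker):])


def summarize(err):
    """Build a one-line summary: command, return code, last stdout line."""
    cmd = " ".join(err.get("cmd", []))
    lines = err.get("stdout_lines") or [""]
    return f"{cmd} -> rc={err.get('rc')}: {lines[-1]}"


# FATAL entry from the log above, with the error dict trimmed for brevity.
sample = (
    "2021-06-01 08:47:22 | 2021-06-01 08:47:22.090857 | "
    "5254002d-d3be-786f-775c-000000000429 |      FATAL | "
    "Check Controllers availability | undercloud-0 | item=192.168.24.1 | "
    'error={"cmd": ["ping", "-w", "10", "-c", "1", "192.168.24.1"], "rc": 1, '
    '"stdout_lines": ["From 172.16.0.1 icmp_seq=1 Destination Port Unreachable", '
    '"1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms"]}'
)

if __name__ == "__main__":
    print(summarize(parse_fatal_error(sample)))
```

The summary makes the relevant facts easy to spot: the validation pinged 192.168.24.1 once, the command returned 1, and no reply was received.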


The check fails because br-ctlplane is not up (state DOWN, no IPv4 address assigned):


1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: em0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
    link/ether 52:54:00:46:60:32 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5054:ff:fe46:6032/64 scope link 
       valid_lft forever preferred_lft forever
3: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:2d:d3:be brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.84/24 brd 172.16.0.255 scope global dynamic noprefixroute em1
       valid_lft 3355sec preferred_lft 3355sec
    inet6 fe80::5054:ff:fe2d:d3be/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
4: em2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:80:fb:db brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.6/24 brd 10.0.0.255 scope global dynamic noprefixroute em2
       valid_lft 3547sec preferred_lft 3547sec
    inet6 2620:52:0:13b8::fe:79/128 scope global dynamic noprefixroute 
       valid_lft 1989sec preferred_lft 1989sec
    inet6 fe80::5054:ff:fe80:fbdb/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6a:44:f4:31:6a:a6 brd ff:ff:ff:ff:ff:ff
8: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 7e:da:3c:5a:16:45 brd ff:ff:ff:ff:ff:ff
9: br-ctlplane: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 52:54:00:46:60:32 brd ff:ff:ff:ff:ff:ff
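
The same condition can be detected mechanically from an interface listing like the one above. This is a rough sketch under stated assumptions, not something from the original report: it assumes the text has the layout shown above (an "N: name: <FLAGS> ... state X" header line followed by indented address lines), and the names parse_interfaces and down_without_ipv4 are hypothetical. The sample is a two-interface excerpt of the listing in this report.

```python
import re

# Header line of each interface stanza, e.g.
# "9: br-ctlplane: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN ..."
IFACE_RE = re.compile(r"^(\d+): ([^:@\s]+)[^:]*: <([^>]*)>.* state (\S+)")


def parse_interfaces(text):
    """Map interface name -> {"state": str, "flags": set, "ipv4": [str]}."""
    ifaces = {}
    current = None
    for line in text.splitlines():
        match = IFACE_RE.match(line)
        if match:
            current = match.group(2)
            ifaces[current] = {
                "state": match.group(4),
                "flags": set(match.group(3).split(",")),
                "ipv4": [],
            }
        elif current and line.strip().startswith("inet "):
            # "inet <addr>/<prefix> ..."; the trailing space skips inet6 lines.
            ifaces[current]["ipv4"].append(line.split()[1])
    return ifaces


def down_without_ipv4(ifaces):
    """Names of interfaces that are DOWN and carry no IPv4 address."""
    return sorted(
        name
        for name, info in ifaces.items()
        if info["state"] == "DOWN" and not info["ipv4"]
    )


# Excerpt of the listing above (em1 and br-ctlplane only).
sample = """\
3: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:2d:d3:be brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.84/24 brd 172.16.0.255 scope global dynamic noprefixroute em1
9: br-ctlplane: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 52:54:00:46:60:32 brd ff:ff:ff:ff:ff:ff
"""

if __name__ == "__main__":
    print(down_without_ipv4(parse_interfaces(sample)))
```

On the full listing the filter would also report ovs-system and br-int, which are DOWN without an address as well; br-ctlplane is the one that matters for the failing ping to 192.168.24.1.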

Comment 1 Luca Miccini 2021-06-01 09:10:59 UTC
FWIW, running openstack undercloud upgrade --no-validations does not help: the --no-validations option seems to be ignored.

Comment 2 Alex Schultz 2021-06-02 15:15:37 UTC
These validations aren't controlled by --no-validations. There are specific flags in the heat templates to disable them. I'm not certain why this is being run for the undercloud, however, because we shouldn't be doing node validations for the undercloud.

Comment 9 Lukas Bezdicka 2021-06-08 06:28:58 UTC
UpgradeLeappToInstall: ['openvswitch2.13','ovn2.13']

Comment 28 errata-xmlrpc 2021-09-15 07:15:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2
enhancement advisory), and where to find the updated files, follow the
link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483