Bug 2210534 - 'Revert OVN migration' procedure fails on checking server network availability
Summary: 'Revert OVN migration' procedure fails on checking server network availability
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 17.1
Assignee: Yatin Karel
QA Contact: Roman Safronov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-05-28 08:54 UTC by Roman Safronov
Modified: 2023-12-15 04:26 UTC (History)
CC List: 6 users

Fixed In Version: openstack-neutron-18.6.1-1.20230518200965.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-16 01:15:24 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-25421 0 None None None 2023-05-28 08:55:03 UTC
Red Hat Product Errata RHEA-2023:4577 0 None None None 2023-08-16 01:15:46 UTC

Description Roman Safronov 2023-05-28 08:54:45 UTC
Description of problem:

'Revert OVN migration' procedure fails on the overcloud_update stage during execution of the tripleo_nodes_validation tasks (PLAY [Server network validation]). On several nodes the default IPv4 gateway is unreachable; see the errors from the overcloud_deploy log:

2023-05-27 02:11:08.862553 | 52540048-e52d-8a7e-f8d7-0000000030e9 |      FATAL | Check Default IPv4 Gateway availability | compute-1 | error={"attempts": 10, "changed": false, "cmd": ["ping", "-w", "10", "-c", "5", "10.0.0.1"], "delta": "0:00:01.196542", "end": "2023-05-27 02:11:08.815152", "msg": "non-zero return code", "rc": 1, "start": "2023-05-27 02:11:07.618610", "stderr": "", "stderr_lines": [], "stdout": "PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.\nFrom 10.0.0.131 icmp_seq=1 Destination Host Unreachable\nFrom 10.0.0.131 icmp_seq=2 Destination Host Unreachable\n\n--- 10.0.0.1 ping statistics ---\n2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1062ms\npipe 2", "stdout_lines": ["PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.", "From 10.0.0.131 icmp_seq=1 Destination Host Unreachable", "From 10.0.0.131 icmp_seq=2 Destination Host Unreachable", "", "--- 10.0.0.1 ping statistics ---", "2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1062ms", "pipe 2"]}
2023-05-27 02:11:08.863904 | 52540048-e52d-8a7e-f8d7-0000000030e9 |     TIMING | tripleo_nodes_validation : Check Default IPv4 Gateway availability | compute-1 | 0:11:26.476122 | 618.96s
2023-05-27 02:11:14.020416 | 52540048-e52d-8a7e-f8d7-0000000030e9 |      FATAL | Check Default IPv4 Gateway availability | compute-0 | error={"attempts": 10, "changed": false, "cmd": ["ping", "-w", "10", "-c", "5", "10.0.0.1"], "delta": "0:00:02.908385", "end": "2023-05-27 02:11:13.975070", "msg": "non-zero return code", "rc": 1, "start": "2023-05-27 02:11:11.066685", "stderr": "", "stderr_lines": [], "stdout": "PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.\nFrom 10.0.0.119 icmp_seq=1 Destination Host Unreachable\nFrom 10.0.0.119 icmp_seq=2 Destination Host Unreachable\nFrom 10.0.0.119 icmp_seq=3 Destination Host Unreachable\n\n--- 10.0.0.1 ping statistics ---\n3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2070ms\npipe 3", "stdout_lines": ["PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.", "From 10.0.0.119 icmp_seq=1 Destination Host Unreachable", "From 10.0.0.119 icmp_seq=2 Destination Host Unreachable", "From 10.0.0.119 icmp_seq=3 Destination Host Unreachable", "", "--- 10.0.0.1 ping statistics ---", "3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2070ms", "pipe 3"]}
2023-05-27 02:11:14.021456 | 52540048-e52d-8a7e-f8d7-0000000030e9 |     TIMING | tripleo_nodes_validation : Check Default IPv4 Gateway availability | compute-0 | 0:11:31.633684 | 624.31s
2023-05-27 02:11:20.627932 | 52540048-e52d-8a7e-f8d7-0000000030e9 |      FATAL | Check Default IPv4 Gateway availability | networker-2 | error={"attempts": 10, "changed": false, "cmd": ["ping", "-w", "10", "-c", "5", "10.0.0.1"], "delta": "0:00:01.102085", "end": "2023-05-27 02:11:20.581607", "msg": "non-zero return code", "rc": 1, "start": "2023-05-27 02:11:19.479522", "stderr": "", "stderr_lines": [], "stdout": "PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.\nFrom 10.0.0.128 icmp_seq=1 Destination Host Unreachable\nFrom 10.0.0.128 icmp_seq=2 Destination Host Unreachable\n\n--- 10.0.0.1 ping statistics ---\n2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1031ms\npipe 2", "stdout_lines": ["PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.", "From 10.0.0.128 icmp_seq=1 Destination Host Unreachable", "From 10.0.0.128 icmp_seq=2 Destination Host Unreachable", "", "--- 10.0.0.1 ping statistics ---", "2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1031ms", "pipe 2"]}
2023-05-27 02:11:20.629618 | 52540048-e52d-8a7e-f8d7-0000000030e9 |     TIMING | tripleo_nodes_validation : Check Default IPv4 Gateway availability | networker-2 | 0:11:38.241837 | 631.33s
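
For reference, the failing check can be reproduced manually on an affected node; it is the same plain ping against the default IPv4 gateway that the tripleo_nodes_validation task runs (10.0.0.1 in this environment; the gateway address is deployment-specific):

    # Same command the 'Check Default IPv4 Gateway availability' task executes:
    ping -w 10 -c 5 10.0.0.1
    # Healthy node: 0% packet loss. On the affected nodes (compute-0, compute-1,
    # networker-2) it returns rc=1 with "Destination Host Unreachable".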


Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230525.n.1
python3-neutron-18.6.1-1.20230518200958.da43b03.el9ost.noarch

How reproducible:
100%, when the tempest and tobiko stages have been run.
When the d/s CI job is configured to skip the tempest/tobiko stages and run only the 'ovn migration' and then 'restore ovs' stages, the issue does not happen.

Steps to Reproduce:

Found by d/s ovs2ovn CI jobs that perform the following scenario:
1. Deploy OVS environment
2. Run tempest neutron and tobiko create-resources
3. Create a backup of the control plane nodes
4. Perform the migration from OVS to OVN according to the official procedure:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.0/html/testing_migration_of_the_networking_service_to_the_ml2ovn_mechanism_driver/migrating-ovs-to-ovn
5. Run tempest neutron and tobiko check-resources
6. Restore control plane nodes from backup
7. Run the /usr/share/ansible/neutron-ovn-migration/playbooks/revert.yml playbook
8. Run the initial overcloud deploy script (the same one used in step 1) to update the overcloud back to OVS (see the example invocation below)
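
For illustration, a minimal sketch of steps 7-8 as run from the undercloud; the inventory path and deploy script name are assumptions and will differ per environment:

    source ~/stackrc
    # Revert the OVN migration (step 7); inventory path is an assumption
    ansible-playbook -i ~/overcloud-deploy/overcloud/tripleo-ansible-inventory.yaml \
        /usr/share/ansible/neutron-ovn-migration/playbooks/revert.yml
    # Re-run the same deploy script used for the initial OVS deployment (step 8);
    # the script name below is hypothetical
    ./overcloud_deploy.sh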

Actual results:
overcloud deploy script fails on server network validation

Expected results:
overcloud deploy script passes

Additional info:

Comment 12 Roman Safronov 2023-06-08 12:51:50 UTC
Verified that the issue does not happen on RHOS-17.1-RHEL-9-20230607.n.2 with openstack-neutron-ovn-migration-tool-18.6.1-1.20230518200966.el9ost.noarch

Comment 20 errata-xmlrpc 2023-08-16 01:15:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577

Comment 21 Red Hat Bugzilla 2023-12-15 04:26:18 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

