Bug 2129882
| Summary: | [OVN][16.1] VM status spawning and ERROR on CI jenkins job | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Fiorella Yanac <fyanac> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Miguel Lavalle <mlavalle> |
| Status: | CLOSED ERRATA | QA Contact: | Roman Safronov <rsafrono> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 16.1 (Train) | CC: | averdagu, chrisw, eolivare, jschluet, mburns, mlavalle, ralonsoh, rsafrono, scohen, skaplons, spower, ykarel |
| Target Milestone: | z9 | Keywords: | AutomationBlocker, Regression, Triaged |
| Target Release: | 16.1 (Train on RHEL 8.2) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-11.3.2-1.20221013153257.29a02c1.el8ost python-networking-ovn-7.3.1-1.20221013173227.4e24f4c.el8ost | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-12-07 20:29:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2139775 | ||
| Bug Blocks: | |||
As shown here https://paste.openstack.org/show/bCjPy9xz3n5za2vMn6eZ/, after the migration from ml2/ovs to ml2/ovn, OVN is left configured to use br-migration instead of br-int. The consequence is the behavior described here when VMs are created: https://paste.openstack.org/show/bG4LfSkK6LgwDV1obJ1p/.

To fix this, I need to port https://review.opendev.org/c/openstack/neutron/+/848000 to:
https://github.com/openstack/networking-ovn/blob/stable/train/migration/tripleo_environment/playbooks/ovn-migration.yml
https://github.com/openstack/networking-ovn/blob/stable/train/migration/tripleo_environment/playbooks/roles/tripleo-update/templates/generate-ovn-extras.sh.j2

And also port https://review.opendev.org/c/openstack/tripleo-heat-templates/+/847999 to the stable/train branch.

Upstream backports:
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/860609
https://review.opendev.org/c/openstack/networking-ovn/+/860610

Done

For the rhos-16.1-patches branches:
https://code.engineering.redhat.com/gerrit/c/openstack-tripleo-heat-templates/+/432117
https://code.engineering.redhat.com/gerrit/c/networking-ovn/+/432118

Impossible to verify until BZ2139775 - [OVN][16.1] Migration from OVS to OVN hangs on "Sync neutron db with OVN db" task is fixed.

Moving back to MODIFIED as python-networking-ovn-7.3.1-1.20221013173227.4e24f4c.el8ost is not yet in a compose, so it cannot be tested.

Verified on RHOS-16.1-RHEL-8-20221116.n.1 with openstack-tripleo-heat-templates-11.3.2-1.20221013153259.el8ost.noarch and python3-networking-ovn-migration-tool-7.3.1-1.20221013173227.4e24f4c.el8ost.noarch.
Verified by running the ovs2ovn d/s CI job with a full tempest run after migrating to OVN. There were no issues with spawning VMs after migrating to OVN.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: Red Hat OpenStack 16.1.9 (openstack-tripleo-heat-templates) security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:8796
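The root cause in the first comment (OVN still pointing at br-migration rather than br-int after the migration) can be checked on a node with a small script. The following is only a sketch under assumptions: it assumes the setting in question is the `ovn-bridge` key in the Open_vSwitch `external_ids` column, which is where ovn-controller reads its integration bridge name, and the helper names (`parse_ovn_bridge`, `read_ovn_bridge`, `bridge_is_expected`) are hypothetical and not part of the migration tooling.

```python
import subprocess

# Expected integration bridge after a completed migration (from the bug text).
EXPECTED_BRIDGE = "br-int"


def parse_ovn_bridge(raw):
    """Normalize `ovs-vsctl get` output: drop whitespace and optional quotes."""
    return raw.strip().strip('"')


def bridge_is_expected(raw, expected=EXPECTED_BRIDGE):
    """Return True when the reported bridge name matches the expected one."""
    return parse_ovn_bridge(raw) == expected


def read_ovn_bridge():
    """Query the local Open vSwitch database for the OVN integration bridge.

    Assumption: the value lives in external_ids:ovn-bridge. Needs ovs-vsctl
    on the PATH and sufficient privileges; raises CalledProcessError if the
    key is not set.
    """
    result = subprocess.run(
        ["ovs-vsctl", "get", "Open_vSwitch", ".", "external_ids:ovn-bridge"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ovn_bridge(result.stdout)
```

On a migrated compute or controller node, `bridge_is_expected(read_ovn_bridge())` returning False would be consistent with the leftover br-migration configuration described above.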
I went through all the jobs under OSP16.1 Phase 3.

All the jobs named "*ml2ovs-to-ovn-migration_*" have multiple failures with the message "failed to reach ACTIVE status and task state "None" within the required time", which seems to be the subject of this bz. I am going to focus on this failure.

Other jobs (not named "*ml2ovs-to-ovn-migration_*") report other failures, but with few test cases failing. Some of those failures repeat across jobs:

1) neutron_plugin.tests.scenario.test_dvr_ovn.OvnDvrAdvancedTest.test_dvr_vip_failover

    Traceback (most recent call last):
      File "/home/stack/plugins/tempest_neutron_plugin/neutron_plugin/tests/scenario/test_dvr_ovn.py", line 703, in test_dvr_vip_failover
        self.assertCountEqual(expected_routing_nodes, actual_routing_nodes)
      File "/usr/lib64/python3.6/unittest/case.py", line 1200, in assertCountEqual
        self.fail(msg)
      File "/usr/lib/python3.6/site-packages/unittest2/case.py", line 693, in fail
        raise self.failureException(msg)
    AssertionError: Element counts were not equal:
    First has 1, Second has 0: 'compute-1'
    First has 1, Second has 0: 'compute-0'

2) neutron_plugin.tests.scenario.test_internal_dns.InternalDNSInterruptionsAdvancedTestOvn.test_ovn_dns_name_after_networker_reboot

        ssh = self._get_ssh_connection()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 128, in _get_ssh_connection
        password=self.password)
    tempest.lib.exceptions.SSHTimeout: Connection to the 10.0.0.204 via SSH timed out. User: cloud-user, Password: None
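For reading the first traceback: `assertCountEqual` compares two sequences as multisets, and "First has 1, Second has 0" means the element appears once in the first argument (here `expected_routing_nodes`) and not at all in the second (`actual_routing_nodes`). The sketch below reproduces the same message shape with made-up lists; the node lists are hypothetical (in particular the `controller-0` entry) and are not taken from the failing job.

```python
import unittest

# Hypothetical data: the test expected traffic on both computes,
# but neither showed up in the observed routing nodes.
expected_routing_nodes = ["controller-0", "compute-0", "compute-1"]
actual_routing_nodes = ["controller-0"]

case = unittest.TestCase()
try:
    # Order is ignored; only per-element counts are compared.
    case.assertCountEqual(expected_routing_nodes, actual_routing_nodes)
    message = ""
except AssertionError as exc:
    message = str(exc)

print(message)
```

Elements that match on both sides (the hypothetical `controller-0`) do not appear in the message, so the traceback alone does not show whether the actual list was empty or merely missing the two compute nodes.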