Bug 2014025 - VM instances on compute-0 unable to contact metadata
Summary: VM instances on compute-0 unable to contact metadata
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Slawek Kaplonski
QA Contact: Eduardo Olivares
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-14 10:02 UTC by Eduardo Olivares
Modified: 2022-03-23 22:12 UTC (History)

Fixed In Version: openstack-neutron-15.3.5-2.20220113150030.94e4cbb.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-23 22:12:19 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1947993 | 0 | None | None | None | 2021-10-21 09:33:25 UTC
OpenStack gerrit | 814892 | 0 | None | NEW | Remove router_info from agent's cache when processing failed | 2021-10-21 10:10:29 UTC
Red Hat Issue Tracker | OSP-10418 | 0 | None | None | None | 2021-11-12 09:08:25 UTC
Red Hat Product Errata | RHBA-2022:1001 | 0 | None | None | None | 2022-03-23 22:12:39 UTC

Description Eduardo Olivares 2021-10-14 10:02:55 UTC
Description of problem:
This issue has been reproduced on a virtualized OSP16.2 environment with two compute nodes and with ML2/OVS and DVR enabled.

It was reproduced running the tobiko scenario tests with "tox -e scenario".

Several VMs were created on a common tenant network, with a router connecting them to the external network. Some of these VMs were spawned on compute-0 and others on compute-1. Those on compute-0 could not obtain information from the metadata service, while metadata worked fine for those on compute-1.


We found that the following rule was missing from the iptables rules in the corresponding qrouter namespace on compute-0:
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697

The rule was present on compute-1.

The issue was resolved by restarting the neutron_l3_agent container on compute-0.
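
The check for the missing rule can be automated. The following is a minimal sketch, not part of the original report: it scans the text output of iptables-save for the metadata REDIRECT rule quoted above. The function names has_metadata_redirect and dump_nat_rules are hypothetical, and the sketch assumes it runs as root on the compute node, where the qrouter namespace is visible to "ip netns".

```python
import subprocess

# Values taken from the rule quoted in this report.
METADATA_IP = "169.254.169.254"
METADATA_PROXY_PORT = "9697"
L3_PREROUTING_CHAIN = "neutron-l3-agent-PREROUTING"


def has_metadata_redirect(nat_rules: str) -> bool:
    """Return True if iptables-save text contains the metadata REDIRECT rule.

    Hypothetical helper: it only inspects text, so it can be run against a
    saved dump from either compute node.
    """
    for line in nat_rules.splitlines():
        tokens = line.split()
        if (
            tokens[:2] == ["-A", L3_PREROUTING_CHAIN]
            and f"{METADATA_IP}/32" in tokens
            and "REDIRECT" in tokens
            and METADATA_PROXY_PORT in tokens
        ):
            return True
    return False


def dump_nat_rules(namespace: str) -> str:
    """Dump the nat table of a network namespace.

    Assumption: executed as root on the compute node hosting the namespace.
    """
    result = subprocess.run(
        ["ip", "netns", "exec", namespace, "iptables-save", "-t", "nat"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On an affected node, calling has_metadata_redirect(dump_nat_rules(namespace)) with the name of the qrouter namespace would be expected to return False until the L3 agent is restarted. Working on text rather than live state also allows comparing dumps saved from compute-0 and compute-1.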



Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20211006.n.1
openstack-neutron-15.3.5-2.20210608154815.el8ost.4.noarch


How reproducible:
Infrequently, but I believe it intermittently affects some tobiko tests during the create-resources stage.
For example, test_ssh failed here because a cirros instance could not reach the metadata service:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/staging/DFG-network-networking-ovn-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-ml2ovs-to-ovn-migration/12/infrared/.workspaces/workspace_2021-10-04_11-14-05/tobiko_create-resources/tobiko_create-resources_create_resources_scenario.html


Steps to Reproduce:
1. Run the tobiko create-resources test stage.

Comment 15 errata-xmlrpc 2022-03-23 22:12:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1001

