Bug 1907319

Summary: tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_router_rescheduling failed when there is a DOWN state L3 agent.
Product: Red Hat OpenStack
Reporter: Keigo Noha <knoha>
Component: openstack-tempest
Assignee: Lukas Piwowarski <lpiwowar>
Status: CLOSED EOL
QA Contact: Martin Kopec <mkopec>
Severity: low
Priority: low
Version: 13.0 (Queens)
CC: apevec, lhh, slinaber, udesale
Keywords: Triaged, ZStream
Hardware: x86_64
OS: Linux
Last Closed: 2023-07-11 20:55:46 UTC
Type: Bug

Description Keigo Noha 2020-12-14 08:39:17 UTC
Description of problem:
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_router_rescheduling fails when an L3 agent is in the DOWN (administratively disabled) state.

Normally, all L3 agents are in the UP state. However, a user who adds networker nodes to reduce the workload on the controller nodes may no longer want the controllers to host L3 agents, and then needs to disable the L3 agents on the controller nodes.

In that case, tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_router_rescheduling fails.


Version-Release number of selected component (if applicable):
Current tempest in RHOSP13, 16 and upstream.

How reproducible:
Every time

Steps to Reproduce:
1. Deploy an overcloud with 3 Controller, 3 Networker, and 1 Compute nodes.
2. Disable all agents on the controller nodes.
3. Run tempest for tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_router_rescheduling

Actual results:
The test fails.

Expected results:
The test passes.

Additional info:
The current implementation uses the following logic to select the agents:

~~~
    def test_router_rescheduling(self):
        """Tests that router can be removed from agent and add to a new agent.

        1. Verify connectivity
        2. Remove router from all l3-agents
        3. Verify connectivity is down
        4. Assign router to new l3-agent (or old one if no new agent is
         available)
        5. Verify connectivity
        """

        # TODO(yfried): refactor this test to be used for other agents (dhcp)
        # as well

        list_hosts = (self.os_admin.routers_client.
                      list_l3_agents_hosting_router)
        schedule_router = (self.os_admin.network_agents_client.
                           create_router_on_l3_agent)
        unschedule_router = (self.os_admin.network_agents_client.
                             delete_router_from_l3_agent)

        agent_list_alive = set(
            a["id"] for a in
            self.os_admin.network_agents_client.list_agents(
                agent_type="L3 agent")['agents'] if a["alive"] is True
        )
        self._setup_network_and_servers()
~~~

agent_list_alive holds the IDs of the L3 agents that are alive.
However, it should also check whether admin_state_up is True, so that disabled agents are excluded.

I modified the code as shown below, and it works well.
~~~
    def test_router_rescheduling(self):
        """Tests that router can be removed from agent and add to a new agent.

        1. Verify connectivity
        2. Remove router from all l3-agents
        3. Verify connectivity is down
        4. Assign router to new l3-agent (or old one if no new agent is
         available)
        5. Verify connectivity
        """

        # TODO(yfried): refactor this test to be used for other agents (dhcp)
        # as well

        list_hosts = (self.os_admin.routers_client.
                      list_l3_agents_hosting_router)
        schedule_router = (self.os_admin.network_agents_client.
                           create_router_on_l3_agent)
        unschedule_router = (self.os_admin.network_agents_client.
                             delete_router_from_l3_agent)

        agent_list_alive = set(
            a["id"] for a in
            self.os_admin.network_agents_client.list_agents(
                agent_type="L3 agent")['agents']
            if a["alive"] is True and a["admin_state_up"] is True
        )
~~~
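
For illustration only, here is a minimal standalone sketch of the same filter, run against hand-written sample data instead of a live deployment. The helper name `schedulable_l3_agent_ids` and the sample agent entries are hypothetical; only the `id`, `agent_type`, `alive` and `admin_state_up` keys are taken from the code above.

```python
def schedulable_l3_agent_ids(agents):
    """Return the IDs of L3 agents that are alive and administratively up.

    `agents` is assumed to be a list of dicts shaped like the entries of
    the 'agents' list used in the test above.
    """
    return {
        a["id"] for a in agents
        if a.get("agent_type") == "L3 agent"
        and a.get("alive") is True
        and a.get("admin_state_up") is True
    }


# Hypothetical sample: one disabled agent on a controller, two agents on
# networker nodes (one of them not alive), and one unrelated DHCP agent.
sample_agents = [
    {"id": "ctrl-0", "agent_type": "L3 agent",
     "alive": True, "admin_state_up": False},
    {"id": "net-0", "agent_type": "L3 agent",
     "alive": True, "admin_state_up": True},
    {"id": "net-1", "agent_type": "L3 agent",
     "alive": False, "admin_state_up": True},
    {"id": "dhcp-0", "agent_type": "DHCP agent",
     "alive": True, "admin_state_up": True},
]

print(sorted(schedulable_l3_agent_ids(sample_agents)))  # prints ['net-0']
```

With only the `alive` check, the disabled controller agent `ctrl-0` would also be selected, which matches the failure described in this report.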

By the way, in RHOSP 13 the same test is shipped in both the python-tempest package and the neutron-tests-tempest package. To avoid maintaining it twice, I think we should backport https://review.opendev.org/c/openstack/tempest/+/640767.

Comment 1 Lon Hohberger 2023-07-11 20:55:46 UTC
OSP 13 was retired on June 27, 2023. No further work is expected to occur on this issue.