Description of problem:

When running a test on an OCP on OSP environment, the namespace isolation test fails: the connectivity between 2 namespaces succeeds when it should not.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create 2 namespaces, each with a pod and a service
2. From a pod in one namespace, try to reach the pod/service in the other namespace
3.

Actual results:
The connectivity between the 2 namespaces succeeds

Expected results:
The connectivity between the 2 namespaces should fail

Additional info:
For me it happens in about 50% of attempts when running our automation test from kuryr-tempest-plugin.

Debug done by our R&D:

Pod IPs and SGs:

$ openstack port list | grep -e 10.11.11.11 -e 10.11.12.13
| 1d6ee8b6-7112-4dcd-b5ce-da2e47c36b00 | | fa:16:3e:51:45:cb | ip_address='10.11.11.11', subnet_id='40c2e3b5-ebe9-466a-bf28-841b226f6a36' | ACTIVE |
| 264d16ae-40d6-4ff2-a2e2-2af23951dab4 | | fa:16:3e:98:b1:8d | ip_address='10.11.12.13', subnet_id='3064009d-f447-4d83-98db-a3b2aee69cb4' | ACTIVE |

$ openstack port show 1d6ee8b6-7112-4dcd-b5ce-da2e47c36b00
+-----------------------+------------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                            |
+-----------------------+------------------------------------------------------------------------------------------------------------------+
| admin_state_up        | UP |
| allowed_address_pairs | |
| binding_host_id       | compute-0.localdomain |
| binding_profile       | |
| binding_vif_details   | datapath_type='system', ovs_hybrid_plug='False', port_filter='True' |
| binding_vif_type      | ovs |
| binding_vnic_type     | normal |
| created_at            | 2019-03-12T15:27:59Z |
| data_plane_status     | None |
| description           | |
| device_id             | |
| device_owner          | trunk:subport |
| dns_assignment        | None |
| dns_domain            | None |
| dns_name              | None |
| extra_dhcp_opts       | |
| fixed_ips             | ip_address='10.11.11.11', subnet_id='40c2e3b5-ebe9-466a-bf28-841b226f6a36' |
| id                    | 1d6ee8b6-7112-4dcd-b5ce-da2e47c36b00 |
| mac_address           | fa:16:3e:51:45:cb |
| name                  | |
| network_id            | 0482639a-0d91-41c7-af63-504ae65992a4 |
| port_security_enabled | True |
| project_id            | a6e6e3aafb5c4847918c9c86074c9d1a |
| qos_policy_id         | None |
| revision_number       | 5 |
| security_group_ids    | 0c86db57-aac2-40ec-9cd3-c2de252eb291, 870ad8ad-f548-4bbc-99c5-787c7633d405, 9b37f7be-da6d-4a7b-9841-08e5f7d9da45 |
| status                | ACTIVE |
| tags                  | |
| trunk_details         | None |
| updated_at            | 2019-03-12T15:28:12Z |
+-----------------------+------------------------------------------------------------------------------------------------------------------+

$ openstack port show 264d16ae-40d6-4ff2-a2e2-2af23951dab4
+-----------------------+------------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                            |
+-----------------------+------------------------------------------------------------------------------------------------------------------+
| admin_state_up        | UP |
| allowed_address_pairs | |
| binding_host_id       | compute-0.localdomain |
| binding_profile       | |
| binding_vif_details   | datapath_type='system', ovs_hybrid_plug='False', port_filter='True' |
| binding_vif_type      | ovs |
| binding_vnic_type     | normal |
| created_at            | 2019-03-12T15:28:10Z |
| data_plane_status     | None |
| description           | |
| device_id             | |
| device_owner          | trunk:subport |
| dns_assignment        | None |
| dns_domain            | None |
| dns_name              | None |
| extra_dhcp_opts       | |
| fixed_ips             | ip_address='10.11.12.13', subnet_id='3064009d-f447-4d83-98db-a3b2aee69cb4' |
| id                    | 264d16ae-40d6-4ff2-a2e2-2af23951dab4 |
| mac_address           | fa:16:3e:98:b1:8d |
| name                  | |
| network_id            | 313b52de-fda8-454f-aa6a-0ba7071838bb |
| port_security_enabled | True |
| project_id            | a6e6e3aafb5c4847918c9c86074c9d1a |
| qos_policy_id         | None |
| revision_number       | 5 |
| security_group_ids    | 7bf37e7e-4942-4e72-b009-2d236c89b272, 870ad8ad-f548-4bbc-99c5-787c7633d405, 9b37f7be-da6d-4a7b-9841-08e5f7d9da45 |
| status                | ACTIVE |
| tags                  | |
| trunk_details         | None |
| updated_at            | 2019-03-12T15:28:21Z |
+-----------------------+------------------------------------------------------------------------------------------------------------------+

LoadBalancer SG:

$ openstack security group show 68d81bbc-ce2f-4a2a-bcf8-61eed09cc247 | grep ingress
| rules | created_at='2019-03-12T15:29:40Z', description='test/demo:TCP:80', direction='ingress', ethertype='IPv4', id='24c9bb8a-e61b-4633-b761-8c163e017b2b', port_range_max='80', port_range_min='80', protocol='tcp', remote_ip_prefix='192.168.99.0/24', updated_at='2019-03-12T15:29:40Z' |
|       | created_at='2019-03-12T15:29:39Z', description='test/demo:TCP:80', direction='ingress', ethertype='IPv4', id='2a8cc114-c46d-432f-89ec-563d29e02173', port_range_max='80', port_range_min='80', protocol='tcp', remote_group_id='aeb7071e-9af4-4383-b8b9-7ac7c8b36c40', updated_at='2019-03-12T15:29:39Z' |
|       | created_at='2019-03-12T15:29:40Z', description='test/demo:TCP:80', direction='ingress', ethertype='IPv4', id='3ff46605-7b07-4749-b9eb-c002ba95c006', port_range_max='80', port_range_min='80', protocol='tcp', remote_ip_prefix='172.30.0.0/16', updated_at='2019-03-12T15:29:40Z' |
|       | created_at='2019-03-12T15:29:39Z', description='test/demo:TCP:80', direction='ingress', ethertype='IPv4', id='6fcea2b7-134a-48f0-80ed-ae8391e78f83', port_range_max='80', port_range_min='80', protocol='tcp', remote_group_id='0c86db57-aac2-40ec-9cd3-c2de252eb291', updated_at='2019-03-12T15:29:39Z' |

List of SG ids:

$ openstack security group list | grep 0c86db57-aac2-40ec-9cd3-c2de252eb291
| 0c86db57-aac2-40ec-9cd3-c2de252eb291 | ns/test-sg | | a6e6e3aafb5c4847918c9c86074c9d1a | [] |
(shiftstack) [stack@undercloud-0 ~]$ openstack security group list | grep 7bf37e7e-4942-4e72-b009-2d236c89b272
| 7bf37e7e-4942-4e72-b009-2d236c89b272 | ns/test2-sg | | a6e6e3aafb5c4847918c9c86074c9d1a | [] |
(shiftstack) [stack@undercloud-0 ~]$ openstack security group list | grep 9b37f7be-da6d-4a7b-9841-08e5f7d9da45
| 9b37f7be-da6d-4a7b-9841-08e5f7d9da45 | openshift-ansible-openshift.example.com-allow_from_default | Give access to the services and pods from the default namespace | a6e6e3aafb5c4847918c9c86074c9d1a | [] |
(shiftstack) [stack@undercloud-0 ~]$ openstack security group list | grep 870ad8ad-f548-4bbc-99c5-787c7633d405
| 870ad8ad-f548-4bbc-99c5-787c7633d405 | openshift-ansible-openshift.example.com-pod-service-secgrp | Give services and nodes access to the pods | a6e6e3aafb5c4847918c9c86074c9d1a | [] |
(shiftstack) [stack@undercloud-0 ~]$ openstack security group list | grep aeb7071e-9af4-4383-b8b9-7ac7c8b36c40
| aeb7071e-9af4-4383-b8b9-7ac7c8b36c40 | openshift-ansible-openshift.example.com-allow_from_namespace | Give access to the services and pods on the default namespace from the other namespaces | a6e6e3aafb5c4847918c9c86074c9d1a | [] |

curl to the loadbalancer VIP works from port 10.11.11.11, as expected, since that port has sg_id 0c86db57-aac2-40ec-9cd3-c2de252eb291 and that is allowed by the loadbalancer SG rules. But it also works from 10.11.12.13, which should not work, as it is not allowed by any of the ingress rules on the loadbalancer SG (the port is not in 192.168.99.0/24, nor in 172.30.0.0/16, and it does not include sg ids aeb7071e-9af4-4383-b8b9-7ac7c8b36c40 nor 0c86db57-aac2-40ec-9cd3-c2de252eb291).
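For reference, a rough manual equivalent of the isolation check (the namespace names, the kuryr/demo image, its 8080 port and the presence of curl in the image are assumptions here; the automated test drives the same scenario through the Kubernetes and OpenStack APIs):

$ oc new-project ns1
$ oc run demo1 --image=kuryr/demo --restart=Never
$ oc new-project ns2
$ oc run demo2 --image=kuryr/demo --restart=Never
$ # Try to reach the pod in ns2 from the pod in ns1; with namespace
$ # isolation working, this curl should time out
$ POD_IP=$(oc get pod demo2 -n ns2 -o jsonpath='{.status.podIP}')
$ oc exec -n ns1 demo1 -- curl -m 5 http://$POD_IP:8080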
Succeeded in reproducing it again. It happened on OSP 14, puddle 2019-02-22.2.
Genadi,

Can you share what procedure you followed to reproduce this on OSP 14?

Thanks,

Nate
(In reply to Nate Johnston from comment #6)
> Genadi,
>
> Can you share what procedure you followed to reproduce this on OSP 14?
>
> Thanks,
>
> Nate

Just to give a bit more context... The way OCP defines namespace isolation is the following:

- Pods in a namespace can only reach other pods in their own namespace, plus the ones in the default namespace.
- Pods in a namespace can only be reached by other pods in their own namespace, plus the ones in the default namespace.

In Kuryr we translate that into OpenStack security groups: we create a specific SG for each namespace and attach it to the pods in that namespace. This SG allows all the traffic from other ports that include that SG id. To handle the default namespace exception, we also create a couple of extra SGs: one (SG_default) attached to the pods in the default namespace, which allows all the traffic from ports carrying the other SG (SG_namespace), and a second one (SG_namespace) attached to all the pods except the ones in the default namespace, which allows all the traffic from ports carrying SG_default. A rough CLI sketch of this layout is below.
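For illustration only, a minimal sketch of that SG layout using the plain openstack CLI. The group names (ns-foo-sg, sg_default, sg_namespace) are hypothetical; Kuryr creates and attaches the real groups itself through the Neutron API, and the actual rules may differ in protocol/port scope:

$ # Per-namespace SG: allow traffic between ports that carry the same SG
$ openstack security group create ns-foo-sg
$ openstack security group rule create --ingress --remote-group ns-foo-sg ns-foo-sg
$ # Default-namespace exception: sg_default (default-namespace pods) accepts
$ # traffic from sg_namespace ports, and sg_namespace (all other pods)
$ # accepts traffic from sg_default ports
$ openstack security group create sg_default
$ openstack security group create sg_namespace
$ openstack security group rule create --ingress --remote-group sg_namespace sg_default
$ openstack security group rule create --ingress --remote-group sg_default sg_namespace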
I used the automation test to reproduce the problem; the test from kuryr-tempest-plugin did the job. Run the test test_namespace_sg_svc_isolation in kuryr-tempest-plugin/kuryr_tempest_plugin/tests/scenario/test_namespace.py.
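In case it helps, this is roughly how that test can be invoked from a tempest workspace that already has kuryr-tempest-plugin installed (exact setup steps depend on the deployment):

$ tempest run --regex kuryr_tempest_plugin.tests.scenario.test_namespace.TestNamespaceScenario.test_namespace_sg_svc_isolation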
Genadi,

Can you let me have access to a reproducer environment? I've tried a couple of ways to reproduce this and I keep having issues, and now I'm about to lose my DSAL host.

Thanks,

Nate
Nate, I am trying to create an environment so you can see the problem. Will update when ready.
Provided an environment to Nate as we agreed, so he can take all the logs and see the problem live
Agreed with Luis to let the Networking team decide on the blocker/non-blocker part.
Tried to reproduce for the last week without any success, so closing it for now
I'm running kuryr_tempest_plugin.tests.scenario.test_namespace.TestNamespaceScenario.test_namespace_sg_svc_isolation after running a test that restarts Neutron. I'm able to reach the pod/service in another namespace from a pod in the first namespace. I gave Nate a setup to check.
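To help narrow down whether the Neutron restart is the trigger, a couple of things that could be checked right after the restart (just a suggestion, not something already done on this setup; <pod-port-id> is a placeholder):

$ # All Neutron agents back up and alive after the restart
$ openstack network agent list
$ # The pod port still carries the expected namespace SG
$ openstack port show <pod-port-id> -c security_group_ids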
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0770