Bug 1985035 - Unable to connect to VMs on different computes
Summary: Unable to connect to VMs on different computes
Keywords:
Status: CLOSED DUPLICATE of bug 1952464
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: OSP Team
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-07-22 17:06 UTC by Maysa Macedo
Modified: 2022-08-10 16:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-23 11:43:43 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-6386 0 None None None 2022-08-10 16:41:54 UTC

Description Maysa Macedo 2021-07-22 17:06:45 UTC
Description of problem:

Connectivity to a load balancer VIP (Amphora VM) only works when the client VM and the Amphora VM are on the same compute node.

As an example, three pods running with hostNetworking attempt to connect to the LB VIP:

(shiftstack) [stack@undercloud-0 ~]$ oc get po demo -n test -o wide
NAME   READY   STATUS    RESTARTS   AGE     IP            NODE                    NOMINATED NODE   READINESS GATES
demo   1/1     Running   0          4h50m   10.196.1.21   ostest-mjdr2-master-0   <none>           <none>              ------> Runs on master-0 which is on compute-0
(shiftstack) [stack@undercloud-0 ~]$ oc rsh -n test demo
sh-4.2$ curl 172.30.254.36
curl: (7) Failed connect to 172.30.254.36:80; No route to host

(shiftstack) [stack@undercloud-0 ~]$ oc get po demo-1 -n test -o wide
NAME     READY   STATUS    RESTARTS   AGE     IP             NODE                    NOMINATED NODE   READINESS GATES
demo-1   1/1     Running   0          4h50m   10.196.1.113   ostest-mjdr2-master-1   <none>           <none>          ------> Runs on master-1 which is on compute-1
(shiftstack) [stack@undercloud-0 ~]$ oc rsh -n test demo-1
sh-4.2$ curl 172.30.254.36
test: HELLO! I AM ALIVE!!!

(shiftstack) [stack@undercloud-0 ~]$ oc get po demo-2 -n test -o wide
NAME     READY   STATUS    RESTARTS   AGE     IP             NODE                    NOMINATED NODE   READINESS GATES
demo-2   1/1     Running   0          4h50m   10.196.2.133   ostest-mjdr2-master-2   <none>           <none>          ------> Runs on master-2 which is on compute-2
(shiftstack) [stack@undercloud-0 ~]$ oc rsh -n test demo-2
sh-4.2$ curl 172.30.254.36
curl: (7) Failed connect to 172.30.254.36:80; No route to host

Connectivity only worked from master-1, which runs on the same compute node as the load balancer VM.

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list |grep 172.30.254.36
| f97e8570-e883-4dab-bb19-e8f991d54f6a | c5d8d2fc-ba7d-48f9-b4e0-ea476817689b | ALLOCATED | STANDALONE | 172.24.3.217  | 172.30.254.36  |

(overcloud) [stack@undercloud-0 ~]$ openstack server list --all --long |grep 172.24.3.217
| e69ba413-f7ce-440f-96e0-888bb9db103d | amphora-f97e8570-e883-4dab-bb19-e8f991d54f6a | ACTIVE | None       | Running     | lb-mgmt-net=172.24.3.217; ostest-mjdr2-kuryr-service-network=172.31.2.45  | octavia-amphora-16.1-20210430.3.x86_64 | 1e7aafcb-6c27-4b91-8d7c-cb95e84e6d74 |             |           | nova              | compute-1.redhat.local |                                                                  |
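
The placement described above can be cross-checked with a short sketch. It only encodes values transcribed from the command output in this report (pod to master, master to compute host, Amphora host, and curl results). The dictionary and function names are hypothetical and chosen for illustration; this is not part of any OpenStack or OpenShift tooling.

```python
# Placement and curl results transcribed from the command output in this report.
POD_TO_NODE = {
    "demo": "ostest-mjdr2-master-0",
    "demo-1": "ostest-mjdr2-master-1",
    "demo-2": "ostest-mjdr2-master-2",
}
NODE_TO_COMPUTE = {
    "ostest-mjdr2-master-0": "compute-0.redhat.local",
    "ostest-mjdr2-master-1": "compute-1.redhat.local",
    "ostest-mjdr2-master-2": "compute-2.redhat.local",
}
AMPHORA_COMPUTE = "compute-1.redhat.local"  # host of the amphora VM
CURL_SUCCEEDED = {"demo": False, "demo-1": True, "demo-2": False}


def shares_compute_with_amphora(pod: str) -> bool:
    """Return True if the pod's node runs on the same compute host as the amphora."""
    return NODE_TO_COMPUTE[POD_TO_NODE[pod]] == AMPHORA_COMPUTE


def placement_matches_results() -> bool:
    """Return True if curl succeeded exactly for the pods co-located with the amphora."""
    return all(
        shares_compute_with_amphora(pod) == ok for pod, ok in CURL_SUCCEEDED.items()
    )


if __name__ == "__main__":
    for pod in POD_TO_NODE:
        print(pod, shares_compute_with_amphora(pod), CURL_SUCCEEDED[pod])
    print("placement explains results:", placement_matches_results())
```

For the data in this report, the check holds for all three pods, which is consistent with the failure being tied to cross-compute traffic rather than to a specific pod.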


Traffic is visible in the qrouter namespace on compute-1 when connecting from master-1, but no traffic appears there when connecting from the other masters.

[root@compute-1 ~]# ip netns exec qrouter-3325e43e-b16d-4701-aed9-3f982a61d7f4 ip r |grep 172
172.30.0.0/15 dev qr-2bd04631-e1 proto kernel scope link src 172.31.255.254

[root@compute-1 ~]# ip netns exec qrouter-3325e43e-b16d-4701-aed9-3f982a61d7f4 tcpdump -i qr-2bd04631-e1 -vvvv |grep 172.31.2.45
tcpdump: listening on qr-2bd04631-e1, link-type EN10MB (Ethernet), capture size 262144 bytes
172.31.2.45.49811 > 10.128.52.44.webcache: Flags [S], cksum 0xc667 (correct), seq 4288492282, win 28200, options [mss 1410,sackOK,TS val 315162408 ecr 0,nop,wscale 4], length 0         
172.31.2.45.49811 > 10.128.52.44.webcache: Flags [.], cksum 0xb475 (correct), seq 4288492283, ack 471016337, win 1763, options [nop,nop,TS val 315162410 ecr 1037445698], length 0       
172.31.2.45.49811 > 10.128.52.44.webcache: Flags [P.], cksum 0x4bdd (correct), seq 0:77, ack 1, win 1763, options [nop,nop,TS val 315162410 ecr 1037445698], length 77: HTTP, length: 77 
172.31.2.45.49811 > 10.128.52.44.webcache: Flags [.], cksum 0xed1e (incorrect -> 0xb351), seq 77, ack 145, win 1830, options [nop,nop,TS val 315162412 ecr 1037445700], length 0         
172.31.2.45.49811 > 10.128.52.44.webcache: Flags [F.], cksum 0xed1e (incorrect -> 0xb34f), seq 77, ack 145, win 1830, options [nop,nop,TS val 315162413 ecr 1037445700], length 0        
172.31.2.45.49811 > 10.128.52.44.webcache: Flags [.], cksum 0xed1e (incorrect -> 0xb34c), seq 78, ack 146, win 1830, options [nop,nop,TS val 315162413 ecr 1037445702], length 0         
16:33:41.643328 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has compute-1 tell 172.31.2.45, length 28                                                                                   
16:33:41.773601 ARP, Ethernet (len 6), IPv4 (len 4), Reply 172.31.2.45 is-at fa:16:3e:41:6c:0f (oui Unknown), length 28                                                                      
16:33:42.797567 ARP, Ethernet (len 6), IPv4 (len 4), Reply 172.31.2.45 is-at fa:16:3e:41:6c:0f (oui Unknown), length 28                                                                      
16:33:43.820513 ARP, Ethernet (len 6), IPv4 (len 4), Reply 172.31.2.45 is-at fa:16:3e:41:6c:0f (oui Unknown), length 28

10.128.52.44 is the load balancer member.
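
To make the capture above easier to read, here is a minimal sketch that pulls the TCP flags out of tcpdump lines sent from a given source address. The sample lines are copied from the capture in this report but truncated after the checksum field; the regular expression and the function name are illustrative assumptions, not a general tcpdump parser.

```python
import re

# Matches lines of the form "<src> > <dst>: Flags [<flags>], ..."
FLAGS_RE = re.compile(r"^(\S+) > (\S+): Flags \[([^\]]+)\]")

# Lines copied from the capture in this report, truncated after the checksum.
CAPTURE = [
    "172.31.2.45.49811 > 10.128.52.44.webcache: Flags [S], cksum 0xc667 (correct)",
    "172.31.2.45.49811 > 10.128.52.44.webcache: Flags [.], cksum 0xb475 (correct)",
    "172.31.2.45.49811 > 10.128.52.44.webcache: Flags [P.], cksum 0x4bdd (correct)",
    "172.31.2.45.49811 > 10.128.52.44.webcache: Flags [.], cksum 0xed1e (incorrect -> 0xb351)",
    "172.31.2.45.49811 > 10.128.52.44.webcache: Flags [F.], cksum 0xed1e (incorrect -> 0xb34f)",
    "172.31.2.45.49811 > 10.128.52.44.webcache: Flags [.], cksum 0xed1e (incorrect -> 0xb34c)",
    "16:33:41.773601 ARP, Ethernet (len 6), IPv4 (len 4), Reply 172.31.2.45 is-at fa:16:3e:41:6c:0f (oui Unknown), length 28",
]


def tcp_flags_from(lines, src_prefix):
    """Return the TCP flag strings of packets whose source starts with src_prefix."""
    flags = []
    for line in lines:
        match = FLAGS_RE.match(line.strip())
        if match and match.group(1).startswith(src_prefix):
            flags.append(match.group(3))
    return flags


if __name__ == "__main__":
    # Trailing dot keeps the prefix from also matching e.g. 172.31.2.450.
    print(tcp_flags_from(CAPTURE, "172.31.2.45."))
```

The resulting sequence (SYN, ACK, PSH+ACK, ACK, FIN+ACK, ACK) covers a full exchange from 172.31.2.45, the Amphora's address on the kuryr service network, to the member. ARP lines are skipped because they do not match the pattern.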

(shiftstack) [stack@undercloud-0 ~]$ . overcloudrc ; openstack server list --project shiftstack --long                                                                                       
+--------------------------------------+------------------------+--------+------------+-------------+------------------------------------------------+--------------------+--------------------------------------+-------------+-----------+-------------------+------------------------+------------------------------------------------------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks                                       | Image Name         | Image ID                             | Flavor Name | Flavor ID | Availability Zone | Host                   | Properties                                                       |
+--------------------------------------+------------------------+--------+------------+-------------+------------------------------------------------+--------------------+--------------------------------------+-------------+-----------+-------------------+------------------------+------------------------------------------------------------------+
| 14a9cd9d-f1ed-48f6-b4f4-c189fa2b4c84 | ostest-mjdr2-master-2  | ACTIVE | None       | Running     | ostest-mjdr2-openshift=10.196.2.133            | ostest-mjdr2-rhcos | d9ac7703-f885-4be9-9fbc-4052899a3c32 |             |           | nova              | compute-2.redhat.local | Name='ostest-mjdr2-master', openshiftClusterID='ostest-mjdr2'    |
| 31badc86-619c-437f-846c-ccc54bc4c002 | ostest-mjdr2-master-1  | ACTIVE | None       | Running     | ostest-mjdr2-openshift=10.196.1.113            | ostest-mjdr2-rhcos | d9ac7703-f885-4be9-9fbc-4052899a3c32 |             |           | nova              | compute-1.redhat.local | Name='ostest-mjdr2-master', openshiftClusterID='ostest-mjdr2'    |
| 97054bad-8c53-4b91-982f-d6a31c7739dd | ostest-mjdr2-bootstrap | ACTIVE | None       | Running     | ostest-mjdr2-openshift=10.196.0.96, 10.0.0.244 | ostest-mjdr2-rhcos | d9ac7703-f885-4be9-9fbc-4052899a3c32 |             |           | nova              | compute-2.redhat.local | Name='ostest-mjdr2-bootstrap', openshiftClusterID='ostest-mjdr2' |
| bcaa6808-7d14-4623-8a40-a036e754e607 | ostest-mjdr2-master-0  | ACTIVE | None       | Running     | ostest-mjdr2-openshift=10.196.1.21             | ostest-mjdr2-rhcos | d9ac7703-f885-4be9-9fbc-4052899a3c32 |             |           | nova              | compute-0.redhat.local | Name='ostest-mjdr2-master', openshiftClusterID='ostest-mjdr2'    |
+--------------------------------------+------------------------+--------+------------+-------------+------------------------------------------------+--------------------+--------------------------------------+-------------+-----------+-------------------+------------------------+------------------------------------------------------------------+

[root@compute-0 ~]# cat /var/lib/config-data/puppet-generated/neutron/etc/neutron/l3_agent.ini|grep dvr
agent_mode=dvr

[root@compute-1 ~]# cat /var/lib/config-data/puppet-generated/neutron/etc/neutron/l3_agent.ini |grep dvr
agent_mode=dvr
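
The same check can be scripted rather than done with grep. The sketch below reads `agent_mode` from the text of an `l3_agent.ini` file using Python's standard `configparser`. It assumes the option sits under `[DEFAULT]`, which the grep output above does not show; the function name and the sample text are hypothetical.

```python
import configparser
from typing import Optional


def read_agent_mode(ini_text: str) -> Optional[str]:
    """Return the agent_mode value from the [DEFAULT] section, or None if unset."""
    # interpolation=None avoids errors on literal '%' characters in values;
    # strict=False tolerates duplicate options in generated files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.defaults().get("agent_mode")


if __name__ == "__main__":
    # Sample text: the section header is an assumption, the option line is
    # the one shown in this report.
    sample = "[DEFAULT]\nagent_mode=dvr\n"
    print(read_agent_mode(sample))
```

In practice the text would be read from the `l3_agent.ini` path shown above on each compute node. The output above covers compute-0 and compute-1 only.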

# virsh list
 Id   Name           State
------------------------------
 7    undercloud-0   running
 19   compute-1      running
 20   compute-2      running
 21   compute-0      running
 23   controller-0   running
 24   controller-1   running
 25   controller-2   running

Version-Release number of selected component (if applicable):

$ cat /etc/rhosp-release 
Red Hat OpenStack Platform release 16.1.6 GA (Train) **with ovs**

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

