Bug 1940667 - [OSP 16.1 ml2/ovs] - tftp communication broken through neutron router
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Assignee: Slawek Kaplonski
QA Contact: Eran Kuris
Reported: 2021-03-18 19:58 UTC by Matt Flusche
Modified: 2022-10-03 16:36 UTC (History)

Last Closed: 2021-07-19 19:19:06 UTC

Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-404 0 None None None 2022-10-03 16:36:17 UTC

Description Matt Flusche 2021-03-18 19:58:47 UTC
Description of problem:

tftp communication is broken for VM instances communicating through neutron routers.  It seems that conntrack/iptables does not function properly for tftp traffic (perhaps a RHEL 8 issue).

Example communication path:

  VM instance (tftp client) -> neutron router -> external tftp server

This seems to be a regression in OSP 16.1, as the same test works fine in an OSP 13 environment.

Version-Release number of selected component (if applicable):
OSP 16.1 current (16.1.3)

How reproducible:
100% 


Steps to Reproduce:
1. generic 16.1 ml2/ovs deployment, non-dvr (probably broken with dvr also)
2. create tenant net,subnet,router, instance, floating ip
3. disable network security on the instance port to remove as troubleshooting variable.
4. from vm instance test tftp through neutron router.

Actual results:
see additional info

Expected results:
tftp should function as it does on OSP 13

Additional info:


Working OSP 13 example:
=========
$ openstack server list
+--------------------------------------+-------+--------+----------------------------------+--------+---------+
| ID                                   | Name  | Status | Networks                         | Image  | Flavor  |
+--------------------------------------+-------+--------+----------------------------------+--------+---------+
| 619ae12a-3ff5-476b-93ae-eb896a052f5c | test2 | ACTIVE | net1=10.10.10.123, 192.168.2.153 | cirros | m1.tiny |
+--------------------------------------+-------+--------+----------------------------------+--------+---------+

(overcloud) [stack@undercloud13 ~]$ ssh cirros.2.153
$ tftp -l test1 -r test1 -g 192.168.0.10
$ echo $?
0
$ ls -l test1
-rw-------    1 cirros   cirros           8 Mar 18 19:46 test1
$ 

From the conntrack output we see the proper data path communication.

[root@overcloud-controller-0 ~]# ip netns exec qrouter-bf101d68-6bdd-45ee-9041-379669e1445b conntrack -L |grep udp
conntrack v1.4.4 (conntrack-tools): 3 flow entries have been shown.
udp      17 18 src=10.10.10.123 dst=192.168.0.10 sport=47306 dport=69 [UNREPLIED] src=192.168.0.10 dst=192.168.2.153 sport=69 dport=47306 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 168 src=192.168.0.10 dst=192.168.2.153 sport=50710 dport=47306 src=10.10.10.123 dst=192.168.0.10 sport=47306 dport=50710 [ASSURED] mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1

=========

OSP 16.1 
=========

We redeploy the same lab with OSP 16.1 to show the issue

(overcloud) [stack@undercloud16 ~]$ openstack server list
+--------------------------------------+-------+---------+----------------------------------+--------+--------+
| ID                                   | Name  | Status  | Networks                         | Image  | Flavor |
+--------------------------------------+-------+---------+----------------------------------+--------+--------+
| 532cbef6-a893-499b-b573-be09c95784e2 | test2 | ACTIVE  | net1=10.10.10.199, 192.168.2.157 | cirros |        |
+--------------------------------------+-------+---------+----------------------------------+--------+--------+

(overcloud) [stack@undercloud16 ~]$ ssh cirros.2.157
Warning: Permanently added '192.168.2.157' (ECDSA) to the list of known hosts.
$ tftp -g -l test1 -r test1 192.168.0.10
tftp: timeout
$ echo $?
1
$ ls -l test1
ls: test1: No such file or directory
$ 

The conntrack output from the router does not show a proper data path.

# ip netns exec qrouter-dc476d0e-62b2-46c6-b141-3b112ebfbf26 conntrack -L|grep udp
conntrack v1.4.4 (conntrack-tools): 11 flow entries have been shown.
udp      17 25 src=192.168.0.10 dst=192.168.2.157 sport=58810 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=58810 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 25 src=192.168.0.10 dst=192.168.2.157 sport=41916 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=41916 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 26 src=192.168.0.10 dst=192.168.2.157 sport=33660 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=33660 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 25 src=192.168.0.10 dst=192.168.2.157 sport=42978 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=42978 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 25 src=192.168.0.10 dst=192.168.2.157 sport=42537 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=42537 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 27 src=192.168.0.10 dst=192.168.2.157 sport=53836 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=53836 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 25 src=192.168.0.10 dst=192.168.2.157 sport=48407 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=48407 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 28 src=10.10.10.199 dst=192.168.0.10 sport=54749 dport=69 [UNREPLIED] src=192.168.0.10 dst=192.168.2.157 sport=69 dport=10272 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp      17 28 src=192.168.0.10 dst=192.168.2.157 sport=39902 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=39902 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1

This causes iptables to not perform the port translation properly, resulting in a tftp data port mismatch, as seen via tcpdump on the instance's tap interface (note the "udp port unreachable" ICMPs):


[root@overcloud-novacompute-0 ~]# tcpdump -ni tapbdb91343-ba icmp or udp                                                                                                                            
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapbdb91343-ba, link-type EN10MB (Ethernet), capture size 262144 bytes
19:44:40.735925 IP 10.10.10.199.56447 > 192.168.0.10.tftp:  22 RRQ "test1" octet tsize 0
19:44:40.740098 IP 192.168.0.10.36926 > 10.10.10.199.27976: UDP, length 10
19:44:40.740679 IP 10.10.10.199 > 192.168.0.10: ICMP 10.10.10.199 udp port 27976 unreachable, length 46                                                                                             
19:44:40.836538 IP 10.10.10.199.56447 > 192.168.0.10.tftp:  22 RRQ "test1" octet tsize 0
19:44:40.839748 IP 192.168.0.10.44240 > 10.10.10.199.27976: UDP, length 10
19:44:40.840129 IP 10.10.10.199 > 192.168.0.10: ICMP 10.10.10.199 udp port 27976 unreachable, length 46                                                                                             
19:44:40.987249 IP 10.10.10.199.56447 > 192.168.0.10.tftp:  22 RRQ "test1" octet tsize 0
19:44:40.990993 IP 192.168.0.10.49570 > 10.10.10.199.27976: UDP, length 10
19:44:40.991294 IP 10.10.10.199 > 192.168.0.10: ICMP 10.10.10.199 udp port 27976 unreachable, length 46
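
One way to compare the two conntrack listings above is to count the UDP entries by state. The following is a minimal sketch and not part of the original report: the function name summarize_udp_conntrack is made up for illustration, and it assumes only the `conntrack -L` line format shown above.

```shell
# Summarise UDP conntrack entries read from stdin by state.
# An entry is counted as "unreplied" or "assured" when it carries that flag,
# and as "replied" when it carries neither.
summarize_udp_conntrack() {
  awk '$1 == "udp" {
         state = "replied"
         for (i = 1; i <= NF; i++) {
           if ($i == "[UNREPLIED]") state = "unreplied"
           if ($i == "[ASSURED]")   state = "assured"
         }
         count[state]++
       }
       END {
         printf "assured=%d unreplied=%d replied=%d\n",
                count["assured"], count["unreplied"], count["replied"]
       }'
}

# Usage (as root on the node hosting the router):
#   ip netns exec qrouter-<router-id> conntrack -L 2>/dev/null | summarize_udp_conntrack
```

For the UDP lines shown above, the OSP 13 listing gives assured=1 unreplied=1 replied=0, while the OSP 16.1 listing gives assured=0 unreplied=9 replied=0: none of the server's data packets is matched to an established flow.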

Comment 1 Slawek Kaplonski 2021-03-24 15:08:55 UTC
I found out that this requires a specific conntrack helper module to be loaded and configured in the router.
Neutron already has an API for that. See RFE https://bugs.launchpad.net/neutron/+bug/1823633 and API ref https://docs.openstack.org/api-ref/network/v2/?expanded=create-conntrack-helper-detail#routers-conntrack-helper-ct-target-rules

So it has to be enabled in the neutron server (service_plugin conntrack_helper) and in the L3 agent (extension conntrack_helper). With those changes the new API should be available, and you can enable such a helper with an API request like:

curl -X POST http://10.0.0.115:9696/v2.0/routers/26284f88-daf0-4dbb-9564-2263edd4445a/conntrack_helpers -H "Content-Type: application/json" -H "User-Agent: openstacksdk/0.36.5 keystoneauth1/3.17.3 python-requests/2.20.0 CPython/3.6.8" -H "X-Auth-Token: $token" -d '{"conntrack_helper": {"protocol": "udp", "port": 69, "helper": "tftp"}}'


Unfortunately we don't have OSC support for that API yet. I will propose it upstream and will track it in this BZ.

Ahh, one more thing: the "nf_nat_tftp" module has to be loaded on the nodes where the router is running for that helper to work. Neutron will not do that, so you need to load it yourself.
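
A minimal sketch of checking for the module and loading it by hand follows. The function names are made up for illustration, and persisting the module through /etc/modules-load.d is an assumption about how the node is managed (a deployment tool may already handle this); it is not something Neutron does.

```shell
# Returns 0 if nf_nat_tftp appears as a loaded module in lsmod-style
# output read from stdin, 1 otherwise.
tftp_nat_helper_loaded() {
  awk '$1 == "nf_nat_tftp" { found = 1 } END { exit !found }'
}

# Load the module now and (assumption) persist it across reboots via
# systemd's modules-load.d. Must run as root on each node hosting a router.
load_tftp_nat_helper() {
  modprobe nf_nat_tftp || return 1
  echo nf_nat_tftp > /etc/modules-load.d/nf_nat_tftp.conf
}

# Usage (as root):
#   lsmod | tftp_nat_helper_loaded || load_tftp_nat_helper
```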

Comment 3 Matt Flusche 2021-03-25 16:58:37 UTC
Hi Slawek,

Thanks for taking a look at this issue.  If I understand you correctly, the following manual steps should enable conntrack to handle tftp traffic?

- Enable the service plugin in neutron.conf (neutron server & l3 agent)

# crudini --get /var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf DEFAULT service_plugins
router,qos,segments,trunk,port_forwarding,conntrack_helper

- Restart neutron api and l3 agents.

podman restart neutron_l3_agent
podman restart neutron_api

- With debug enabled we see the service plugins loaded?

# grep service_plugins /var/log/containers/neutron/server.log
2021-03-25 15:13:44.089 6 DEBUG oslo_service.service [-] service_plugins                = ['router', 'qos', 'segments', 'trunk', 'port_forwarding', 'conntrack_helper'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2581

# grep service_plugins /var/log/containers/neutron/l3-agent.log
2021-03-25 15:52:53.022 461782 DEBUG neutron.wsgi [-] service_plugins                = ['router', 'qos', 'segments', 'trunk', 'port_forwarding', 'conntrack_helper'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2581
2021-03-25 15:52:54.639 461782 DEBUG oslo_service.service [req-32a211f7-a108-4aff-b5c2-12e39272dbe0 - - - - -] service_plugins                = ['router', 'qos', 'segments', 'trunk', 'port_forwarding', 'conntrack_helper'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2581


- enable the nf_nat_tftp kernel module on the l3 agents.

# modprobe nf_nat_tftp
# lsmod |grep tftp
nf_nat_tftp            16384  0
nf_conntrack_tftp      16384  1 nf_nat_tftp
nf_nat                 36864  6 nf_nat_ipv6,nf_nat_ipv4,xt_nat,nf_nat_tftp,openvswitch,xt_REDIRECT
nf_conntrack          155648  16 xt_conntrack,nf_conntrack_ipv6,nf_conntrack_ipv4,nf_nat,nf_conntrack_tftp,xt_state,nfnetlink_cttimeout,nf_nat_ipv6,nf_nat_ipv4,xt_nat,nf_nat_tftp,openvswitch,nf_conntrack_netlink,xt_connmark,nf_conncount,xt_REDIRECT


- enable tftp helper for router


$ curl -X POST http://192.168.2.226:9696/v2.0/routers/dc476d0e-62b2-46c6-b141-3b112ebfbf26/conntrack_helpers -H "Content-Type: application/json" -H "User-Agent: openstacksdk/0.36.5 keystoneauth1/3.17.3 python-requests/2.20.0 CPython/3.6.8" -H "X-Auth-Token: $token" -d '{"conntrack_helper": {"protocol": "udp", "port": 69, "helper": "tftp"}}'
{"conntrack_helper": {"id": "032ea860-959d-46ea-9163-3f42b9935b36", "protocol": "udp", "port": 69, "helper": "tftp"}}


- in the l3 agent log we see the helper config for the router:

/var/log/containers/neutron/l3-agent.log
2021-03-25 15:53:19.199 461782 DEBUG neutron.agent.l3.agent [req-9aca361d-ba68-4e6a-8d62-29ae07deb99f - - - - -] Processing :[{'id': 'dc476d0e-62b2-46c6-b141-3b112ebfbf26', 'name': 'router1', 'tenant_id': '03beefe4d1cc439fa64fae5d99c1f258', 'admin_state_up': True, 'status': 'ACTIVE', 'external_gateway_info': {'network_id': '6c2dfa55-0253-4559-805d-735fcdae24f3', 'external_fixed_ips': [{'subnet_id': '229e54c7-58fa-439d-8a9f-3c41bdbf29be', 'ip_address': '192.168.2.160'}], 'enable_snat': True}, 'gw_port_id': '6bc64863-8abf-4615-a4e8-40a703d5e429', 'description': '', 'availability_zones': ['nova'], 'distributed': False, 'ha': True, 'ha_vr_id': 222, 'availability_zone_hints': [], 'routes': [], 'flavor_id': None, 'conntrack_helpers': [{'protocol': 'udp', 'port': 69, 'helper': 'tftp'}], 'tags': [], 'created_at': '2021-03-08T18:18:18Z', 'updated_at': '2021-03-25T15:22:56Z', 'revision_number': 10, 'project_id': '03beefe4d1cc439fa64fae5d99c1f258', 'gw_port': {'id': '6bc64863-8abf-4615-a4e8-40a703d5e429', 'name': '', 'network_id': '6c2dfa55-0253-4559-805d-735fcdae24f3', 'tenant_id': '', 'mac_address': 'fa:16:3e:00:ae:e8', 'admin_state_up': True, 'status': 'ACTIVE', 'device_id': 'dc476d0e-62b2-46c6-b141-3b112ebfbf26', 'device_owner': 'network:router_gateway', 'fixed_ips': [{'subnet_id': '229e54c7-58fa-439d-8a9f-3c41bdbf29be', 'ip_add 

However, tftp still doesn't seem to work through the router.

Is there more to the config?  Any advice on troubleshooting this would be great.

Comment 5 Matt Flusche 2021-03-29 21:35:05 UTC
Thanks Slawek,

After setting the following:

# crudini --get /var/lib/config-data/puppet-generated/neutron/etc/neutron/l3_agent.ini  agent extensions
port_forwarding,conntrack_helper

I get the PREROUTING entry for tftp.

# ip netns exec qrouter-42057c7f-c5dc-449d-b86d-dbf02b2058f7 iptables -nvL -t raw

[...snip...]

Chain neutron-l3-agent-cth-4809eb7 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 CT         udp  --  *      *       0.0.0.0/0            0.0.0.0/0            udp dpt:69 CT helper tftp
        

This works for tftp clients with security groups disabled, which is odd.

Is there a way to allow this helper for all new routers by default?

Comment 6 Slawek Kaplonski 2021-03-30 10:33:55 UTC
No, there is no way to set it by default for all (new) routers. You need to add it for each router.
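
Since the helper has to be added per router, the request can be looped over every router. This is only a sketch under assumptions: the function names are hypothetical, the Neutron endpoint URL and token are passed in by the caller, and it does not check whether a router already has the helper, so a repeated run may return an error for routers that do.

```shell
# JSON body for the tftp helper, as used in the curl example in this bug.
tftp_helper_payload() {
  printf '{"conntrack_helper": {"protocol": "udp", "port": 69, "helper": "tftp"}}'
}

# POST the tftp helper to every router visible to the current credentials.
#   $1 = Neutron endpoint URL (without /v2.0), $2 = auth token
add_tftp_helper_to_all_routers() {
  local neutron_url="$1" token="$2" router
  for router in $(openstack router list -f value -c ID); do
    curl -sS -X POST "$neutron_url/v2.0/routers/$router/conntrack_helpers" \
      -H "Content-Type: application/json" \
      -H "X-Auth-Token: $token" \
      -d "$(tftp_helper_payload)"
    echo
  done
}

# Usage:
#   add_tftp_helper_to_all_routers http://10.0.0.115:9696 "$token"
```

Routers created later still need the same request, so this would have to be re-run (or hooked into whatever creates routers).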

Comment 7 Matt Flusche 2021-04-25 21:15:38 UTC
This config seems to work for an OSP 16.1 deployment.

Overcloud deployment parameters:

parameter_defaults:
  NeutronServicePlugins: 'router,qos,segments,trunk,port_forwarding,conntrack_helper'
  ControllerExtraConfig:
    neutron::agents::l3::extensions: conntrack_helper,port_forwarding
  ControllerParameters:
    ExtraKernelModules:
      nf_nat_tftp: {}


Post deployment, enable the tftp conntrack helper for a router. Example:

$ openstack router list
+--------------------------------------+---------+--------+-------+----------------------------------+-------+
| ID                                   | Name    | Status | State | Project                          | HA    |
+--------------------------------------+---------+--------+-------+----------------------------------+-------+
| ad908033-eece-4c88-9b59-f9588657fc1c | router1 | ACTIVE | UP    | e1e667d5c61e4f7eb75731696b5b5648 | False |
+--------------------------------------+---------+--------+-------+----------------------------------+-------+


$ ROUTER=ad908033-eece-4c88-9b59-f9588657fc1c
$ token=$(openstack token issue -c id -f value)
$ neutronUrl=$(openstack catalog show neutron  -c endpoints -f json |jq -r '.endpoints[]  | select(.interface == "public") | .url')


$ curl -X POST $neutronUrl/v2.0/routers/$ROUTER/conntrack_helpers -H "Content-Type: application/json" -H "User-Agent: openstacksdk/0.36.5 keystoneauth1/3.17.3 python-requests/2.20.0 CPython/3.6.8" -H "X-Auth-Token: $token" -d '{"conntrack_helper": {"protocol": "udp", "port": 69, "helper": "tftp"}}'

{"conntrack_helper": {"id": "650d140f-a058-460f-b6d4-2fb0af6b7352", "protocol": "udp", "port": 69, "helper": "tftp"}}
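
To confirm afterwards that the helper is set on a router, the same endpoint can be queried with GET. The sketch below is not from the original comment: the function names are hypothetical, and the check is a plain text match that assumes the list response formats each helper the same way as the create response shown above.

```shell
# GET the conntrack helpers configured on one router; prints the raw JSON.
#   $1 = Neutron endpoint URL (without /v2.0), $2 = router ID, $3 = auth token
list_conntrack_helpers() {
  local neutron_url="$1" router="$2" token="$3"
  curl -sS "$neutron_url/v2.0/routers/$router/conntrack_helpers" \
    -H "X-Auth-Token: $token"
}

# Returns 0 if the JSON on stdin mentions a tftp helper.
# Simple text match, not a real JSON parse.
has_tftp_helper() {
  grep -q '"helper": *"tftp"'
}

# Usage:
#   list_conntrack_helpers "$neutronUrl" "$ROUTER" "$token" | has_tftp_helper \
#     && echo "tftp helper present"
```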

