Description of problem:
tftp communication is broken for VM instances that communicate through neutron routers. It seems that conntrack/iptables does not function properly for tftp traffic (perhaps a RHEL 8 issue).

Example communication path:
VM instance (tftp client) -> neutron router -> external tftp server

This seems to be a regression in OSP 16.1, as this works fine in an OSP 13 environment.

Version-Release number of selected component (if applicable):
OSP 16.1 current (16.1.3)

How reproducible:
100%

Steps to Reproduce:
1. Generic 16.1 ml2/ovs deployment, non-DVR (probably broken with DVR also).
2. Create a tenant net, subnet, router, instance, and floating IP.
3. Disable network security on the instance port to remove it as a troubleshooting variable.
4. From the VM instance, test tftp through the neutron router.

Actual results:
See additional info.

Expected results:
tftp should function as it does on OSP 13.

Additional info:

Working OSP 13 example:
=========
$ openstack server list
+--------------------------------------+-------+--------+----------------------------------+--------+---------+
| ID                                   | Name  | Status | Networks                         | Image  | Flavor  |
+--------------------------------------+-------+--------+----------------------------------+--------+---------+
| 619ae12a-3ff5-476b-93ae-eb896a052f5c | test2 | ACTIVE | net1=10.10.10.123, 192.168.2.153 | cirros | m1.tiny |
+--------------------------------------+-------+--------+----------------------------------+--------+---------+

(overcloud) [stack@undercloud13 ~]$ ssh cirros.2.153
$ tftp -l test1 -r test1 -g 192.168.0.10
$ echo $?
0
$ ls -l test1
-rw------- 1 cirros cirros 8 Mar 18 19:46 test1
$

From the conntrack output we see the proper data path communication.

[root@overcloud-controller-0 ~]# ip netns exec qrouter-bf101d68-6bdd-45ee-9041-379669e1445b conntrack -L |grep udp
conntrack v1.4.4 (conntrack-tools): 3 flow entries have been shown.
udp 17 18 src=10.10.10.123 dst=192.168.0.10 sport=47306 dport=69 [UNREPLIED] src=192.168.0.10 dst=192.168.2.153 sport=69 dport=47306 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 168 src=192.168.0.10 dst=192.168.2.153 sport=50710 dport=47306 src=10.10.10.123 dst=192.168.0.10 sport=47306 dport=50710 [ASSURED] mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
=========

OSP 16.1
=========
We redeploy the same lab with OSP 16.1 to show the issue.

(overcloud) [stack@undercloud16 ~]$ openstack server list
+--------------------------------------+-------+---------+----------------------------------+--------+--------+
| ID                                   | Name  | Status  | Networks                         | Image  | Flavor |
+--------------------------------------+-------+---------+----------------------------------+--------+--------+
| 532cbef6-a893-499b-b573-be09c95784e2 | test2 | ACTIVE  | net1=10.10.10.199, 192.168.2.157 | cirros |        |
+--------------------------------------+-------+---------+----------------------------------+--------+--------+

(overcloud) [stack@undercloud16 ~]$ ssh cirros.2.157
Warning: Permanently added '192.168.2.157' (ECDSA) to the list of known hosts.
$ tftp -g -l test1 -r test1 192.168.0.10
tftp: timeout
$ echo $?
1
$ ls -l test1
ls: test1: No such file or directory
$

The conntrack output from the router does not show a proper data path.

# ip netns exec qrouter-dc476d0e-62b2-46c6-b141-3b112ebfbf26 conntrack -L|grep udp
conntrack v1.4.4 (conntrack-tools): 11 flow entries have been shown.
udp 17 25 src=192.168.0.10 dst=192.168.2.157 sport=58810 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=58810 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 25 src=192.168.0.10 dst=192.168.2.157 sport=41916 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=41916 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 26 src=192.168.0.10 dst=192.168.2.157 sport=33660 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=33660 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 25 src=192.168.0.10 dst=192.168.2.157 sport=42978 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=42978 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 25 src=192.168.0.10 dst=192.168.2.157 sport=42537 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=42537 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 27 src=192.168.0.10 dst=192.168.2.157 sport=53836 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=53836 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 25 src=192.168.0.10 dst=192.168.2.157 sport=48407 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=48407 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 28 src=10.10.10.199 dst=192.168.0.10 sport=54749 dport=69 [UNREPLIED] src=192.168.0.10 dst=192.168.2.157 sport=69 dport=10272 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 28 src=192.168.0.10 dst=192.168.2.157 sport=39902 dport=10272 [UNREPLIED] src=10.10.10.199 dst=192.168.0.10 sport=10272 dport=39902 mark=67108864 secctx=system_u:object_r:unlabeled_t:s0 use=1

This causes iptables to not perform the port translation properly, resulting in a tftp data port mismatch as seen via tcpdump on the instance's tap interface (note the "udp port unreachable" ICMPs):
[root@overcloud-novacompute-0 ~]# tcpdump -ni tapbdb91343-ba icmp or udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapbdb91343-ba, link-type EN10MB (Ethernet), capture size 262144 bytes
19:44:40.735925 IP 10.10.10.199.56447 > 192.168.0.10.tftp: 22 RRQ "test1" octet tsize 0
19:44:40.740098 IP 192.168.0.10.36926 > 10.10.10.199.27976: UDP, length 10
19:44:40.740679 IP 10.10.10.199 > 192.168.0.10: ICMP 10.10.10.199 udp port 27976 unreachable, length 46
19:44:40.836538 IP 10.10.10.199.56447 > 192.168.0.10.tftp: 22 RRQ "test1" octet tsize 0
19:44:40.839748 IP 192.168.0.10.44240 > 10.10.10.199.27976: UDP, length 10
19:44:40.840129 IP 10.10.10.199 > 192.168.0.10: ICMP 10.10.10.199 udp port 27976 unreachable, length 46
19:44:40.987249 IP 10.10.10.199.56447 > 192.168.0.10.tftp: 22 RRQ "test1" octet tsize 0
19:44:40.990993 IP 192.168.0.10.49570 > 10.10.10.199.27976: UDP, length 10
19:44:40.991294 IP 10.10.10.199 > 192.168.0.10: ICMP 10.10.10.199 udp port 27976 unreachable, length 46
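To compare the two conntrack listings above at a glance, the flow states can be tallied. The following is only a minimal sketch: the function name summarize_udp_flows and the ROUTER_ID placeholder are made up for illustration, and it simply counts the [ASSURED] and [UNREPLIED] flags that conntrack prints.

```shell
# summarize_udp_flows: read `conntrack -L` output on stdin and count the UDP
# entries flagged [ASSURED] versus [UNREPLIED].
# (Hypothetical helper name, not part of conntrack-tools.)
summarize_udp_flows() {
    awk '$1 == "udp" {
             if (index($0, "[ASSURED]"))   assured++
             if (index($0, "[UNREPLIED]")) unreplied++
         }
         END { printf "assured=%d unreplied=%d\n", assured, unreplied }'
}

# Hypothetical usage on the node hosting the router (ROUTER_ID is a placeholder):
#   ip netns exec "qrouter-$ROUTER_ID" conntrack -L 2>/dev/null | summarize_udp_flows
```

Fed the OSP 13 entries above it prints assured=1 unreplied=1 (the data flow is established), while the nine OSP 16.1 entries give assured=0 unreplied=9.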
I found out that this requires a specific conntrack helper module to be loaded and configured in the router, and Neutron already has an API for that. See the RFE https://bugs.launchpad.net/neutron/+bug/1823633 and the API ref https://docs.openstack.org/api-ref/network/v2/?expanded=create-conntrack-helper-detail#routers-conntrack-helper-ct-target-rules

So it has to be enabled in the neutron server (service_plugin conntrack_helper) and in the L3 agent (extension conntrack_helper). With those changes the new API should be available, and you can enable such a helper with an API request like:

curl -X POST http://10.0.0.115:9696/v2.0/routers/26284f88-daf0-4dbb-9564-2263edd4445a/conntrack_helpers -H "Content-Type: application/json" -H "User-Agent: openstacksdk/0.36.5 keystoneauth1/3.17.3 python-requests/2.20.0 CPython/3.6.8" -H "X-Auth-Token: $token" -d '{"conntrack_helper": {"protocol": "udp", "port": 69, "helper": "tftp"}}'

Unfortunately we don't have OSC support for that API yet. I will propose it upstream and will track it in that BZ.

One more thing: the module "nf_nat_tftp" has to be loaded on the nodes where the router is running to make that helper work. Neutron will not do that, so you need to load it on your own.
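Since Neutron does not load the module, a small check-then-load wrapper can be run on each node that hosts routers. This is a minimal sketch only: the function names module_listed and ensure_module are hypothetical, and it assumes the usual /proc/modules layout where the module name is the first field of each line.

```shell
# module_listed NAME [FILE]: succeed if NAME is the first field of some line
# in a /proc/modules-style listing (FILE defaults to /proc/modules).
# (Hypothetical helper name.)
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# ensure_module NAME: load NAME with modprobe unless it is already listed.
# Needs root, like the modprobe call it wraps.
ensure_module() {
    if module_listed "$1"; then
        echo "$1 already loaded"
    else
        modprobe "$1" && echo "$1 loaded"
    fi
}

# Hypothetical usage on a node hosting neutron routers:
#   ensure_module nf_nat_tftp
```

Note that a module loaded this way does not survive a reboot; making it persistent is a separate step that this sketch does not cover.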
Hi Slawek,

Thanks for taking a look at this issue. If I understand you correctly, the following manual steps should enable conntrack to handle tftp traffic?

- Enable the service plugin in neutron.conf (neutron server & l3 agent)

# crudini --get /var/lib/config-data/puppet-generated/neutron/etc/neutron/neutron.conf DEFAULT service_plugins
router,qos,segments,trunk,port_forwarding,conntrack_helper

- Restart neutron api and l3 agents.

podman restart neutron_l3_agent
podman restart neutron_api

- With debug enabled we see the service plugins loaded?

# grep service_plugins /var/log/containers/neutron/server.log
2021-03-25 15:13:44.089 6 DEBUG oslo_service.service [-] service_plugins = ['router', 'qos', 'segments', 'trunk', 'port_forwarding', 'conntrack_helper'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2581

# grep service_plugins /var/log/containers/neutron/l3-agent.log
2021-03-25 15:52:53.022 461782 DEBUG neutron.wsgi [-] service_plugins = ['router', 'qos', 'segments', 'trunk', 'port_forwarding', 'conntrack_helper'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2581
2021-03-25 15:52:54.639 461782 DEBUG oslo_service.service [req-32a211f7-a108-4aff-b5c2-12e39272dbe0 - - - - -] service_plugins = ['router', 'qos', 'segments', 'trunk', 'port_forwarding', 'conntrack_helper'] log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2581

- Enable the nf_nat_tftp kernel module on the l3 agents.

# modprobe nf_nat_tftp
# lsmod |grep tftp
nf_nat_tftp            16384  0
nf_conntrack_tftp      16384  1 nf_nat_tftp
nf_nat                 36864  6 nf_nat_ipv6,nf_nat_ipv4,xt_nat,nf_nat_tftp,openvswitch,xt_REDIRECT
nf_conntrack          155648  16 xt_conntrack,nf_conntrack_ipv6,nf_conntrack_ipv4,nf_nat,nf_conntrack_tftp,xt_state,nfnetlink_cttimeout,nf_nat_ipv6,nf_nat_ipv4,xt_nat,nf_nat_tftp,openvswitch,nf_conntrack_netlink,xt_connmark,nf_conncount,xt_REDIRECT

- Enable the tftp helper for the router

$ curl -X POST http://192.168.2.226:9696/v2.0/routers/dc476d0e-62b2-46c6-b141-3b112ebfbf26/conntrack_helpers -H "Content-Type: application/json" -H "User-Agent: openstacksdk/0.36.5 keystoneauth1/3.17.3 python-requests/2.20.0 CPython/3.6.8" -H "X-Auth-Token: $token" -d '{"conntrack_helper": {"protocol": "udp", "port": 69, "helper": "tftp"}}'
{"conntrack_helper": {"id": "032ea860-959d-46ea-9163-3f42b9935b36", "protocol": "udp", "port": 69, "helper": "tftp"}}

- In the l3 agent log we see the helper config for the router:

/var/log/containers/neutron/l3-agent.log
2021-03-25 15:53:19.199 461782 DEBUG neutron.agent.l3.agent [req-9aca361d-ba68-4e6a-8d62-29ae07deb99f - - - - -] Processing :[{'id': 'dc476d0e-62b2-46c6-b141-3b112ebfbf26', 'name': 'router1', 'tenant_id': '03beefe4d1cc439fa64fae5d99c1f258', 'admin_state_up': True, 'status': 'ACTIVE', 'external_gateway_info': {'network_id': '6c2dfa55-0253-4559-805d-735fcdae24f3', 'external_fixed_ips': [{'subnet_id': '229e54c7-58fa-439d-8a9f-3c41bdbf29be', 'ip_address': '192.168.2.160'}], 'enable_snat': True}, 'gw_port_id': '6bc64863-8abf-4615-a4e8-40a703d5e429', 'description': '', 'availability_zones': ['nova'], 'distributed': False, 'ha': True, 'ha_vr_id': 222, 'availability_zone_hints': [], 'routes': [], 'flavor_id': None, 'conntrack_helpers': [{'protocol': 'udp', 'port': 69, 'helper': 'tftp'}], 'tags': [], 'created_at': '2021-03-08T18:18:18Z', 'updated_at': '2021-03-25T15:22:56Z', 'revision_number': 10, 'project_id': '03beefe4d1cc439fa64fae5d99c1f258', 'gw_port': {'id': '6bc64863-8abf-4615-a4e8-40a703d5e429', 'name': '', 'network_id': '6c2dfa55-0253-4559-805d-735fcdae24f3', 'tenant_id': '', 'mac_address': 'fa:16:3e:00:ae:e8', 'admin_state_up': True, 'status': 'ACTIVE', 'device_id': 'dc476d0e-62b2-46c6-b141-3b112ebfbf26', 'device_owner': 'network:router_gateway', 'fixed_ips': [{'subnet_id': '229e54c7-58fa-439d-8a9f-3c41bdbf29be', 'ip_add

However, tftp still doesn't seem to work through the router. Is there more to the config? Any advice on troubleshooting this would be great.
Thanks Slawek,

After setting the following:

# crudini --get /var/lib/config-data/puppet-generated/neutron/etc/neutron/l3_agent.ini agent extensions
port_forwarding,conntrack_helper

I get the PREROUTING entry for tftp.

# ip netns exec qrouter-42057c7f-c5dc-449d-b86d-dbf02b2058f7 iptables -nvL -t raw
[...snip...]
Chain neutron-l3-agent-cth-4809eb7 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 CT         udp  --  *      *       0.0.0.0/0            0.0.0.0/0            udp dpt:69 CT helper tftp

This works for tftp clients with security groups disabled, which is odd. Is there a way to allow this helper for all new routers by default?
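The check for the raw-table CT rule can be scripted so it is easy to repeat per router. A minimal sketch, assuming the rule is rendered the way the iptables output in this thread shows it ("udp dpt:69 CT helper tftp"); the function name has_ct_helper_rule and the ROUTER_ID placeholder are hypothetical.

```shell
# has_ct_helper_rule HELPER PORT: read `iptables -nvL -t raw` output on stdin
# and succeed if a rule assigns HELPER to UDP traffic with destination port
# PORT. (Hypothetical helper name; matches the text format shown in this thread.)
has_ct_helper_rule() {
    grep -Eq "udp dpt:$2 CT helper $1( |\$)"
}

# Hypothetical usage on the node hosting the router (ROUTER_ID is a placeholder):
#   ip netns exec "qrouter-$ROUTER_ID" iptables -nvL -t raw | has_ct_helper_rule tftp 69 \
#       && echo "tftp helper rule present" || echo "tftp helper rule missing"
```

The match is purely textual, so it would need adjusting if a different iptables version prints the rule differently.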
No, there is no way to set it by default for all (new) routers. You need to add it for each router.
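Because the helper has to be added per router, the API call can be wrapped in a loop. What follows is a rough sketch rather than a supported tool: the function names are hypothetical, the token and endpoint lookups reuse the openstack and jq commands that appear elsewhere in this thread, and it does not check whether a router already has the helper.

```shell
# helper_payload PROTOCOL PORT HELPER: print the JSON request body used by the
# conntrack_helpers API calls in this thread. (Hypothetical helper name.)
helper_payload() {
    printf '{"conntrack_helper": {"protocol": "%s", "port": %d, "helper": "%s"}}\n' "$1" "$2" "$3"
}

# add_tftp_helper_to_all_routers: POST the tftp helper to every router that
# `openstack router list` returns for the current credentials.
# Requires the openstack CLI, jq and curl.
add_tftp_helper_to_all_routers() {
    token=$(openstack token issue -c id -f value)
    neutronUrl=$(openstack catalog show neutron -c endpoints -f json |
        jq -r '.endpoints[] | select(.interface == "public") | .url')
    for router in $(openstack router list -f value -c ID); do
        echo "Adding tftp helper to router $router"
        curl -sS -X POST "${neutronUrl%/}/v2.0/routers/$router/conntrack_helpers" \
            -H "Content-Type: application/json" \
            -H "X-Auth-Token: $token" \
            -d "$(helper_payload udp 69 tftp)"
        echo
    done
}
```

This only covers routers that exist when it runs, so it would have to be rerun (for example from a periodic job) to pick up routers created later.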
This config seems to work for an OSP 16.1 deployment.

Overcloud deployment parameters:

parameter_defaults:
  NeutronServicePlugins: 'router,qos,segments,trunk,port_forwarding,conntrack_helper'
  ControllerExtraConfig:
    neutron::agents::l3::extensions: conntrack_helper,port_forwarding
  ControllerParameters:
    ExtraKernelModules:
      nf_nat_tftp: {}

Post deployment, enable the tftp conntrack helper for a router. Example:

$ openstack router list
+--------------------------------------+---------+--------+-------+----------------------------------+-------+
| ID                                   | Name    | Status | State | Project                          | HA    |
+--------------------------------------+---------+--------+-------+----------------------------------+-------+
| ad908033-eece-4c88-9b59-f9588657fc1c | router1 | ACTIVE | UP    | e1e667d5c61e4f7eb75731696b5b5648 | False |
+--------------------------------------+---------+--------+-------+----------------------------------+-------+

$ ROUTER=ad908033-eece-4c88-9b59-f9588657fc1c
$ token=$(openstack token issue -c id -f value)
$ neutronUrl=$(openstack catalog show neutron -c endpoints -f json |jq -r '.endpoints[] | select(.interface == "public") | .url')
$ curl -X POST $neutronUrl/v2.0/routers/$ROUTER/conntrack_helpers -H "Content-Type: application/json" -H "User-Agent: openstacksdk/0.36.5 keystoneauth1/3.17.3 python-requests/2.20.0 CPython/3.6.8" -H "X-Auth-Token: $token" -d '{"conntrack_helper": {"protocol": "udp", "port": 69, "helper": "tftp"}}'
{"conntrack_helper": {"id": "650d140f-a058-460f-b6d4-2fb0af6b7352", "protocol": "udp", "port": 69, "helper": "tftp"}}
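To confirm afterwards which helpers a router has, the same URL can be queried with GET. This is a sketch under an assumption: it relies on the list operation described in the API reference linked earlier in this thread, and the function names helpers_url and list_router_helpers are hypothetical.

```shell
# helpers_url BASE_URL ROUTER_ID: print the conntrack_helpers URL for a router,
# dropping one trailing slash from BASE_URL so the path has no "//".
# (Hypothetical helper name.)
helpers_url() {
    printf '%s/v2.0/routers/%s/conntrack_helpers\n' "${1%/}" "$2"
}

# list_router_helpers BASE_URL ROUTER_ID TOKEN: GET the helpers configured on
# a router (assumes the list call from the Neutron API reference).
list_router_helpers() {
    curl -sS -X GET "$(helpers_url "$1" "$2")" -H "X-Auth-Token: $3"
}

# Hypothetical usage, reusing the variables set in the comment above:
#   list_router_helpers "$neutronUrl" "$ROUTER" "$token"
```

Stripping the trailing slash is a small precaution in case the catalog endpoint is registered with one.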