Bug 2229813

Summary: DHCP requests from DCN computes over L3 relay do not reach tap device in namespace during node cleanup
Product: Red Hat OpenStack Reporter: Jaison Raju <jraju>
Component: rhosp-director Assignee: OSP Team <rhos-maint>
Status: CLOSED NOTABUG QA Contact: David Rosenfeld <drosenfe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17.1 (Wallaby) CC: bshephar, dalvarez, jraju, mburns, morazi
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-08-09 14:55:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jaison Raju 2023-08-07 19:26:09 UTC
Description of problem:
Nodes in dcn1 are in a different rack from the central site, where the undercloud is located.
Once the Ironic nodes are imported and brought to the available state, Ironic tries to run the cleaning task on each node.
During this PXE process, the DHCP request reaches the undercloud over the L3 route, but does not reach the neutron namespace or the tap device.
The packets are visible on br-ctlplane.
The ironic-inspector-dnsmasq ignores these DHCP requests (as expected).
I have successfully introspected the same node in the past on an older puddle, so this could be a regression.

Version-Release number of selected component (if applicable):
17.1-RHEL-9/RHOS-17.1-RHEL-9-20230628.n.2

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:
The DCN node's DHCP request from PXE boot during the Ironic cleaning process should reach the tap device on which neutron dnsmasq is listening.

Additional info:

Comment 3 Jaison Raju 2023-08-09 14:55:55 UTC
jaison--
My mistake: as per the documentation, what I should have been using is unicast DHCP relay, which needs to forward the traffic to the IP on the tap device.
19:24:19.400264 <switch gw mac> > <undercloud mac>, ethertype IPv4 (0x0800), length 590: (tos 0x0, ttl 62, id 53772, offset 0, flags [DF], proto UDP (17), length 576)                                                                     
    172.151.2.254.bootps > 172.100.1.1.bootps: [udp sum ok] BOOTP/DHCP, Request from <dcn node mac>, length 548, hops 1, xid 0x91be8bef, Flags [Broadcast] (0x8000)

Or, in other words, to the first IP in the range, which is the IP/interface dnsmasq would use:
dhcp_start = 172.100.1.10
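
The relationship between the relay destination and dhcp_start can be checked with a short script. This is only a minimal sketch: the helper names (relay_destination, relay_targets_dhcp_start) are hypothetical and not part of any OpenStack tooling, and it assumes the tcpdump line format shown in the capture in this comment.

```python
import ipaddress
import re

# Hypothetical helpers for illustration only; not part of director or neutron.
# Matches the "<src>.bootps > <dst>.bootps" portion of a tcpdump line.
_RELAY_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.bootps > (?P<dst>\d+\.\d+\.\d+\.\d+)\.bootps"
)


def relay_destination(tcpdump_line):
    """Return the destination IP of a relayed BOOTP packet, or None if absent."""
    match = _RELAY_RE.search(tcpdump_line)
    if match is None:
        return None
    return ipaddress.ip_address(match.group("dst"))


def relay_targets_dhcp_start(tcpdump_line, dhcp_start):
    """True if the relayed packet is addressed to the dhcp_start IP."""
    dst = relay_destination(tcpdump_line)
    return dst is not None and dst == ipaddress.ip_address(dhcp_start)


# Line taken from the capture in this comment, truncated after the protocol.
captured = "    172.151.2.254.bootps > 172.100.1.1.bootps: [udp sum ok] BOOTP/DHCP, Request"
print(relay_targets_dhcp_start(captured, "172.100.1.10"))
```

For the captured line this prints False, because the relay destination 172.100.1.1 differs from dhcp_start.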

Fixing the gateway on the DHCP relay resolved the issue.
Thanks to Julia Kreger for reviewing this BZ.
Thanks to Daniel for identifying the issue.