Bug 2027477
| Summary: | ipvlan-dhcp configuration: Pod stuck in 'ContainerCreating' state with 'error calling DHCP.Allocate: no more tries' | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Julie <mjulie> |
| Component: | Documentation | Assignee: | Mike McKiernan <mmckiern> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Weibin Liang <weliang> |
| Severity: | medium | Docs Contact: | Vikram Goyal <vigoyal> |
| Priority: | medium | | |
| Version: | 4.10 | CC: | aos-bugs, dorzel, mjtarsel, mmckiern, mtarsel, pdsilva, weliang |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | ppc64le | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-01-12 12:45:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Julie
2021-11-29 18:27:22 UTC
Thanks for providing the steps. From my initial analysis, it looks like the DHCP daemon you're using (apparently dnsmasq) may not be on the same broadcast domain as the IPVLAN networks.

Typically, when the dhcp-daemon-* pods are running on each node and you still get a failure with `error calling DHCP.Allocate: no more tries`, it means that the DHCP traffic hasn't been seen on the interface that's been plumbed, e.g. the interface that would be created for your `k8s.v1.cni.cncf.io/networks: ipvlan-dhcp` annotation.

Additionally, it looks like you have a virtual interface, `ipvlan-dhcp@env2`, and you're using the master interface for your ipvlan secondary network with a master value like so: `"master": "env2"`

In this case, I believe that the interface `ipvlan-dhcp@env2` and the ipvlan network you're creating with a master of `env2` will not be on the same broadcast domain. This is why the DHCP service isn't seen on the ipvlan additional interface in pods that use `env2` as a master interface. You may be able to use bridging to bring these networks together, though.

Unfortunately, there's no support in OCP for a native DHCP server for secondary networks; DHCP for secondary networks relies on an external DHCP server (i.e. bring-your-own DHCP).

(In reply to Douglas Smith from comment #1)
> Additionally, it looks like you have a virtual interface:
>
> ipvlan-dhcp@env2
>
> And then you're using the master interface for your ipvlan secondary network with a master value like so:
>
> "master": "env2"
>
> And in this case, I believe that the interface ipvlan-dhcp@env2 and the ipvlan network you're creating that has a master of env2 will not be on the same broadcast domain, so... This is why the DHCP service isn't seen on the ipvlan additional interface in pods that use env2 as a master interface.

Tried again with the CNO master value set to 'ipvlan-dhcp0' (virtual interface on worker-0 node), but the issue persists.
Detailed steps are provided in the attached 'ipvlan-issue-recreated.txt' file. Could you please take a look? Thanks.

Created attachment 1844197: Issue recreated

After more discussion, and thanks to excellent notes from Weibin in QE, we determined that this is a documentation bug. Essentially, IPVLAN + DHCP is problematic because IPVLAN interfaces share the MAC address with the host interface. It's our assessment that the IPVLAN + DHCP configuration should be removed from the documentation.

Julie -- thanks for the report and the attention to detail, and our apologies for burning cycles on a documentation mistake.

@weliang, PTAL: https://github.com/openshift/openshift-docs/pull/39718

The documentation was reorganized after Julie opened this BZ, so the URLs Julie provided are 404s now.

(In reply to Mike McKiernan from comment #8)
> @weliang , PTAL: https://github.com/openshift/openshift-docs/pull/39718
>
> The documentation was reorganized after Julie opened this BZ, so the URLs Julie provided are 404s now.

Here is the updated doc: https://docs.openshift.com/container-platform/4.9/networking/multiple_networks/configuring-additional-network.html#nw-multus-ipvlan-object_configuring-additional-network

https://github.com/openshift/openshift-docs/pull/39718 looks good to me.

I reviewed the updated doc: https://docs.openshift.com/container-platform/4.9/networking/multiple_networks/configuring-additional-network.html#nw-multus-ipvlan-object_configuring-additional-network

The changes look good; however, the "ipvlan configuration example" section still shows 'dhcp' as the ipam type. This needs to be removed: `"ipam": { "type": "dhcp" }`

I have no explanation for overlooking the obvious change that Julie kindly pointed out a second time.

PR: https://github.com/openshift/openshift-docs/pull/40269

Julie, please check at your earliest convenience and thank you again.

(In reply to Mike McKiernan from comment #12)
> PR: https://github.com/openshift/openshift-docs/pull/40269
>
> Julie, please check at your earliest convenience and thank you again.

The changes look good to me. Thanks.
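
For readers landing on this bug: the resolution above amounts to not using `dhcp` as the ipam type for an ipvlan additional network. The following is a minimal sketch of an ipvlan CNI configuration that uses the `static` IPAM plugin instead. It is not copied from the linked PRs. The network name `ipvlan-static`, the `l2` mode, the `cniVersion` value, and the address `192.0.2.10/24` (a documentation-range address) are assumptions chosen for illustration; `env2` is the master interface mentioned earlier in this bug.

```json
{
  "cniVersion": "0.3.1",
  "name": "ipvlan-static",
  "type": "ipvlan",
  "master": "env2",
  "mode": "l2",
  "ipam": {
    "type": "static",
    "addresses": [
      {
        "address": "192.0.2.10/24"
      }
    ]
  }
}
```

Because the address is fixed in the configuration, nothing here depends on a DHCP server telling apart interfaces that share the host interface's MAC address. Other non-DHCP IPAM types may also be suitable; this sketch only covers the static case.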