Description of problem:

Using Multus, you can assign an IPv6 address to the second interface of a pod. When assigning an IP from a '00fd:xx' range (written 'fd:' below), it fails with message [a].

[a] '... AddNetwork: netplugin failed with no error message'

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
{
  "cniVersion": "0.3.1",
  "name": "cos-ipv6",
  "type": "macvlan",
  "master": "ens3",
  "mode": "bridge",
  "ipam": {
    "type": "whereabouts",
    "datastore": "kubernetes",
    "kubernetes": { "kubeconfig": "/etc/kubernetes/cni/net.d/whereabouts.d/whereabouts.kubeconfig" },
    "range": "fd:100:200:30:371::/112",
    "range_start": "fd:100:200:30:371::0",
    "range_end": "fd:100:200:30:371::ffff",
    "gateway": "fd:100:200:30:371::1"
  }
}
>>>>
Type     Reason                  Age   From                                 Message
----     ------                  ----  ----                                 -------
Normal   AddedInterface          114m  multus                               Add eth0 [10.129.2.70/23]
Warning  FailedCreatePodSandBox  114m  kubelet, worker03.ss.samsung.local   Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_mariadb-0_cos_b99ca130-5e6a-431d-9ac4-06746b5ad56a_0(7ebf8457a5ce2b866b12d32e3ba61d3842993287868deac2dde5b33c8a6ecb58): Multus: [cos/mariadb-0]: error adding container to network "cos-ipv6": delegateAdd: error invoking confAdd - "macvlan": error in getting result from AddNetwork: netplugin failed with no error message
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

However, when assigning an IP from a 'fd00:xx' range instead, as below, everything is fine.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    "range": "fd00:100:200:30:371::/112",
    "range_start": "fd00:100:200:30:371::0",
    "range_end": "fd00:100:200:30:371::ffff",
    "gateway": "fd00:100:200:30:371::1"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Because the message is 'no error message', we cannot see why it fails.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Assign the IPv6 range mentioned above
2.
3.

Actual results:
It fails with 'no error message'.

Expected results:
We are able to see why it fails, or the IP is assigned.

Additional info:
It seems the IPv6 range '0000::/8', which '00fd:..' belongs to, is "Reserved by IETF".
- https://www.iana.org/assignments/ipv6-address-space/ipv6-address-space.xhtml
- https://tools.ietf.org/html/rfc3513#section-2.5

Does OCP4 implement the IETF reserved space?
I've been able to replicate the error reliably; thanks for the information that helped me reproduce it.

My initial finding is that this error occurs when the first (most significant) hextet of an IPv6 address in a Whereabouts range is shortened, i.e. its leading zeroes are removed.

As a workaround, you can allocate addresses if the most-significant byte of the IPv6 address is non-zero, as you have observed. For example, this configuration should work:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
{
  "cniVersion": "0.3.1",
  "name": "cos-ipv6",
  "type": "macvlan",
  "master": "ens3",
  "mode": "bridge",
  "ipam": {
    "type": "whereabouts",
    "range": "fd00:100:200:30:371::/112",
    "gateway": "fd00:100:200:30:371::1"
  }
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Important notes:

1. The Whereabouts IP Address Management CNI plugin is a tech preview in OCP 4.5, so backports of fixes to 4.5 are unlikely.
2. Newer versions of the Whereabouts plugin do show more information about a failure.
3. In my example I removed the `datastore` and `kubernetes` fields. These are not necessary for Whereabouts in OpenShift (or in most recent versions of the upstream project).
4. I removed `range_start` and `range_end`. These are not supported fields in Whereabouts (they may have been copied from a host-local configuration, which uses these values).

I have also filed an upstream issue @ https://github.com/dougbtv/whereabouts/issues/84

In theory, Whereabouts should be able to allocate addresses in the reserved '0000::/8' space. The failure is due to a technical limitation: testing was only performed outside of this space.
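The notes above can be turned into a quick pre-flight check on a config. The following is a rough sketch only: `lintIPAM` is a hypothetical helper written for this comment, not part of Whereabouts or Multus, and its list of fields simply mirrors what is stated in this bug. It assumes `range` is a plain CIDR and returns an error for any other format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// lintIPAM is a hypothetical helper, not part of Whereabouts. It returns
// warnings for the "ipam" section of a CNI config, based on this bug's notes.
func lintIPAM(conf []byte) ([]string, error) {
	var parsed struct {
		IPAM map[string]json.RawMessage `json:"ipam"`
	}
	if err := json.Unmarshal(conf, &parsed); err != nil {
		return nil, err
	}

	var warnings []string
	// Fields described in this bug as unsupported or unnecessary.
	notes := []struct{ field, note string }{
		{"range_start", "reported as not supported"},
		{"range_end", "reported as not supported"},
		{"datastore", "reported as not necessary"},
		{"kubernetes", "reported as not necessary"},
	}
	for _, n := range notes {
		if _, ok := parsed.IPAM[n.field]; ok {
			warnings = append(warnings, n.field+": "+n.note)
		}
	}

	raw, ok := parsed.IPAM["range"]
	if !ok {
		return warnings, nil
	}
	var rng string
	if err := json.Unmarshal(raw, &rng); err != nil {
		return nil, err
	}
	// Assumption: the range is a plain CIDR such as "fd00::/112".
	ip, _, err := net.ParseCIDR(rng)
	if err != nil {
		return nil, err
	}
	// An IPv6 range whose most-significant byte is zero hits this bug.
	if ip.To4() == nil && ip[0] == 0 {
		warnings = append(warnings, "range: most-significant byte is zero")
	}
	return warnings, nil
}

func main() {
	conf := []byte(`{"ipam": {"type": "whereabouts",
		"range": "fd:100:200:30:371::/112",
		"range_start": "fd:100:200:30:371::0"}}`)
	warnings, err := lintIPAM(conf)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, w := range warnings {
		fmt.Println(w)
	}
}
```

Run against the workaround configuration, the same check should return no warnings.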
Regarding note 2 above, the error message now reads as:

```
panic: runtime error: index out of range [16] with length 16

goroutine 1 [running]:
github.com/dougbtv/whereabouts/pkg/allocate.BigIntToIP(0xc0002e6700, 0xc0000ebb30, 0x2, 0x6, 0x4, 0x4, 0xc0002c3200)
	/go/src/github.com/dougbtv/whereabouts/pkg/allocate/allocate.go:235 +0x15e
```

This was useful for me to debug. However, it's not actionable by a customer; it's due to a limitation in the application which wasn't properly tested for or accounted for.
To validate this fix, it should be as simple as using any IPv6 range whose most-significant byte is zero, for example `fd::1/116` (that is, `00fd::1/116`). If the bug is present, Whereabouts will error; if an IPv6 address is assigned, the fix is working. Thanks!
[weliang@weliang ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-03-01-070854   True        False         12m     Cluster version is 4.8.0-0.nightly-2021-03-01-070854
[weliang@weliang ~]$ oc get net-attach-def macvlan-bridge-whereabouts-v6 -o yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  creationTimestamp: "2021-03-01T16:05:55Z"
  generation: 1
  managedFields:
  - apiVersion: k8s.cni.cncf.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:config: {}
    manager: kubectl-create
    operation: Update
    time: "2021-03-01T16:05:55Z"
  name: macvlan-bridge-whereabouts-v6
  namespace: test
  resourceVersion: "32659"
  uid: da678a38-e1c4-4d14-a3bf-4118db1cef94
spec:
  config: '{ "cniVersion": "0.3.0", "name": "whereabouts", "type": "macvlan", "mode": "bridge", "ipam": { "type": "whereabouts", "range": "fd:dead:beef:1::1-fd:dead:beef:1::4/64" } }'
[weliang@weliang ~]$ oc exec whereabouts-podv6-1 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
3: eth0@if20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8901 qdisc noqueue state UP group default
    link/ether 0a:58:0a:80:08:0d brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.128.8.13/23 brd 10.128.9.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::858:aff:fe80:80d/64 scope link
       valid_lft forever preferred_lft forever
4: net1@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default
    link/ether da:07:db:9d:b9:d4 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fd:dead:beef:1::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::d807:dbff:fe9d:b9d4/64 scope link
       valid_lft forever preferred_lft forever
[weliang@weliang ~]$
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438
Added QE test coverage for this bug: https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-44941