kube 1.23 introduced a breaking API change in dual-stack services which I'm just noticing now...

In kube 1.21 and 1.22 (OCP 4.8 and 4.9), the apiserver would default the value of `ipFamilyPolicy` to `RequireDualStack` if you created a Service with two `ipFamilies` or two `clusterIPs` but no explicitly-specified `ipFamilyPolicy`:

    kind: Service
    apiVersion: v1
    metadata:
      name: my-service
    spec:
      type: ClusterIP
      ipFamilies:
      - IPv6
      - IPv4
      ports:
      - port: 80
      selector:
        foo: bar

or

    kind: Service
    apiVersion: v1
    metadata:
      name: my-service
    spec:
      type: ClusterIP
      clusterIPs:
      - 172.30.0.99
      - fd02::9999
      ports:
      - port: 80
      selector:
        foo: bar

This turned out to have some tricky and possibly unfixable broken edge cases, so in 1.23 / 4.10, you MUST explicitly specify either "ipFamilyPolicy: PreferDualStack" or "ipFamilyPolicy: RequireDualStack" for the service to be valid. (This was fallout from a MASSIVE rewrite of the apiserver Service-handling code, https://github.com/kubernetes/kubernetes/pull/96684.)

It is hard to say whether any users are actually creating services in this way. Although this behavior was described in the KEP, it never appeared in the official documentation, which always implied that you had to explicitly provide an `ipFamilyPolicy` value (https://github.com/kubernetes/website/blob/release-1.22/content/en/docs/concepts/services-networking/dual-stack.md#services). (It doesn't actually say "you MUST specify ipFamilyPolicy", but it never suggests that it's possible to omit it, and doesn't describe what would happen if you did.)

If we are really concerned about this as an API break, then we could add a mutating webhook to fix things, but presumably we'd have to maintain it forever. A simpler fix might be to just modify 4.8 and 4.9 to warn loudly if the user tries to create such a service?

We also need to release-note this, and should explicitly mention it to known large dual-stack-using customers.
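As a rough aid for anyone auditing their own manifests, the condition described above (two `ipFamilies` or two `clusterIPs`, and no `ipFamilyPolicy`) can be checked mechanically. The following is only a sketch: the function names `relies_on_policy_defaulting` and `with_explicit_policy` are made up for illustration, the manifest is assumed to be already parsed into a plain dict, and filling in `RequireDualStack` is simply the value that 1.21 / 1.22 would have defaulted to, not necessarily what every user wants.

```python
import copy
from typing import Any, Dict

# Example manifest from the description above, already parsed into a dict.
IMPLICIT_BY_FAMILIES: Dict[str, Any] = {
    "kind": "Service",
    "apiVersion": "v1",
    "metadata": {"name": "my-service"},
    "spec": {
        "type": "ClusterIP",
        "ipFamilies": ["IPv6", "IPv4"],
        "ports": [{"port": 80}],
        "selector": {"foo": "bar"},
    },
}


def relies_on_policy_defaulting(service: Dict[str, Any]) -> bool:
    """True if the manifest lists two ipFamilies or two clusterIPs
    but leaves ipFamilyPolicy unset (hypothetical helper)."""
    spec = service.get("spec") or {}
    if spec.get("ipFamilyPolicy"):
        # The policy is explicit, so nothing depends on apiserver defaulting.
        return False
    families = spec.get("ipFamilies") or []
    cluster_ips = spec.get("clusterIPs") or []
    return len(families) >= 2 or len(cluster_ips) >= 2


def with_explicit_policy(service: Dict[str, Any],
                         policy: str = "RequireDualStack") -> Dict[str, Any]:
    """Return a copy of the manifest with ipFamilyPolicy filled in.

    The default mirrors what 1.21 / 1.22 chose implicitly; that choice
    is an assumption of this sketch.
    """
    fixed = copy.deepcopy(service)
    if relies_on_policy_defaulting(fixed):
        fixed["spec"]["ipFamilyPolicy"] = policy
    return fixed


if __name__ == "__main__":
    print(relies_on_policy_defaulting(IMPLICIT_BY_FAMILIES))
    print(with_explicit_policy(IMPLICIT_BY_FAMILIES)["spec"]["ipFamilyPolicy"])
```

A manifest that already sets `ipFamilyPolicy` is returned unchanged, so running the helper over a whole directory of manifests should be safe.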
For reference, an admission webhook for services is a no-go.
Tasks:
* 1. write 4.8, 4.9, 4.10 release notes
* 2. add metrics and an alert in 4.9 and 4.8 when users write these objects
* 3. potentially block upgrade when those objects have been written recently. We do that similarly with old API versions being used in the last day or so.

For 2., use metadata.managedFields:
~~~
[root@openshift-jumpserver-0 ~]# oc create -f nginx-dualstack.yaml
service/nginx-dualstack created
[root@openshift-jumpserver-0 ~]# oc get svc nginx-dualstack -o yaml --show-managed-fields
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2022-01-28T11:46:20Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:internalTrafficPolicy: {}
        f:ipFamilies: {}
        f:ports:
          .: {}
          k:{"port":80,"protocol":"TCP"}:
            .: {}
            f:port: {}
            f:protocol: {}
            f:targetPort: {}
        f:selector: {}
        f:sessionAffinity: {}
        f:type: {}
    manager: kubectl-create
    operation: Update
    time: "2022-01-28T11:46:20Z"
  name: nginx-dualstack
  namespace: openshift-nfs-storage
  resourceVersion: "470944"
  uid: 1d1d1ce1-ecc3-4806-816c-042ef9e884fe
spec:
  clusterIP: fd02::7a6b
  clusterIPs:
  - fd02::7a6b
  - 172.30.218.223
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv6
  - IPv4
  ipFamilyPolicy: RequireDualStack
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
[root@openshift-jumpserver-0 ~]# cat nginx-dualstack.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-dualstack
spec:
  type: ClusterIP
  selector:
    app: nginx
  # ipFamilyPolicy: RequireDualStack
  #ipFamilyPolicy: SingleStack
  ipFamilies:
  - IPv6
  - IPv4
  ports:
  # By default and for convenience, the `targetPort` is set to the same value as the `port` field.
  - port: 80
    targetPort: 80
~~~
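In the output above, `ipFamilyPolicy: RequireDualStack` shows up in `spec`, but there is no `f:ipFamilyPolicy` entry under `managedFields`, only `f:ipFamilies`. That gap is what the managedFields idea relies on: the client set the families and the apiserver filled in the policy. A minimal sketch of that check follows. It assumes the Service has been fetched as JSON (for example via `oc get svc ... -o json --show-managed-fields`) and parsed into a dict; the function names are invented for illustration and this is not the code that shipped in the fix.

```python
from typing import Any, Dict, Set

# Trimmed-down version of the object shown above, as a parsed dict.
NGINX_DUALSTACK: Dict[str, Any] = {
    "metadata": {
        "name": "nginx-dualstack",
        "managedFields": [
            {
                "manager": "kubectl-create",
                "operation": "Update",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                    "f:spec": {
                        "f:internalTrafficPolicy": {},
                        "f:ipFamilies": {},
                        "f:selector": {},
                        "f:type": {},
                    }
                },
            }
        ],
    },
    "spec": {
        "clusterIPs": ["fd02::7a6b", "172.30.218.223"],
        "ipFamilies": ["IPv6", "IPv4"],
        "ipFamilyPolicy": "RequireDualStack",
    },
}


def owned_spec_fields(service: Dict[str, Any]) -> Set[str]:
    """Collect every f:spec key that any field manager claims to own."""
    owned: Set[str] = set()
    metadata = service.get("metadata") or {}
    for entry in metadata.get("managedFields") or []:
        spec_fields = (entry.get("fieldsV1") or {}).get("f:spec") or {}
        owned.update(spec_fields.keys())
    return owned


def created_without_explicit_policy(service: Dict[str, Any]) -> bool:
    """True if a client set ipFamilies/clusterIPs on a dual-stack Service
    while no client ever set ipFamilyPolicy (hypothetical helper)."""
    spec = service.get("spec") or {}
    is_dual = (len(spec.get("ipFamilies") or []) >= 2
               or len(spec.get("clusterIPs") or []) >= 2)
    owned = owned_spec_fields(service)
    client_set_dual_fields = bool({"f:ipFamilies", "f:clusterIPs"} & owned)
    return is_dual and client_set_dual_fields and "f:ipFamilyPolicy" not in owned


if __name__ == "__main__":
    print(created_without_explicit_policy(NGINX_DUALSTACK))
```

One caveat with this approach: managedFields can be rewritten or dropped by later updates, so a `False` result is weaker evidence than a `True` one.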
Tested and verified in 4.9.0-0.nightly-2022-03-03-120755
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.24 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0798