Bug 2025949

Summary: Enabling PROXY protocol on the ingress controller doesn't work when the cluster has been upgraded from a previous version.
Product: OpenShift Container Platform Reporter: Alfredo Pizarro <apizarro>
Component: NetworkingAssignee: aos-network-edge-staff <aos-network-edge-staff>
Networking sub component: router QA Contact: Arvind iyengar <aiyengar>
Status: CLOSED DUPLICATE Docs Contact:
Severity: medium    
Priority: unspecified CC: aos-bugs, bpickard, hongli, mmasters
Version: 4.7   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-24 19:34:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Alfredo Pizarro 2021-11-23 13:16:20 UTC
Description of problem:

A customer needs to configure the ingress controllers to use the PROXY protocol so they can whitelist IP addresses in their routes. The customer followed the steps outlined in the documentation (https://docs.openshift.com/container-platform/4.8/networking/ingress-operator.html#nw-ingress-controller-configuration-proxy-protocol_configuring-ingress), but once they configure the ingresscontroller, the option is not applied to the router-default deployment or to the pods.
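For reference, enabling the PROXY protocol on a HostNetwork ingresscontroller boils down to a merge patch along these lines (a minimal sketch matching the spec shown below; adjust if your endpointPublishingStrategy differs):

$ oc -n openshift-ingress-operator patch ingresscontroller/default \
    --type=merge \
    --patch='{"spec":{"endpointPublishingStrategy":{"type":"HostNetwork","hostNetwork":{"protocol":"PROXY"}}}}'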

After some investigation I was able to reproduce the issue and validate that it is present only when the cluster has been upgraded from a previous version. On a fresh 4.8 cluster, the configuration works:

OCP UPI cluster upgraded from 4.7.36 to 4.8.18:

Ingresscontroller config:
========================
---
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  creationTimestamp: "2021-11-20T20:15:55Z"
  finalizers:
  - ingresscontroller.operator.openshift.io/finalizer-ingresscontroller
  generation: 2 <--- the change in the controller was saved
  name: default
  namespace: openshift-ingress-operator
  resourceVersion: "910871"
  uid: d49258ea-3af4-4820-9ffd-23bf23d37f6e
spec:
  endpointPublishingStrategy:
    hostNetwork:
      protocol: PROXY
    type: HostNetwork
  replicas: 2

Deployment not showing the ROUTER_USE_PROXY_PROTOCOL variable
==============================================
$ omg get deployment router-default -o yaml |grep -i proxy
          value: haproxy
          value: /var/lib/haproxy/conf/metrics-auth/statsPassword
          value: /var/lib/haproxy/conf/metrics-auth/statsUsername
        - mountPath: /var/lib/haproxy/conf/metrics-auth


Fresh OCP UPI 4.8.18 cluster:

Ingresscontroller config:
========================
apiVersion: v1
items:
- apiVersion: operator.openshift.io/v1
  kind: IngressController
  metadata:
    creationTimestamp: "2021-11-22T14:20:55Z"
    finalizers:
    - ingresscontroller.operator.openshift.io/finalizer-ingresscontroller
    generation: 2 <-- configuration saved.
    name: default
    namespace: openshift-ingress-operator
    resourceVersion: "148518"
    uid: d60ab0d3-e4a5-4c98-a8ac-fe3cced4fcd1
  spec:
    endpointPublishingStrategy:
      hostNetwork:
        protocol: PROXY
      type: HostNetwork
    httpErrorCodePages:
      name: ""
    replicas: 2
    tuningOptions: {}
    unsupportedConfigOverrides: null


Deployment and pods show the ROUTER_USE_PROXY_PROTOCOL variable set to true:
===========================================
$ oc get deployment router-default -o yaml |grep ROUTER_USE_PROXY_PROTOCOL -A1
        - name: ROUTER_USE_PROXY_PROTOCOL
          value: "true"

$ oc get pod router-default-55b7fb778f-9gsm6 -o yaml | grep ROUTER_USE_PROXY_PROTOCOL -A1
    - name: ROUTER_USE_PROXY_PROTOCOL
      value: "true"
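When the variable is applied, its effect can also be confirmed inside a router pod (a sketch; it assumes the router image renders ROUTER_USE_PROXY_PROTOCOL as accept-proxy on the haproxy bind lines and that the generated config lives at the path below):

$ oc -n openshift-ingress exec deploy/router-default -- grep accept-proxy /var/lib/haproxy/conf/haproxy.config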


Version-Release number of selected component (if applicable):
4.8.18

How reproducible:
On a UPI OCP cluster that has been upgraded to 4.8 from a previous version, enable the PROXY protocol on the default ingress controller.


Steps to Reproduce:
1. Enable the PROXY protocol option on the default ingress controller.
2. Check that protocol: PROXY has been saved in the ingresscontroller configuration.
3. Check the router-default deployment for the ROUTER_USE_PROXY_PROTOCOL environment variable.

Actual results:
- The option is not applied to the router-default deployment.

Expected results:
- The option should work, and the following environment variable should be present in the deployment:

    - name: ROUTER_USE_PROXY_PROTOCOL
      value: "true"
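A quick way to check for the variable (a sketch using a standard jsonpath filter; container index 0 is assumed to be the router container):

$ oc -n openshift-ingress get deployment/router-default \
    -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="ROUTER_USE_PROXY_PROTOCOL")].value}'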


Additional info:

Comment 2 Alfredo Pizarro 2021-11-24 13:31:18 UTC
I've been comparing the two clusters and I see they are using the same ingress-operator image:

Cluster upgraded to 4.8.18
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4554a00a28eb7768e90bf3ecc2a8933e325c10a96d3b0326ecfbdabee076aca8

Fresh 4.8.18 cluster
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4554a00a28eb7768e90bf3ecc2a8933e325c10a96d3b0326ecfbdabee076aca8
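For reference, the digests above can be read straight from the running operator deployment (a sketch; it assumes the operator container is the first entry in the pod spec):

$ oc -n openshift-ingress-operator get deployment/ingress-operator \
    -o jsonpath='{.spec.template.spec.containers[0].image}'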


When the PROXY protocol is enabled on the upgraded cluster, I see the generation number changes and the settings are saved:

---
spec:
  endpointPublishingStrategy:
    hostNetwork:
      protocol: PROXY
    type: HostNetwork
  replicas: 2
---

However, an "updated router deployment" message is not seen in the ingress-operator pod logs, and the router-default deployment is not updated. But if I change another setting, such as replicas, the "updated router deployment" message does appear, yet the PROXY setting is still not reflected in the ingress-operator logs:

2021-11-24T13:20:20.657Z	INFO	operator.ingress_controller	ingress/deployment.go:103	updated router deployment	{"namespace": "openshift-ingress", "name": "router-default", "diff": "  &v1.Deployment{\n  \tTypeMeta:   {},\n  \tObjectMeta: {Name: \"router-default\", Namespace: \"openshift-ingress\", UID: \"5ebcf5e4-5de0-451d-b537-30c8666dd856\", ResourceVersion: \"259866\", ...},\n  \tSpec: v1.DeploymentSpec{\n- \t\tReplicas: &2,\n+ \t\tReplicas: &1,\n  \t\tSelector: &{MatchLabels: {\"ingresscontroller.operator.openshift.io/deployment-ingresscontroller\": \"default\"}},\n  \t\tTemplate: v1.PodTemplateSpec{\n  \t\t\tObjectMeta: {Labels: {\"ingresscontroller.operator.openshift.io/deployment-ingresscontroller\": \"default\", \"ingresscontroller.operator.openshift.io/hash\": \"56979c4fb9\"}, Annotations: {\"target.workload.openshift.io/management\": `{\"effect\": \"PreferredDuringScheduling\"}`, \"unsupported.do-not-use.openshift.io/override-liveness-grace-period-seconds\": \"10\"}},\n  \t\t\tSpec: v1.PodSpec{\n  \t\t\t\tVolumes: []v1.Volume{\n  \t\t\t\t\t{Name: \"default-certificate\", VolumeSource: {Secret: &{SecretName: \"router-certs-default\", DefaultMode: &420}}},\n  \t\t\t\t\t{\n  \t\t\t\t\t\tName: \"service-ca-bundle\",\n  \t\t\t\t\t\tVolumeSource: v1.VolumeSource{\n  \t\t\t\t\t\t\t... // 16 identical fields\n  \t\t\t\t\t\t\tFC:        nil,\n  \t\t\t\t\t\t\tAzureFile: nil,\n  \t\t\t\t\t\t\tConfigMap: &v1.ConfigMapVolumeSource{\n  \t\t\t\t\t\t\t\tLocalObjectReference: {Name: \"service-ca-bundle\"},\n  \t\t\t\t\t\t\t\tItems:                {{Key: \"service-ca.crt\", Path: \"service-ca.crt\"}},\n- \t\t\t\t\t\t\t\tDefaultMode:          &420,\n+ \t\t\t\t\t\t\t\tDefaultMode:          nil,\n  \t\t\t\t\t\t\t\tOptional:             &false,\n  \t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\tVsphereVolume: nil,\n  \t\t\t\t\t\t\tQuobyte:       nil,\n  \t\t\t\t\t\t\t... // 8 identical fields\n  \t\t\t\t\t\t},\n  \t\t\t\t\t},\n  \t\t\t\t\t{\n  \t\t\t\t\t\tName: \"stats-auth\",\n  \t\t\t\t\t\tVolumeSource: v1.VolumeSource{\n  \t\t\t\t\t\t\t... // 3 identical fields\n  \t\t\t\t\t\t\tAWSElasticBlockStore: nil,\n  \t\t\t\t\t\t\tGitRepo:              nil,\n  \t\t\t\t\t\t\tSecret: &v1.SecretVolumeSource{\n  \t\t\t\t\t\t\t\tSecretName:  \"router-stats-default\",\n  \t\t\t\t\t\t\t\tItems:       nil,\n- \t\t\t\t\t\t\t\tDefaultMode: &420,\n+ \t\t\t\t\t\t\t\tDefaultMode: nil,\n  \t\t\t\t\t\t\t\tOptional:    nil,\n  \t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\tNFS:   nil,\n  \t\t\t\t\t\t\tISCSI: nil,\n  \t\t\t\t\t\t\t... // 21 identical fields\n  \t\t\t\t\t\t},\n  \t\t\t\t\t},\n  \t\t\t\t\t{\n  \t\t\t\t\t\tName: \"metrics-certs\",\n  \t\t\t\t\t\tVolumeSource: v1.VolumeSource{\n  \t\t\t\t\t\t\t... // 3 identical fields\n  \t\t\t\t\t\t\tAWSElasticBlockStore: nil,\n  \t\t\t\t\t\t\tGitRepo:              nil,\n  \t\t\t\t\t\t\tSecret: &v1.SecretVolumeSource{\n  \t\t\t\t\t\t\t\tSecretName:  \"router-metrics-certs-default\",\n  \t\t\t\t\t\t\t\tItems:       nil,\n- \t\t\t\t\t\t\t\tDefaultMode: &420,\n+ \t\t\t\t\t\t\t\tDefaultMode: nil,\n  \t\t\t\t\t\t\t\tOptional:    nil,\n  \t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\tNFS:   nil,\n  \t\t\t\t\t\t\tISCSI: nil,\n  \t\t\t\t\t\t\t... // 21 identical fields\n  \t\t\t\t\t\t},\n  \t\t\t\t\t},\n  \t\t\t\t},\n  \t\t\t\tInitContainers: nil,\n  \t\t\t\tContainers: []v1.Container{\n  \t\t\t\t\t{\n  \t\t\t\t\t\t... 
// 9 identical fields\n  \t\t\t\t\t\tVolumeMounts:  {{Name: \"default-certificate\", ReadOnly: true, MountPath: \"/etc/pki/tls/private\"}, {Name: \"service-ca-bundle\", ReadOnly: true, MountPath: \"/var/run/configmaps/service-ca\"}, {Name: \"stats-auth\", ReadOnly: true, MountPath: \"/var/lib/haproxy/conf/metrics-auth\"}, {Name: \"metrics-certs\", ReadOnly: true, MountPath: \"/etc/pki/tls/metrics-certs\"}},\n  \t\t\t\t\t\tVolumeDevices: nil,\n  \t\t\t\t\t\tLivenessProbe: &v1.Probe{\n  \t\t\t\t\t\t\tHandler: v1.Handler{\n  \t\t\t\t\t\t\t\tExec: nil,\n  \t\t\t\t\t\t\t\tHTTPGet: &v1.HTTPGetAction{\n  \t\t\t\t\t\t\t\t\tPath:        \"/healthz\",\n  \t\t\t\t\t\t\t\t\tPort:        {IntVal: 1936},\n  \t\t\t\t\t\t\t\t\tHost:        \"localhost\",\n- \t\t\t\t\t\t\t\t\tScheme:      \"HTTP\",\n+ \t\t\t\t\t\t\t\t\tScheme:      \"\",\n  \t\t\t\t\t\t\t\t\tHTTPHeaders: nil,\n  \t\t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\t\tTCPSocket: nil,\n  \t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\tInitialDelaySeconds:           0,\n- \t\t\t\t\t\t\tTimeoutSeconds:                1,\n+ \t\t\t\t\t\t\tTimeoutSeconds:                0,\n- \t\t\t\t\t\t\tPeriodSeconds:                 10,\n+ \t\t\t\t\t\t\tPeriodSeconds:                 0,\n- \t\t\t\t\t\t\tSuccessThreshold:              1,\n+ \t\t\t\t\t\t\tSuccessThreshold:              0,\n- \t\t\t\t\t\t\tFailureThreshold:              3,\n+ \t\t\t\t\t\t\tFailureThreshold:              0,\n  \t\t\t\t\t\t\tTerminationGracePeriodSeconds: nil,\n  \t\t\t\t\t\t},\n  \t\t\t\t\t\tReadinessProbe: &v1.Probe{\n  \t\t\t\t\t\t\tHandler: v1.Handler{\n  \t\t\t\t\t\t\t\tExec: nil,\n  \t\t\t\t\t\t\t\tHTTPGet: &v1.HTTPGetAction{\n  \t\t\t\t\t\t\t\t\tPath:        \"/healthz/ready\",\n  \t\t\t\t\t\t\t\t\tPort:        {IntVal: 1936},\n  \t\t\t\t\t\t\t\t\tHost:        \"localhost\",\n- \t\t\t\t\t\t\t\t\tScheme:      \"HTTP\",\n+ \t\t\t\t\t\t\t\t\tScheme:      \"\",\n  \t\t\t\t\t\t\t\t\tHTTPHeaders: nil,\n  \t\t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\t\tTCPSocket: nil,\n  \t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\tInitialDelaySeconds:           0,\n- \t\t\t\t\t\t\tTimeoutSeconds:                1,\n+ \t\t\t\t\t\t\tTimeoutSeconds:                0,\n- \t\t\t\t\t\t\tPeriodSeconds:                 10,\n+ \t\t\t\t\t\t\tPeriodSeconds:                 0,\n- \t\t\t\t\t\t\tSuccessThreshold:              1,\n+ \t\t\t\t\t\t\tSuccessThreshold:              0,\n- \t\t\t\t\t\t\tFailureThreshold:              3,\n+ \t\t\t\t\t\t\tFailureThreshold:              0,\n  \t\t\t\t\t\t\tTerminationGracePeriodSeconds: nil,\n  \t\t\t\t\t\t},\n  \t\t\t\t\t\tStartupProbe: &v1.Probe{\n  \t\t\t\t\t\t\tHandler: v1.Handler{\n  \t\t\t\t\t\t\t\tExec: nil,\n  \t\t\t\t\t\t\t\tHTTPGet: &v1.HTTPGetAction{\n  \t\t\t\t\t\t\t\t\tPath:        \"/healthz/ready\",\n  \t\t\t\t\t\t\t\t\tPort:        {IntVal: 1936},\n  \t\t\t\t\t\t\t\t\tHost:        \"localhost\",\n- \t\t\t\t\t\t\t\t\tScheme:      \"HTTP\",\n+ \t\t\t\t\t\t\t\t\tScheme:      \"\",\n  \t\t\t\t\t\t\t\t\tHTTPHeaders: nil,\n  \t\t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\t\tTCPSocket: nil,\n  \t\t\t\t\t\t\t},\n  \t\t\t\t\t\t\tInitialDelaySeconds:           0,\n- \t\t\t\t\t\t\tTimeoutSeconds:                1,\n+ \t\t\t\t\t\t\tTimeoutSeconds:                0,\n  \t\t\t\t\t\t\tPeriodSeconds:                 1,\n- \t\t\t\t\t\t\tSuccessThreshold:              1,\n+ \t\t\t\t\t\t\tSuccessThreshold:              0,\n  \t\t\t\t\t\t\tFailureThreshold:              120,\n  \t\t\t\t\t\t\tTerminationGracePeriodSeconds: nil,\n  \t\t\t\t\t\t},\n  \t\t\t\t\t\tLifecycle:              nil,\n  \t\t\t\t\t\tTerminationMessagePath: 
\"/dev/termination-log\",\n  \t\t\t\t\t\t... // 6 identical fields\n  \t\t\t\t\t},\n  \t\t\t\t},\n  \t\t\t\tEphemeralContainers: nil,\n  \t\t\t\tRestartPolicy:       \"Always\",\n  \t\t\t\t... // 30 identical fields\n  \t\t\t},\n  \t\t},\n  \t\tStrategy:        {Type: \"RollingUpdate\", RollingUpdate: &{MaxUnavailable: &{Type: 1, StrVal: \"25%\"}, MaxSurge: &{}}},\n  \t\tMinReadySeconds: 30,\n  \t\t... // 3 identical fields\n  \t},\n  \tStatus: {ObservedGeneration: 2, Replicas: 2, UpdatedReplicas: 2, ReadyReplicas: 2, ...},\n  }\n"}

Comment 3 Ben Pickard 2021-11-24 14:40:49 UTC
Ingress operator and routing pods are owned by the routing team, moving to them. Feel free to shoot it back if you feel this is a networking issue.

Comment 4 Miciah Dashiel Butler Masters 2021-11-24 19:34:38 UTC

*** This bug has been marked as a duplicate of bug 1997226 ***