Bug 1751903
Summary: | The default proxy of the pods cannot be overwritten by a specific empty value | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Jian Zhang <jiazha>
Component: | OLM | Assignee: | Evan Cordell <ecordell>
OLM sub component: | OLM | QA Contact: | yhui
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | medium | Priority: | medium
Version: | 4.2.0 | Keywords: | Reopened
Target Milestone: | --- | Target Release: | 4.4.z
Hardware: | Unspecified | OS: | Unspecified
CC: | adellape, akashem, bandrade, chuo, dageoffr, dsover, ecordell, jfan, nhale, scolange | |
Whiteboard: | | Fixed In Version: |
Doc Type: | If docs needed, set a value | Doc Text: |
Story Points: | --- | Environment: |
Clone Of: | | : | 1804812 (view as bug list)
Last Closed: | 2020-05-04 11:13:32 UTC | Type: | Bug
Bug Depends On: | 1804812 | Bug Blocks: |
Description
Jian Zhang
2019-09-13 01:55:01 UTC
3. Check the Deployment of the etcd-operator after setting the customized proxy.

Actual results: The customized proxy values were not injected into the operator Deployment.

Jian, can you please verify whether the custom proxy values were actually injected? You have crash-looping pods, which happens when fake proxy env vars are injected into the Pod.

Jian, I assume you are trying to validate whether you can override the default cluster proxy configuration for an existing Pod. There are two ways we can test this:

1. Specify valid proxy env vars in the Subscription. Because you specified fake env vars, the new pods stay in a CrashLoop state and the old pod never gets removed (as Nick has explained). If you specify a valid proxy configuration in the Subscription, the new pod will report healthy and the old pod should be removed.
2. If you don't have a valid proxy setup, you can specify an empty value in the Subscription. An empty proxy var in the Subscription config is special: it is the way for an admin to override the cluster proxy configuration, so the new pod will not have any proxy env var injected.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd-config-test
  namespace: openshift-operators
spec:
  config:
    env:
    - name: HTTP_PROXY
  channel: clusterwide-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4-clusterwide
```

You just need to specify one empty env var in the Subscription config.

Closing this since it looks like there's no issue here. Please re-open if the answers above don't suffice.
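For reference, a quick way to confirm whether the override took effect is to inspect the environment of the regenerated operator pod. This is a minimal sketch, assuming the etcd-operator Subscription above in the `openshift-operators` namespace; the pod name placeholder is hypothetical and must be substituted:

```shell
# List the operator pods created for the CSV.
oc get pods -n openshift-operators

# Dump the proxy-related env vars from a pod (substitute the real pod
# name). With the empty-value override in place, the cluster's global
# proxy values should no longer be injected.
oc get pod <etcd-operator-pod-name> -n openshift-operators -o yaml | grep -i proxy -A 2
```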
Hi Nick, Abu,

> I think this is just how rolling Deployments work; old pods aren't deleted until new pods are created (if that is the strategy the etcd-operator Deployment uses).

That makes sense, thanks!

> Empty proxy var in subscription config is special, this is a way for an admin to override cluster 'proxy' configuration. So the new pod will not have any proxy env var injected.

OK, thanks for the explanation. I updated steps 6 and 7 of https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-24566 per this comment; please have a look, thanks!

Abu, I created a subscription with an empty value, as below, but no new pods were generated for a long time. Is that expected? Reopening this first.

```shell
mac:~ jianzhang$ cat sub-etcd-42-proxy.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd-config-test
  namespace: openshift-operators
spec:
  config:
    env:
    - name: HTTP_PROXY
  channel: clusterwide-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.4-clusterwide
mac:~ jianzhang$ oc get sub -n openshift-operators
NAME               PACKAGE   SOURCE                CHANNEL
etcd-config-test   etcd      community-operators   clusterwide-alpha
mac:~ jianzhang$ oc get csv -n openshift-operators
NAME                              DISPLAY   VERSION             REPLACES                          PHASE
etcdoperator.v0.9.4-clusterwide   etcd      0.9.4-clusterwide   etcdoperator.v0.9.2-clusterwide   Succeeded
mac:~ jianzhang$ oc get pods -n openshift-operators
NAME                            READY   STATUS    RESTARTS   AGE
etcd-operator-9b67f8f96-j4nnn   3/3     Running   0          19m
mac:~ jianzhang$ oc get pods etcd-operator-9b67f8f96-j4nnn -n openshift-operators -o yaml | grep -i "proxy" -A 2
    - name: HTTP_PROXY
      value: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-17-77-137.us-east-2.compute.amazonaws.com:3129
    - name: HTTPS_PROXY
      value: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-17-77-137.us-east-2.compute.amazonaws.com:3129
    - name: NO_PROXY
      value: .cluster.local,.svc,.us-east-2.compute.internal,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.qe-jiazha3-proxy.qe.devcluster.openshift.com,api.qe-jiazha3-proxy.qe.devcluster.openshift.com,etcd-0.qe-jiazha3-proxy.qe.devcluster.openshift.com,etcd-1.qe-jiazha3-proxy.qe.devcluster.openshift.com,etcd-2.qe-jiazha3-proxy.qe.devcluster.openshift.com,localhost,test.no-proxy.com
    image: quay.io/coreos/etcd-operator@sha256:66a37fd61a06a43969854ee6d3e21087a98b93838e284a6086b13917f96b0d9b
    imagePullPolicy: IfNotPresent
[... the same HTTP_PROXY/HTTPS_PROXY/NO_PROXY values repeat for the pod's other two containers ...]
```
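One way to narrow down whether OLM ever pushed the override into the Deployment, rather than only inspecting the Pods, is to query the Deployment spec and rollout directly. This is a minimal sketch, assuming the `etcd-operator` Deployment name and namespace from the session above:

```shell
# Show the env section of the operator Deployment; if OLM had applied
# the Subscription override, HTTP_PROXY would appear here without a value.
oc get deployment etcd-operator -n openshift-operators \
    -o jsonpath='{.spec.template.spec.containers[0].env}'

# Check whether a new rollout was triggered at all.
oc rollout status deployment/etcd-operator -n openshift-operators
```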
Hi Jian, I tested on your cluster (with the global proxy). I executed the following steps:

1. Create a namespace `test`.
2. Create an OperatorGroup that targets the `test` namespace.
3. Create a Subscription with no config.
4. Wait for the pod to be in Running state; the global proxy env vars are injected.
5. Update the Subscription with a config.

YAML files:

```yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: test
  namespace: test
spec:
  targetNamespaces:
  - test
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd-config-test
  namespace: test
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
```

Update the Subscription as follows:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd-config-test
  namespace: test
spec:
  config:
    env:
    - name: HTTP_PROXY
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
```

I was able to reproduce the issue: after I applied the updated Subscription, the changes were not picked up. However, I was able to work around this issue by manually updating the `etcd-operator` Deployment spec (I added a new field to the annotations of `spec.template.metadata.annotations` of the Deployment). New pods came up without the global proxy env vars injected, and the new pods were healthy. But the new pod is injected with the empty proxy env var from the Subscription config:

```yaml
env:
- name: MY_POD_NAMESPACE
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: metadata.namespace
- name: MY_POD_NAME
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: metadata.name
- name: HTTP_PROXY
```

In summary, we have two issues:

- If a Subscription config is updated to override cluster proxy env variable(s), the change is not picked up by OLM.
- An empty proxy env var specified in the Subscription config gets injected into the generated Deployment object.

I don't think this is a release blocker, given that we have a (somewhat cumbersome) way to work around the issue. This can be addressed as a z-stream fix in 4.2, I believe. Moving to 4.3.

We understand the issue, but it is not critical enough to be considered a 4.2 release blocker. We will continue looking into this to deliver early in the 4.3 release timeframe and potentially ship it under 4.2.z.
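For reference, the manual workaround described above (forcing a rollout by touching the pod template annotations) can be expressed as a single patch. This is a sketch under the assumptions of the test above (the `etcd-operator` Deployment in the `test` namespace); the annotation key is hypothetical and arbitrary:

```shell
# Adding or changing any key under spec.template.metadata.annotations
# changes the pod template, so the Deployment controller rolls out new
# pods; per the comment above, the new pods then came up without the
# global proxy env vars injected.
oc patch deployment etcd-operator -n test --type=merge \
    -p '{"spec":{"template":{"metadata":{"annotations":{"proxy-override-workaround":"1"}}}}}'
```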
Hi Daniel,

> Our default behavior is that if the global proxy object is configured and the user sets one of them, OLM will do nothing when reconciling the deployment.

And, as described in that doc: "If the global proxy object is set and at least one of HTTPS_PROXY, HTTP_PROXY, NO_PROXY are set on the Subscription, then do nothing different. Global proxy config has been overridden by a user."

Sorry, I'm confused. Do you mean HTTPS_PROXY, HTTP_PROXY, and NO_PROXY are treated as a whole? When only HTTP_PROXY is set, does that also mean HTTPS_PROXY and NO_PROXY are set to none? What does "then do nothing different" mean? Based on step 5 above, the Deployment was changed when one of HTTPS_PROXY, HTTP_PROXY, NO_PROXY was set on the Subscription.

> The first deployment has HTTP_PROXY proxy set to none, it is ignored. When the variable is updated you expect OLM to recreate the deployment with the new env var value - I'm not entirely sure if this is the intended behavior. Could you confirm that enough time passed for an OLM sync cycle to occur (~15 minutes)?

See step "7. Recreate the subscription with the non-empty config." I recreated it, I did not update it, and I'm sure the Deployment had been deleted before recreating it. Yes, after waiting a little more time, it works. As follows:

```shell
mac:~ jianzhang$ oc create -f sub-tsb-44.yaml
subscription.operators.coreos.com/openshifttemplateservicebroker created
mac:~ jianzhang$ date
Fri Mar  6 14:02:56 CST 2020
mac:~ jianzhang$ cat sub-tsb-44.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshifttemplateservicebroker
  namespace: openshift-template-service-broker
spec:
  config:
    env:
    - name: HTTP_PROXY
      value: test_http
  channel: "4.4"
  installPlanApproval: Automatic
  name: openshifttemplateservicebroker
  source: qe-app-registry
  sourceNamespace: openshift-marketplace
mac:~ jianzhang$ date
Fri Mar  6 14:57:07 CST 2020
mac:~ jianzhang$ oc get deployment openshift-template-service-broker-operator -o json | jq '.spec.template.spec.containers[0].env'
[
  {
    "name": "IMAGE",
    "value": "image-registry.openshift-image-registry.svc:5000/openshift/ose-template-service-broker:v4.4.0"
  },
  {
    "name": "OPERATOR_NAME",
    "value": "openshift-template-service-broker-operator"
  },
  {
    "name": "POD_NAME",
    "valueFrom": {
      "fieldRef": {
        "apiVersion": "v1",
        "fieldPath": "metadata.name"
      }
    }
  },
  {
    "name": "WATCH_NAMESPACE",
    "valueFrom": {
      "fieldRef": {
        "apiVersion": "v1",
        "fieldPath": "metadata.namespace"
      }
    }
  },
  {
    "name": "HTTP_PROXY",
    "value": "test_http"
  }
]
```

> so why not delete the old subscription and csv and create the new one with the updated proxy?

I guess most customers update the proxy settings on the Subscription instead of recreating it; we cannot be sure that customers will recreate them rather than update them. Besides, recreating means the service must be interrupted, so it's not a good solution.
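For illustration, updating the proxy override on a live Subscription (rather than deleting and recreating it) can be done in place. This is a sketch assuming the Subscription name and namespace used earlier in this bug; note that a JSON merge patch replaces the whole `env` list:

```shell
# Patch the Subscription's config in place; OLM should reconcile the
# operator Deployment with the new value on its next sync cycle
# (~15 minutes, per the discussion above).
oc patch subscription etcd-config-test -n openshift-operators --type=merge \
    -p '{"spec":{"config":{"env":[{"name":"HTTP_PROXY","value":"test_http"}]}}}'
```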
Description of problem: Based on comment 22, I tested the bug again. Since the behavior happens as designed/documented according to comment 22, I think the bug can be changed to VERIFIED.

Version-Release number of selected component (if applicable): The cluster version is 4.5.0-0.nightly-2020-03-17-225152.

```shell
$ oc exec -n openshift-operator-lifecycle-manager catalog-operator-69b7cd5db9-mjnq7 -- olm --version
OLM version: 0.14.2
git commit: 3455a009647abeb4f1791b3539a9a660411b8895
```

Steps to Reproduce:

1. Create the cluster with the proxy enabled.
2. Subscribe to an operator without any proxy config.
3. Check the proxy of the pods. The cluster global proxy is injected into the pod:

```json
{
    "name": "HTTP_PROXY",
    "value": "http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-17-157-37.us-east-2.compute.amazonaws.com:3128"
},
{
    "name": "HTTPS_PROXY",
    "value": "http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-17-157-37.us-east-2.compute.amazonaws.com:3128"
},
{
    "name": "NO_PROXY",
    "value": ".cluster.local,.svc,.us-east-2.compute.internal,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.yhui-0318proxy.qe.devcluster.openshift.com,etcd-0.yhui-0318proxy.qe.devcluster.openshift.com,etcd-1.yhui-0318proxy.qe.devcluster.openshift.com,etcd-2.yhui-0318proxy.qe.devcluster.openshift.com,localhost,test.no-proxy.com"
}
```

4. Update the subscription with an empty proxy config:

```yaml
config:
  env:
  - name: HTTP_PROXY
```

5. Check the proxy of the pods. The empty proxy config is propagated to the pod:

```json
{
    "name": "HTTP_PROXY"
}
```

6. Delete the sub and csv.
7. Create a subscription with the empty proxy config below:

```yaml
config:
  env:
  - name: HTTP_PROXY
```

8. Check the proxy of the pods after about 10 minutes; the empty proxy config is injected into the pod:

```json
{
    "name": "HTTP_PROXY"
}
```

9. Delete the sub and csv.
10. Create a subscription with the non-empty proxy config below:

```yaml
config:
  env:
  - name: HTTP_PROXY
    value: test_http
```

11. Check the proxy of the pods after about 10 minutes; the non-empty proxy config is injected into the pod:

```json
{
    "name": "HTTP_PROXY",
    "value": "test_http"
}
```

Since the results are as designed based on comment 22, I am changing the status to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581