Description of problem:
As described in [1], openshift_{http,https,no}_proxy set the proxy for the master and docker only. There is no way to add proxy variables to the node's /etc/sysconfig/atomic-openshift-node.

Version-Release number of selected component (if applicable):
atomic-openshift-3.2.1.13-1.git.0.e438b0e.el7.x86_64
atomic-openshift-utils-3.2.24-1.git.0.337259b.el7.noarch

Actual results:
There is no way to set proxy variables in /etc/sysconfig/atomic-openshift-node via the Ansible installer.

Expected results:
A new variable openshift_node_http_proxy (or the existing openshift_http_proxy) should set HTTP_PROXY for the node service.

Additional info:
Without setting the variables, the installation fails if there is a proxy between the OpenStack API and the OpenShift node, e.g.:

Sep 07 02:47:24 knakayam-ose32-single-master.os1.phx2.redhat.com atomic-openshift-node[113143]: F0907 02:47:24.304683 113143 start_node.go:124] could not init cloud provider "openstack": Post http://foo.openstack.redhat.com:5000/v2.0/tokens: http: error connecting to proxy 10.0.1.1:8080/: dial tcp 10.0.1.1:8080: i/o timeout

[1] https://docs.openshift.com/enterprise/3.2/install_config/install/advanced_install.html#advanced-install-configuring-global-proxy
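For context, the global proxy settings in the advanced installer are driven by inventory variables like the following. This is a minimal hypothetical fragment (the proxy address reuses the 10.0.1.1:8080 value from the log above; the [OSEv3:vars] placement is the usual convention, not taken from this report):

```ini
# Hypothetical [OSEv3:vars] fragment for an advanced-install inventory.
# The fix tracked in this bug makes these values propagate to
# /etc/sysconfig/atomic-openshift-node in addition to master and docker.
[OSEv3:vars]
openshift_http_proxy=http://10.0.1.1:8080
openshift_https_proxy=http://10.0.1.1:8080
openshift_no_proxy=.cluster.local,169.254.169.254
```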
*** Bug 1375271 has been marked as a duplicate of this bug. ***
*** Bug 1375414 has been marked as a duplicate of this bug. ***
Created attachment 1200567 [details]
Ansible install inventory file used to install

The proxy server (squid) is running on proxy.rhdemo.net.
The fix included in this build applies proxy configuration settings to the node services as well. The user will need to properly determine the correct http_proxy, https_proxy, and no_proxy values to define for their environment whether that's AWS, GCE, or OpenStack.
1. Reproduced with openshift-ansible-3.2.28-1.git.0.5a85fc5.el7.noarch.rpm

Installation failed at TASK [openshift_node : Start and enable node again]

# cat /var/log/messages
<--snip-->
Sep 13 23:36:22 qe-ghuang-master-nfs-1 atomic-openshift-node: F0913 23:36:22.078780 4870 start_node.go:124] could not init cloud provider "openstack": Post http://xxxxxx.redhat.com:5000/v2.0/tokens: dial tcp: lookup xxxxxx.redhat.com: Temporary failure in name resolution
<--snip-->

2. Then tested against openshift-ansible-3.2.29-1.git.0.2b76696.el7.noarch.rpm

Installation succeeded. Will test again once it is pushed to a new puddle.
1. Check the "NO_PROXY" variable after installation:

# grep NO_PROXY /etc/sysconfig/atomic-openshift-node
NO_PROXY=.cluster.local,169.254.169.254,qe-ghuang-preserve-master,qe-ghuang-preserve-node

2. The docker registry failed to deploy and was stuck in "CrashLoopBackOff":

# oc describe po docker-registry-1-jyqvx
<--snip-->
1m 1m 1 {kubelet qe-ghuang-preserve-node} spec.containers{registry} Normal Started Started container with docker id 33eda6dc7a63
1m 1m 1 {kubelet qe-ghuang-preserve-node} spec.containers{registry} Normal Created Created container with docker id 33eda6dc7a63
1m 1m 1 {kubelet qe-ghuang-preserve-node} spec.containers{registry} Normal Killing Killing container with docker id 33eda6dc7a63: pod "docker-registry-1-jyqvx_default(47555aad-7d70-11e6-b5e8-fa163ea980d4)" container "registry" is unhealthy, it will be killed and re-created.
6m 1m 16 {kubelet qe-ghuang-preserve-node} spec.containers{registry} Warning Unhealthy Readiness probe failed: HTTP probe failed with statuscode: 503
6m 1m 9 {kubelet qe-ghuang-preserve-node} spec.containers{registry} Warning Unhealthy Liveness probe failed: HTTP probe failed with statuscode: 503
1m 14s 8 {kubelet qe-ghuang-preserve-node} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "registry" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=registry pod=docker-registry-1-jyqvx_default(47555aad-7d70-11e6-b5e8-fa163ea980d4)"
4m 14s 22 {kubelet qe-ghuang-preserve-node} spec.containers{registry} Warning BackOff Back-off restarting failed docker container
<--snip-->

3. Add the cluster network and pod network to NO_PROXY like this:

# grep NO_PROXY /etc/sysconfig/atomic-openshift-node
NO_PROXY=.cluster.local,169.254.169.254,qe-ghuang-preserve-master,qe-ghuang-preserve-node,172.30.0.0/16,10.1.0.0/16

4. Re-deployed docker-registry and it succeeded.

@Scott, from the testing above, it looks like we also need to add the cluster network and pod network to NO_PROXY.
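Step 3 above (appending the service and pod networks to the node's NO_PROXY) can be sketched as a small shell helper. This is illustrative only, not installer code, and it assumes GNU sed for in-place editing:

```shell
# append_no_proxy FILE EXTRA
# Appends EXTRA (comma-separated CIDRs) to the NO_PROXY= line of FILE,
# doing nothing if EXTRA is already present, so the edit is idempotent.
append_no_proxy() {
    file=$1
    extra=$2
    if ! grep -q "^NO_PROXY=.*$extra" "$file"; then
        sed -i "s|^NO_PROXY=.*|&,$extra|" "$file"
    fi
}

# e.g.: append_no_proxy /etc/sysconfig/atomic-openshift-node "172.30.0.0/16,10.1.0.0/16"
# followed by a restart of the atomic-openshift-node service.
```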
(In reply to Gan Huang from comment #18)
> 1, Check the variable of "NO_PROXY" after installation:
> # grep NO_PROXY /etc/sysconfig/atomic-openshift-node
> NO_PROXY=.cluster.local,169.254.169.254,qe-ghuang-preserve-master,qe-ghuang-
> preserve-node
>
> 2, Docker registry failed to deploy and was in status "CrashLoopBackOff"
> # oc describe po docker-registry-1-jyqvx
>
> <--snip-->
>
> @Scott, from the testing above, looks like we also need add the cluster
> network and pod network into NO_PROXY.

Here's the NO_PROXY on a working system for an atomic-openshift-node:

NO_PROXY=.cluster.local,.rhdemo.net,proxy-master1.rhdemo.net,proxy-node1.rhdemo.net,proxy-node2.rhdemo.net,proxy-node3.rhdemo.net,172.30.0.0/16,10.1.0.0/16

I'm not sure each node should be in NO_PROXY, but that seems to be what Ansible does. Only node-to-master traffic would be an issue here, and that's for a different ticket.

The docker registry service IP also needs to be part of NO_PROXY in the docker daemon's /etc/sysconfig/docker. This means that once oadm route has been run, the service IP needs to be injected into the running docker daemon on all nodes, and then the daemon needs to be restarted. If this is skipped, builds will fail when they try to push images to the registry.
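The working value above is just the cluster domains, the per-host names, the service network, and the pod network joined with commas. A trivial sketch of that assembly (the helper name is made up for illustration; the entries are the ones quoted in this comment):

```shell
# join_no_proxy ENTRY...
# Joins all of its arguments with commas into a NO_PROXY-style value.
join_no_proxy() {
    old_ifs=$IFS
    IFS=,
    printf '%s\n' "$*"
    IFS=$old_ifs
}

NO_PROXY=$(join_no_proxy .cluster.local .rhdemo.net \
    proxy-master1.rhdemo.net proxy-node1.rhdemo.net \
    172.30.0.0/16 10.1.0.0/16)
```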
Customer tried the install packages. This may be related to the issues already presented, but the installer seems to stall out on cluster facts when using the brew packages. Posting details in pc
Gan, thanks. Added the kube service and cluster CIDRs to /etc/sysconfig/atomic-openshift-node.

https://github.com/openshift/openshift-ansible/pull/2466

Backported to the 3.2 and 3.3 installers.
Tested against openshift-ansible-3.2.30-1.

Installation failed at:

TASK [openshift_node : Configure Proxy Settings] *******************************
[DEPRECATION WARNING]: Skipping task due to undefined Error, in the future this will be a fatal error.: 'dict object' has no attribute 'master'. This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
fatal: [qe-ghuang-preserve-node]: FAILED! => {"failed": true, "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'item' is undefined\n\nThe error appears to have been in '/root/openshift-ansible/roles/openshift_node/tasks/systemd_units.yml': line 51, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Configure Proxy Settings\n ^ here\n"}
changed: [qe-ghuang-preserve-master] => (item={u'regex': u'^HTTP_PROXY=', u'line': u'HTTP_PROXY=http://192.168.1.84:3128'}) => {"backup": "", "changed": true, "item": {"line": "HTTP_PROXY=http://192.168.1.84:3128", "regex": "^HTTP_PROXY="}, "msg": "line added"}
changed: [qe-ghuang-preserve-master] => (item={u'regex': u'^HTTPS_PROXY=', u'line': u'HTTPS_PROXY=http://192.168.1.84:3128'}) => {"backup": "", "changed": true, "item": {"line": "HTTPS_PROXY=http://192.168.1.84:3128", "regex": "^HTTPS_PROXY="}, "msg": "line added"}
changed: [qe-ghuang-preserve-master] => (item={u'regex': u'^NO_PROXY=', u'line': u'NO_PROXY=.cluster.local,169.254.169.254,qe-ghuang-preserve-master,qe-ghuang-preserve-node,172.30.0.0/16,10.1.0.0/16'}) => {"backup": "", "changed": true, "item": {"line": "NO_PROXY=.cluster.local,169.254.169.254,qe-ghuang-preserve-master,qe-ghuang-preserve-node,172.30.0.0/16,10.1.0.0/16", "regex": "^NO_PROXY="}, "msg": "line added"}

Looks good to me when I modify the file like this (hopefully it's useful to you):

diff --git a/roles/openshift_node/tasks/systemd_units.yml b/roles/openshift_node/tasks/systemd_units.yml
index c20eed8..f8d6929 100644
--- a/roles/openshift_node/tasks/systemd_units.yml
+++ b/roles/openshift_node/tasks/systemd_units.yml
@@ -60,7 +60,7 @@
     - regex: '^HTTPS_PROXY='
       line: "HTTPS_PROXY={{ openshift.common.https_proxy }}"
     - regex: '^NO_PROXY='
-      line: "NO_PROXY={{ openshift.common.no_proxy | join(',') }},{{ openshift.common.portal_net }},{{ openshift.master.sdn_cluster_network_cidr }}"
+      line: "NO_PROXY={{ openshift.common.no_proxy | join(',') }},{{ hostvars[groups.oo_first_master.0].openshift.common.portal_net }},{{ hostvars[groups.oo_first_master.0]
   when: "{{ openshift.common.http_proxy is defined and openshift.common.http_proxy != '' }}"
   notify:
   - restart node
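Put together, the corrected task would look roughly like this. This is a sketch, not the merged code: the module invocation and file path are assumptions, and the truncated hostvars expression in the quoted diff is completed on the assumption that the sdn_cluster_network_cidr lookup mirrors the portal_net one:

```yaml
# Sketch of the corrected "Configure Proxy Settings" task. Master facts
# are resolved via hostvars on the first master, so the task also works
# on hosts that are not masters (the failure seen on the node above).
- name: Configure Proxy Settings
  lineinfile:
    dest: /etc/sysconfig/{{ openshift.common.service_type }}-node
    regexp: "{{ item.regex }}"
    line: "{{ item.line }}"
  with_items:
    - regex: '^HTTP_PROXY='
      line: "HTTP_PROXY={{ openshift.common.http_proxy }}"
    - regex: '^HTTPS_PROXY='
      line: "HTTPS_PROXY={{ openshift.common.https_proxy }}"
    - regex: '^NO_PROXY='
      line: "NO_PROXY={{ openshift.common.no_proxy | join(',') }},{{ hostvars[groups.oo_first_master.0].openshift.common.portal_net }},{{ hostvars[groups.oo_first_master.0].openshift.master.sdn_cluster_network_cidr }}"
  when: openshift.common.http_proxy is defined and openshift.common.http_proxy != ''
  notify:
    - restart node
```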
(In reply to Gan Huang from comment #24) > Looks good to me when I modify the file like this(hopefully it's useful to > you): > - line: "NO_PROXY={{ openshift.common.no_proxy | join(',') }},{{ > openshift.common.portal_net }},{{ openshift.master.sdn_cluster_network_cidr > }}" > + line: "NO_PROXY={{ openshift.common.no_proxy | join(',') }},{{ > hostvars[groups.oo_first_master.0].openshift.common.portal_net }},{{ > hostvars[groups.oo_first_master.0] Thanks, my test inventory had both nodes as masters so I missed this. Fixing it.
Verified with openshift-ansible-3.2.31-1.git.0.203df76.el7.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1984