Bug 1432020

Summary: Advanced installation (advanced installation cookbook) fails with no_proxy settings
Product: OpenShift Container Platform
Reporter: Vítor Corrêa <vcorrea>
Component: Installer
Assignee: Vadim Rutkovsky <vrutkovs>
Status: CLOSED CURRENTRELEASE
QA Contact: Johnny Liu <jialiu>
Severity: high
Docs Contact:
Priority: high
Version: 3.4.0
CC: aos-bugs, bleanhar, jkaur, jokerman, mmccomas, myllynen, vrutkovs, wmeng, wsun
Target Milestone: ---
Keywords: TestBlocker
Target Release: 3.10.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-08-27 18:15:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vítor Corrêa 2017-03-14 10:45:08 UTC
Description of problem:

When using the following settings in the Ansible inventory file with the advanced installation cookbook, the advanced installation fails while trying to start the atomic-openshift-node.service:

osm_default_subdomain=subdomain.domain
osm_cluster_network_cidr=10.184.0.0/16
osm_host_subnet_length=8

openshift_master_cluster_hostname=cluster.domain
openshift_master_cluster_public_hostname=cluster.domain

openshift_http_proxy=http://proxy.domain:8080/
openshift_https_proxy=https://proxy.domain:8080/
openshift_no_proxy=subdomain.domain
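
For context, the advanced installation is normally driven by the playbook shipped in the openshift-ansible RPM; a sketch of the invocation, assuming the default RPM path for this release and an example inventory location:

# ansible-playbook -i /etc/ansible/hosts /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml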


ERROR: cannot fetch "default" cluster network https://cluster.domain:8443/oapi/v1/clusternetworks/default: Service Unavailable


Node config file /etc/sysconfig/atomic-openshift-node

HTTP_PROXY=http://proxy.domain:8080/
HTTPS_PROXY=https://proxy.domain:8080/
NO_PROXY=.cluster.local,subdomain.domain,master1.domain,master2.domain,172.30.0.0/16,10.184.0.0/16

This was working on OCP 3.3, but for OCP 3.4 we must specify:
openshift_no_proxy=subdomain.domain,cluster.domain
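
That is, on 3.4 the proxy section of the inventory has to look roughly like this (same example hostnames as above) for the node service to start:

openshift_http_proxy=http://proxy.domain:8080/
openshift_https_proxy=https://proxy.domain:8080/
openshift_no_proxy=subdomain.domain,cluster.domain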


Wouldn't it make sense to add openshift_master_cluster_hostname to the automatically augmented no_proxy list, as described in https://docs.openshift.com/container-platform/3.4/install_config/install/advanced_install.html#advanced-install-configuring-global-proxy?
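
If the installer augmented the list automatically, the generated node config shown above would be expected to include the cluster hostname, something like the following (hypothetical value, based on the file shown earlier):

NO_PROXY=.cluster.local,subdomain.domain,cluster.domain,master1.domain,master2.domain,172.30.0.0/16,10.184.0.0/16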

Version-Release number of selected component (if applicable):
3.4

Actual results:
ERROR: cannot fetch "default" cluster network https://cluster.domain:8443/oapi/v1/clusternetworks/default: Service Unavailable

Expected results:
successful installation

Comment 1 Marko Myllynen 2017-04-06 12:28:17 UTC
(In reply to Vítor Corrêa from comment #0)
> 
> openshift_master_cluster_hostname=cluster.domain
> openshift_master_cluster_public_hostname=cluster.domain
> 
> This was working on OCP 3.3, but for OCP 3.4 we must specify:
> openshift_no_proxy=subdomain.domain,cluster.domain
> 
> Wouldn't it make sense to add openshift_master_cluster_hostname to the
> automatically augmented no_proxy list, as described in
> https://docs.openshift.com/container-platform/3.4/install_config/install/
> advanced_install.html#advanced-install-configuring-global-proxy

I think so, too, yes. FWIW, somewhat related: https://bugzilla.redhat.com/show_bug.cgi?id=1414749. Thanks.

Comment 5 Gan Huang 2018-01-25 09:15:33 UTC
Tested with the master branch; there is no fix for this yet.

openshift_master_cluster_hostname is not added to the no_proxy list.

Comment 6 Scott Dodson 2018-04-18 18:04:01 UTC
*** Bug 1568694 has been marked as a duplicate of this bug. ***

Comment 7 Scott Dodson 2018-04-18 18:04:28 UTC
*** Bug 1462652 has been marked as a duplicate of this bug. ***

Comment 8 Scott Dodson 2018-04-18 18:05:02 UTC
The two bugs I just duped against this both occur because openshift_master_cluster_hostname is not added to the no_proxy list by default. We need to fix that.
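
A quick way to confirm the gap on an affected host is to inspect the generated sysconfig file from the description; with the hostnames used there, the cluster hostname is missing from the value:

# grep NO_PROXY /etc/sysconfig/atomic-openshift-node
NO_PROXY=.cluster.local,subdomain.domain,master1.domain,master2.domain,172.30.0.0/16,10.184.0.0/16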

Comment 9 Vadim Rutkovsky 2018-06-18 12:03:22 UTC
Created PR for master (3.11) - https://github.com/openshift/openshift-ansible/pull/8809

Comment 15 Gan Huang 2018-06-20 01:56:08 UTC
(In reply to Vadim Rutkovsky from comment #14)

> I think it's being added correctly; it uses the actual cluster hostname -
> qe-wmeng310proxy3-nrr-1 - when adding it to a noproxy list.

Please see the attachment in comment 12:

openshift_master_cluster_hostname=qe-wmeng310proxy3-lb-nfs-1

> 
> In 3.10 internal hostnames are very important, these should be the same as
> hostnames defined in inventory. Could you make sure these match and try
> again?
> 

I don't think it matters. If the internal hostname is set correctly on the nodes (i.e. it resolves to the internal IP), we don't have to specify `openshift_hostname`.

Comment 16 Vadim Rutkovsky 2018-06-20 09:52:57 UTC
Created a PR which fixes the previous one - https://github.com/openshift/openshift-ansible/pull/8863. This should append openshift_master_cluster_hostname correctly.

Comment 17 Scott Dodson 2018-06-20 12:06:21 UTC
https://github.com/openshift/openshift-ansible/pull/8865 release-3.10 pick

Comment 18 Wei Sun 2018-06-21 03:32:05 UTC
The PR was merged in v3.10.2-1; please check it.

Comment 19 Gan Huang 2018-06-25 09:54:12 UTC
Verified in openshift-ansible-3.10.7-1.git.220.50204c4.el7.noarch.rpm

While "openshift_master_cluster_hostname=ghuang-bug-lb-nfs-1" specified in inventory file, the hostname now could be added into NO_PROXY list correctly.

# grep "ghuang-bug-lb-nfs-1" /etc/origin/master/master.env
NO_PROXY=.centralci.eng.rdu2.redhat.com,.cluster.local,.lab.sjc.redhat.com,.svc,10.14.89.4,169.254.169.254,172.16.120.31,172.16.120.61,172.16.120.79,172.31.0.1,ghuang-bug-lb-nfs-1,ghuang-bug-master-etcd-1,ghuang-bug-master-etcd-2,ghuang-bug-master-etcd-3,ghuang-bug-node-1,ghuang-bug-node-2,ghuang-bug-node-registry-router-1,172.31.0.0/16,10.2.0.0/16

However, the installation didn't complete; it failed at the task "Wait for all control plane pods to become ready". Tracking in Bug 1594726.