
Bug 1541222

Summary:	Unnecessary docker_image_availability performed on non-containerized etcd nodes
Product:	OpenShift Container Platform	Reporter:	Takayoshi Kimura <tkimura>
Component:	Installer	Assignee:	Luke Meyer <lmeyer>
Status:	CLOSED WORKSFORME	QA Contact:	Gan Huang <ghuang>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	3.7.0	CC:	aos-bugs, asathe, jokerman, mmccomas, tkimura, wmeng
Target Milestone:	---	Keywords:	NeedsTestCase
Target Release:	3.10.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-02-19 07:32:40 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1499358
Bug Blocks:

Description Takayoshi Kimura 2018-02-02 03:09:18 UTC

Description of problem:

In a proxy environment, the docker_image_availability check failed because of another bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1499358

After setting the proxy for Docker registry access on all hosts other than the etcd hosts, the check still failed with the same issue. It turns out that the etcd hosts also need proxy settings for the Docker registry. This is NOT a containerized install, so this failure doesn't make sense.

Version-Release number of the following components:

$ rpm -q openshift-ansible
openshift-ansible-3.7.14-1.git.0.4b35b2d.el7.noarch
$ rpm -q ansible
ansible-2.4.1.0-1.el7.noarch
$ ansible --version
ansible 2.4.1.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/nekop/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, May  3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]

How reproducible:

Always

Steps to Reproduce:
1.
2.
3.

Actual results:

docker_image_availability is performed on non-Docker hosts such as etcd

Expected results:

docker_image_availability is not performed on non-Docker hosts such as etcd


Additional info:

Comment 1 Takayoshi Kimura 2018-02-02 03:10:44 UTC

As a workaround, set the proxy for the Docker registry on the etcd hosts as well, or disable the docker_image_availability check.

Comment 2 Ameya Sathe 2018-02-06 13:38:24 UTC

(In reply to Takayoshi Kimura from comment #1)
> For workaround, set proxy for docker registry on etcd hosts as well, or
> disable docker_image_availability check.

Setting the proxy for docker registry on etcd hosts results in 
" STDERR:
Error:  client: etcd cluster is unavailable or misconfigured; error #0: Forbidden
error #0: Forbidden
MSG:
non-zero return code
"
The etcd service was running fine. The error reported was "Invalid TLS certificate. retry".

Hence, the only option is to disable the docker_image_availability check for the complete duration of the upgrade.

Comment 3 Takayoshi Kimura 2018-02-07 00:35:58 UTC

Ameya, it sounds like you didn't configure NO_PROXY for the etcd hosts' IP addresses when you configured the proxy.
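
To illustrate the NO_PROXY point, here is a minimal sketch of assembling a NO_PROXY value that keeps etcd traffic off the proxy. The helper name `build_no_proxy` and the etcd addresses are hypothetical and not taken from this case; substitute the real values from the inventory.

```python
def build_no_proxy(etcd_hosts, extra=("localhost", "127.0.0.1")):
    """Return a comma-separated NO_PROXY value covering the etcd hosts.

    Entries keep their order, and blanks and duplicates are dropped.
    """
    seen = []
    for entry in list(extra) + list(etcd_hosts):
        entry = entry.strip()
        if entry and entry not in seen:
            seen.append(entry)
    return ",".join(seen)


# Hypothetical etcd addresses (documentation range), not from this bug.
etcd_hosts = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
no_proxy = build_no_proxy(etcd_hosts)
print(no_proxy)
```

The resulting string would be set as NO_PROXY wherever the proxy variables are configured, so that etcd clients connect to the etcd members directly instead of through the proxy.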

Comment 4 Scott Dodson 2018-02-07 13:30:54 UTC

Luke,

I think this should be fixed by not running those checks on non-containerized etcd hosts. Let me know if you think that's a bad idea or impossible to implement.

Comment 5 Luke Meyer 2018-02-07 21:09:42 UTC

> I think this should be fixed by not running those checks on non
> containerized etcd hosts.

I agree; in fact I think it already works that way.

docker_image_availability is already being fixed to go through proxies if specified, so that side of it should be taken care of as we ship those fixes. We certainly have had problems with setting proxies globally on etcd hosts and then having etcd try to sync via the proxy, so that's not the right approach on an etcd host.

Proxies aside, this bug seems to indicate that a standalone, non-containerized etcd host is failing on this check. Looking at the code, that should not be happening. Looking at the case, the initial error message from when the proxy was missing is given, but not the error message from the etcd host after that was resolved in the rest of the cluster. Also, I do not see how containerized/not-containerized is configured. If "containerized=true" is set globally, then standalone etcd hosts will be considered containerized too. I can't tell if that's what happened here, but I would like to see the inventory/variables and error output to make sure we understand exactly what is happening.
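
The effect of a globally set "containerized=true" can be sketched as follows. This is a simplified illustration under assumed rules, not the actual openshift-ansible check code: the function name `needs_image_check` and the group names are hypothetical.

```python
def needs_image_check(groups, containerized):
    """Hypothetical rule: check images only on hosts that need Docker images.

    groups: inventory groups the host belongs to (assumed names).
    containerized: the host's effective containerized setting, which may
    be inherited from a global variable.
    """
    runs_containers = bool({"masters", "nodes"} & set(groups))
    return runs_containers or containerized


# Standalone RPM-based etcd host: the check is skipped.
print(needs_image_check(["etcd"], containerized=False))

# Same host, but containerized=true inherited from a global setting:
# the host is treated as containerized, so the check runs.
print(needs_image_check(["etcd"], containerized=True))
```

Under these assumptions, a standalone etcd host is only checked when it inherits the containerized flag, which is why the inventory and variables are needed to tell what happened.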

Comment 6 Takayoshi Kimura 2018-02-13 00:38:09 UTC

The etcd hosts don't have a proxy specified because it's not needed; the RPM repos are available on the internal network. This is a proxy environment with a non-containerized install.

Comment 7 Luke Meyer 2018-02-13 13:50:48 UTC

(In reply to Takayoshi Kimura from comment #6)
> This is proxy env, non-containerized install.

I understand; that's why I'd like to see the ansible inventory and most recent output from the check failure to help figure out why the check is behaving as if etcd is supposed to be containerized. As far as I can see, the code should be skipping the check for a standalone non-containerized etcd.

Comment 8 Takayoshi Kimura 2018-02-19 07:32:40 UTC

Thanks Luke.

You're right. We got the full Ansible log and confirmed that there is no error with the docker_image_availability check on the etcd hosts. The failure was coming from a different place: the proxy environment variable configuration for the docker_image_availability proxy issue https://bugzilla.redhat.com/show_bug.cgi?id=1541222 was missing the no_proxy setting. Closing.