Bug 1391805
Summary: | backup etcd failed when upgrade openshift 3.2 | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Anping Li <anli>
Component: | Cluster Version Operator | Assignee: | Jason DeTiberus <jdetiber>
Status: | CLOSED DUPLICATE | QA Contact: | Anping Li <anli>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 3.2.1 | CC: | anli, aos-bugs, dgoodwin, jokerman, mmccomas, tobias.genannt
Target Milestone: | --- | Keywords: | Reopened
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2016-11-08 12:33:25 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |

Description
Anping Li
2016-11-04 05:26:00 UTC
Hit the same issue when upgrading OpenShift 3.2 with external etcd.

Created attachment 1217294 [details]
Upgrade logs

I believe this is a known issue with Ansible and hosts with long hostnames. For example, we have to work around this when using AWS by setting this parameter in /etc/ansible/ansible.cfg:

    control_path = %(directory)s/ansible-ssh-%%C

More information is available here: http://docs.ansible.com/ansible/intro_configuration.html#control-path

It seems the control path setting doesn't work, and I didn't use a long hostname or home directory. The socket names seem to be less than 108 characters.

    ansible-2.2.0.0-0.100.el7.noarch
    openshift-ansible-3.2.37-1.git.0.8f013d0.el7.noarch

    PLAY [Backup etcd] *************************************************************
    TASK [setup] *******************************************************************
    Using module file /usr/lib/python2.7/site-packages/ansible/modules/core/system/setup.py
    <groups.oo_etcd_to_config if groups.oo_etcd_to_config is defined and groups.oo_etcd_to_config | length > 0 else groups.oo_first_master> ESTABLISH SSH CONNECTION FOR USER: None
    <groups.oo_etcd_to_config if groups.oo_etcd_to_config is defined and groups.oo_etcd_to_config | length > 0 else groups.oo_first_master> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r 'groups.oo_etcd_to_config if groups.oo_etcd_to_config is defined and groups.oo_etcd_to_config | length > 0 else groups.oo_first_master' '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1478505418.18-152149785284243 `" && echo ansible-tmp-1478505418.18-152149785284243="` echo $HOME/.ansible/tmp/ansible-tmp-1478505418.18-152149785284243 `" ) && sleep 0'"'"''
    fatal: [groups.oo_etcd_to_config if groups.oo_etcd_to_config is defined and groups.oo_etcd_to_config | length > 0 else groups.oo_first_master]: UNREACHABLE! => {
        "changed": false,
        "msg": "Failed to connect to the host via ssh: OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 57: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\nControlPath too long\r\n",
        "unreachable": true
    }
            to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_2/upgrade.retry

    PLAY RECAP *********************************************************************
    groups.oo_etcd_to_config if groups.oo_etcd_to_config is defined and groups.oo_etcd_to_config | length > 0 else groups.oo_first_master : ok=0 changed=0 unreachable=1 failed=0
    localhost : ok=13 changed=8 unreachable=0 failed=0
    openshift-190.lab.eng.nay.redhat.com : ok=86 changed=1 unreachable=0 failed=0

In your previous comment we can see that the control path fix is not in effect: "ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r". It should be using "control_path = %(directory)s/%%h-%%r" per the link above. Also note that it must be in the [ssh_connection] section of ansible.cfg, and it may be ignored if you are using custom ssh_args. Please attach /etc/ansible/ansible.cfg if the problem persists.

You may also be able to set it on the CLI with the ANSIBLE_SSH_CONTROL_PATH environment variable, for example ANSIBLE_SSH_CONTROL_PATH=/root/.ansible/cp/%%h-%%r.

This looks to have surfaced with a customer, and the other Bugzilla has caught something we had not noticed yet. Closing this one as a duplicate; let's resume in 1392169.

*** This bug has been marked as a duplicate of bug 1392169 ***

So, depending on the generated hostname, /root/.ansible/cp/%%h-%%r could still be too long. Switching to something like /tmp/cp/%%h-%%r could solve the problem, as could using shorter hostnames.
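
For anyone who wants to check whether a given control path can fit, here is a rough Python sketch that expands the %h, %p and %r tokens and compares the result against the 108-character figure mentioned in this thread. It is only an estimate under stated assumptions: the helper names are made up for illustration, port 22 and user root are guesses (the log does not show them), and the limit is taken from the discussion here rather than read from OpenSSH. The two host strings are copied from the log, where the string handed to ssh as the host appears to be the literal inventory expression rather than a resolved hostname.

```python
import re

# Limit quoted in the thread above. Treated as an assumption about the
# platform's Unix socket path buffer, not a value read from OpenSSH.
SOCKET_PATH_LIMIT = 108


def expand_control_path(template, host, port, user):
    """Expand the %h, %p and %r tokens of an ssh-level ControlPath template.

    Only these three tokens are handled; others such as %C are left untouched.
    """
    values = {"h": host, "p": str(port), "r": user}
    return re.sub(r"%([hpr])", lambda m: values[m.group(1)], template)


def fits_socket_limit(path, limit=SOCKET_PATH_LIMIT):
    """Return True if the path plus a terminating NUL byte fits in limit bytes."""
    return len(path.encode("utf-8")) < limit


# ControlPath template exactly as it appears in the SSH command in the log.
TEMPLATE = "/root/.ansible/cp/ansible-ssh-%h-%p-%r"

# Host strings copied from the log: the resolved host from the play recap,
# and the literal expression that was handed to ssh as the host name.
RESOLVED_HOST = "openshift-190.lab.eng.nay.redhat.com"
UNRESOLVED_HOST = (
    "groups.oo_etcd_to_config if groups.oo_etcd_to_config is defined "
    "and groups.oo_etcd_to_config | length > 0 else groups.oo_first_master"
)

if __name__ == "__main__":
    # Port 22 and user "root" are assumptions; the log does not state them.
    for host in (RESOLVED_HOST, UNRESOLVED_HOST):
        path = expand_control_path(TEMPLATE, host, 22, "root")
        print(len(path), fits_socket_limit(path))
```

Under these assumptions the path built from the real hostname stays under the limit, while the one built from the unexpanded expression does not, which would be consistent with both the reporter's observation about short hostnames and the "ControlPath too long" error. The same check can be run against a shorter template such as /tmp/cp/%h-%r.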