Description of problem:
An embedded-to-external etcd migration was run against an RPM-based, non-HA OCP v3.6 cluster with embedded etcd. The migrate playbook runs successfully, but after the migration OCP does not work. For example:

1) atomic-openshift-master.service restarts in a loop

# systemctl status atomic-openshift-master.service
● atomic-openshift-master.service - Atomic OpenShift Master
   Loaded: loaded (/etc/systemd/system/atomic-openshift-master.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Fri 2017-10-20 05:57:35 EDT; 3s ago
     Docs: https://github.com/openshift/origin
  Process: 49089 ExecStart=/usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS (code=exited, status=255)
 Main PID: 49089 (code=exited, status=255)

Oct 20 05:57:35 x-embed-master-nfs-1 systemd[1]: atomic-openshift-master.service: main process exited, code=exited, status=255/n/a
Oct 20 05:57:35 x-embed-master-nfs-1 systemd[1]: Failed to start Atomic OpenShift Master.
Oct 20 05:57:35 x-embed-master-nfs-1 systemd[1]: Unit atomic-openshift-master.service entered failed state.
Oct 20 05:57:35 x-embed-master-nfs-1 systemd[1]: atomic-openshift-master.service failed.

2) "oc get" cannot get any data

# oc get node
The connection to the server x-embed-master-nfs-1:8443 was refused - did you specify the right host or port?

===============================
The master log shows that the master tries to connect to itself (10.240.0.49) instead of the new etcd host (10.240.0.56):

getsockopt: connection refused"; Reconnecting to {10.240.0.49:2379 <nil>}

# cat /etc/etcd/etcd.conf | grep LISTEN
ETCD_LISTEN_PEER_URLS=https://10.240.0.56:2380
ETCD_LISTEN_CLIENT_URLS=https://10.240.0.56:2379

Version-Release number of the following components:
openshift-ansible-3.7.0-0.167.0.git.0.0e34535.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Install v3.6 ocp with embedded etcd
2. Prepare repos on a new host (just install docker on it)
3. Edit hosts file to add etcd group
[OSEv3:children]
...
etcd
...
[etcd]
hostname...    //Specify a new host for etcd.
4. Do etcd migrate
# ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-etcd/embedded2external.yml

Actual results:
OCP does not work after the migration.

Expected results:
OCP should work well after the migration.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
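The mismatch reported above (the master dialing 10.240.0.49:2379 while etcd listens on 10.240.0.56:2379) can be checked mechanically. The following is only a minimal sketch, not part of openshift-ansible: the function names are hypothetical, and it assumes the master log contains "Reconnecting to {host:port ...}" lines in the form quoted in this report and that etcd.conf uses plain KEY=value lines.

```python
import re
from urllib.parse import urlparse

# Matches the "Reconnecting to {10.240.0.49:2379 <nil>}" form quoted in the report.
RECONNECT_RE = re.compile(r"Reconnecting to \{(\S+?:\d+)")


def etcd_client_endpoints(etcd_conf_text):
    """Return the host:port pairs listed in ETCD_LISTEN_CLIENT_URLS."""
    endpoints = set()
    for line in etcd_conf_text.splitlines():
        line = line.strip()
        if line.startswith("ETCD_LISTEN_CLIENT_URLS="):
            value = line.split("=", 1)[1].strip().strip('"')
            for url in value.split(","):
                parsed = urlparse(url.strip())
                if parsed.hostname and parsed.port:
                    endpoints.add(f"{parsed.hostname}:{parsed.port}")
    return endpoints


def dialed_endpoints(master_log_text):
    """Return the host:port pairs the master log says it is reconnecting to."""
    return set(RECONNECT_RE.findall(master_log_text))


def stale_endpoints(etcd_conf_text, master_log_text):
    """Endpoints the master dials that etcd is not listening on."""
    return dialed_endpoints(master_log_text) - etcd_client_endpoints(etcd_conf_text)


if __name__ == "__main__":
    # Sample inputs copied from this report.
    conf = (
        "ETCD_LISTEN_PEER_URLS=https://10.240.0.56:2380\n"
        "ETCD_LISTEN_CLIENT_URLS=https://10.240.0.56:2379\n"
    )
    log = 'getsockopt: connection refused"; Reconnecting to {10.240.0.49:2379 <nil>}'
    print(stale_endpoints(conf, log))
```

A non-empty result means the master is still pointed at an address where etcd no longer listens, which matches the symptom in this report.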
I am able to reproduce it. I know what is wrong and have a fix for it. I will open a PR shortly.
Upstream PR: https://github.com/openshift/openshift-ansible/pull/5843
Version:
openshift-ansible-3.7.0-0.179.0.git.0.a2641b6.el7.noarch

Steps:
1. Install v3.6 ocp with embedded etcd
2. Prepare repos on a new host (just install docker on it)
3. Edit hosts file to add etcd group
[OSEv3:children]
...
etcd
...
[etcd]
hostname...    //Specify a new host for etcd.
4. Do etcd migrate
# ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-etcd/embedded2external.yml

After migrating to external etcd, OCP works well now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188