Bug 1573307
| Summary: | FFU: ceph upgrade fails because Docker service is not running on the Ceph OSD nodes | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
| Component: | rhosp-director | Assignee: | RHOS Maint <rhos-maint> |
| Status: | CLOSED NOTABUG | QA Contact: | Amit Ugol <augol> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 13.0 (Queens) | CC: | dbecker, lbezdick, mburns, morazi |
| Target Milestone: | beta | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-05-02 15:24:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Upgrade step - openstack overcloud upgrade run - has to run on all nodes including Ceph.
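
As a rough sketch of what this means for the per-role upgrade steps listed in the description: the same command used for Controller and Compute would also be run for the role that holds the Ceph OSD nodes. The role name CephStorage below is an assumption (it is the TripleO default and is not stated in this report), and the helper only prints the commands rather than executing them.

```shell
# Roles to upgrade before the Ceph upgrade step.
# "CephStorage" is an assumed role name; substitute the role names
# used by your deployment.
ROLES="Controller Compute CephStorage"

# Print (do not execute) the per-role upgrade command for each role.
# The flags mirror the ones used for Controller and Compute in this report.
print_upgrade_commands() {
    for role in $ROLES; do
        echo "openstack overcloud upgrade run --roles $role --skip-tags validation"
    done
}

print_upgrade_commands
```

Printing instead of executing keeps the sketch safe to run anywhere; the output can be reviewed and then run by hand on the undercloud.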
Description of problem:
FFU: ceph upgrade fails because Docker service is not running on the Ceph OSD nodes, snippet from /var/log/mistral/ceph-install-workflow.log:

[...]
2018-04-30 15:53:22,353 p=11902 u=mistral | task path: /usr/share/ceph-ansible/roles/ceph-docker-common/tasks/fetch_image.yml:179
2018-04-30 15:53:22,353 p=11902 u=mistral | Monday 30 April 2018 15:53:22 -0400 (0:00:00.036) 0:06:32.099 **********
2018-04-30 15:53:22,788 p=11902 u=mistral | FAILED - RETRYING: pulling registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest image (3 retries left).
2018-04-30 15:53:33,033 p=11902 u=mistral | FAILED - RETRYING: pulling registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest image (2 retries left).
2018-04-30 15:53:43,266 p=11902 u=mistral | FAILED - RETRYING: pulling registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest image (1 retries left).
2018-04-30 15:53:53,508 p=11902 u=mistral | fatal: [192.168.24.10]: FAILED! => {"attempts": 3, "changed": false, "cmd": ["timeout", "300s", "docker", "pull", "registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest"], "delta": "0:00:00.025122", "end": "2018-04-30 19:53:52.221277", "msg": "non-zero return code", "rc": 1, "start": "2018-04-30 19:53:52.196155", "stderr": "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?", "stderr_lines": ["Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"], "stdout": "", "stdout_lines": []}
2018-04-30 15:53:53,510 p=11902 u=mistral | PLAY RECAP *********************************************************************
2018-04-30 15:53:53,511 p=11902 u=mistral | 192.168.24.10 : ok=42 changed=4 unreachable=0 failed=1
2018-04-30 15:53:53,511 p=11902 u=mistral | 192.168.24.12 : ok=121 changed=26 unreachable=0 failed=0
2018-04-30 15:53:53,511 p=11902 u=mistral | 192.168.24.13 : ok=111 changed=21 unreachable=0 failed=0
2018-04-30 15:53:53,511 p=11902 u=mistral | 192.168.24.18 : ok=2 changed=0 unreachable=0 failed=0
2018-04-30 15:53:53,511 p=11902 u=mistral | 192.168.24.19 : ok=110 changed=22 unreachable=0 failed=0
2018-04-30 15:53:53,511 p=11902 u=mistral | 192.168.24.23 : ok=2 changed=0 unreachable=0 failed=0
2018-04-30 15:53:53,511 p=11902 u=mistral | localhost : ok=0 changed=0 unreachable=0 failed=0
2018-04-30 15:53:53,512 p=11902 u=mistral | Monday 30 April 2018 15:53:53 -0400 (0:00:31.158) 0:07:03.257 **********
2018-04-30 15:53:53,512 p=11902 u=mistral | ===============================================================================

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-8.0.2-4.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. openstack overcloud ffwd-upgrade prepare
2. openstack overcloud ffwd-upgrade run
3. openstack overcloud upgrade run --roles Controller --skip-tags validation
4. openstack overcloud upgrade run --roles Compute --skip-tags validation
5. openstack overcloud ffwd-upgrade converge
6. openstack overcloud ceph-upgrade run \
    --timeout 100 \
    --templates /usr/share/openstack-tripleo-heat-templates \
    --stack overcloud \
    -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
    -e /home/stack/virt/internal.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /home/stack/virt/network/network-environment.yaml \
    -e /home/stack/virt/hostnames.yml \
    -e /home/stack/virt/debug.yaml \
    -e /home/stack/ffu_repos.yaml \
    -e /home/stack/cli_opts_params.yaml \
    -e /home/stack/ceph-ansible-env.yaml \
    --ceph-ansible-playbook '/usr/share/ceph-ansible/infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml,/usr/share/ceph-ansible/infrastructure-playbooks/rolling_update.yml'

Actual results:
The switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook fails because the Docker service is not running on the Ceph OSD nodes.

Expected results:
The Ceph upgrade playbook finishes without errors.

Additional info:
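
For triage, a small helper along these lines can pull the failing hosts out of the PLAY RECAP section of the log quoted above. It is only a sketch: the function name failed_hosts is made up for illustration, and it assumes recap lines keep the "... | host : ok=N changed=N unreachable=N failed=N" shape shown in this report.

```shell
# Print every host whose PLAY RECAP line reports failed=N with N > 0.
# Usage (path taken from this report):
#   failed_hosts /var/log/mistral/ceph-install-workflow.log
failed_hosts() {
    awk -F'|' '
        / failed=[0-9]+/ {
            split($2, f, " ")          # f[1] is the host name
            n = $2
            sub(/.*failed=/, "", n)    # keep only the count after failed=
            if (n + 0 > 0) print f[1]
        }
    ' "$1"
}
```

Any host it reports is a candidate for checking whether the Docker service is running there, for example with systemctl status docker.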