Bug 1739209 - [ceph-ansible] - rolling-update of containerized cluster from 2.x to 3.x failed trying to run systemd-device-to-id.sh saying no such file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 3.3
Assignee: Guillaume Abrioux
QA Contact: Vasishta
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-08 18:34 UTC by Vasishta
Modified: 2019-08-21 15:11 UTC (History)

Fixed In Version: RHEL: ceph-ansible-3.2.24-1.el7cp Ubuntu: ceph-ansible_3.2.24-2redhat1
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-21 15:11:10 UTC
Embargoed:


Attachments
File contains playbook log (1.87 MB, text/plain), 2019-08-08 18:34 UTC, Vasishta
File contains playbook log (3.77 MB, text/plain), 2019-08-08 19:57 UTC, Vasishta


Links
Github ceph/ceph-ansible pull 4327 (closed): osd: copy systemd-device-to-id.sh on all osd nodes before running it (last updated 2020-11-02 21:10:27 UTC)
Github ceph/ceph-ansible pull 4332 (closed): update: use ids to restart osds instead of device name (last updated 2020-11-02 21:10:26 UTC)
Red Hat Product Errata RHSA-2019:2538 (last updated 2019-08-21 15:11:24 UTC)

Description Vasishta 2019-08-08 18:34:41 UTC
Created attachment 1601912 [details]
File contains playbook log

Description of problem:
Rolling update from ceph-ansible 2.x to 3.x failed in the task "ceph-osd : run the systemd-device-to-id.sh script" with the error "No such file or directory".

It seems that the task which copies the script was not delegated to the other nodes, so the script is missing on those nodes when the run task executes there.
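
To make the suspected failure mode concrete, here is a minimal Python sketch of the invariant that appears to be violated: the script must exist on every OSD node before the run task reaches that node. This is only a model of the hypothesis above, not ceph-ansible code. The host names, the `files_by_host` mapping, and the function name are all hypothetical.

```python
# Model of the suspected ordering problem; nothing here talks to real hosts.
SCRIPT = "/tmp/systemd-device-to-id.sh"


def hosts_missing_script(osd_hosts, files_by_host, script=SCRIPT):
    """Return the OSD hosts on which `script` is absent, in inventory order.

    `files_by_host` maps a host name to the set of paths present on it
    (a stand-in for the real filesystem state of each node).
    """
    return [h for h in osd_hosts if script not in files_by_host.get(h, set())]


# Hypothetical inventory in which the copy only reached the first node.
osd_hosts = ["osd-node-1", "osd-node-2", "osd-node-3"]
files_by_host = {"osd-node-1": {SCRIPT}}

# Any host listed here would fail with "No such file or directory".
print(hosts_missing_script(osd_hosts, files_by_host))
```

If the copy step runs on all OSD nodes first, the returned list is empty and the run step can proceed on every node.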

Version-Release number of selected component (if applicable):
ceph-ansible-3.2.22-1.el7cp.noarch

How reproducible:
Always (1/1)

Steps to Reproduce:
1. Start with a containerized RHCS 2.x cluster (with OSDs that have the device name in their service name)
2. Try to upgrade it to 3.3

Actual results:
"bash: /tmp/systemd-device-to-id.sh: No such file or directory"

Expected results:
rolling-update must complete successfully

Additional info:

Comment 1 Vasishta 2019-08-08 19:57:59 UTC
Created attachment 1601932 [details]
File contains playbook log

I think the following lines from start_osds.yml need to be removed:

https://github.com/ceph/ceph-ansible/blob/stable-3.2/roles/ceph-osd/tasks/start_osds.yml#L131-L132

With those lines removed, it seemed to work for me: the cluster got updated and all the new OSD services are up.

However, the old services (those with the device name in the service name) were still present and flapping on the nodes where the script had not run during my first attempt (the logs of run 1 are in the previous attachment).
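
For anyone checking a node for such leftover services, a small Python sketch like the one below can separate device-named OSD units from ID-named ones. It assumes unit names of the form `ceph-osd@<instance>.service`, where a purely numeric instance is an OSD ID and anything else (for example `sdb`) is a device name. That naming pattern and the function names are assumptions for illustration, so adjust them to what `systemctl` actually reports on your nodes.

```python
import re

# Assumed unit naming: ceph-osd@<instance>.service
_UNIT_RE = re.compile(r"^ceph-osd@(?P<instance>[^.]+)\.service$")


def classify_osd_unit(unit_name):
    """Return 'id', 'device', or None if the name is not a ceph-osd unit."""
    match = _UNIT_RE.match(unit_name)
    if match is None:
        return None
    instance = match.group("instance")
    # A purely numeric instance is treated as an OSD ID.
    return "id" if re.fullmatch(r"[0-9]+", instance) else "device"


def leftover_device_units(unit_names):
    """Keep only the units that still use a device name as their instance."""
    return [u for u in unit_names if classify_osd_unit(u) == "device"]


# Hypothetical unit list, e.g. collected by hand from a node.
units = ["ceph-osd@0.service", "ceph-osd@sdb.service", "sshd.service"]
print(leftover_device_units(units))
```

The input list is passed in explicitly so the check can be run against unit names gathered by any means.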

Regards,
Vasishta Shastry
QE, Ceph

Comment 12 Vasishta 2019-08-13 17:43:13 UTC
Working fine with ceph-ansible-3.2.24-1.el7cp.noarch
Moving to VERIFIED state

Comment 14 errata-xmlrpc 2019-08-21 15:11:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538

