Bug 2073480 - [ceph-ansible] cephadm-adopt playbook fails on TASK [install cephadm] on dedicated OSD nodes during upgrade
Summary: [ceph-ansible] cephadm-adopt playbook fails on TASK [install cephadm] on dedicated OSD nodes during upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 5.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 5.1z2
Assignee: Guillaume Abrioux
QA Contact: Sayalee
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 2099589
Reported: 2022-04-08 15:07 UTC by Gaurav Sitlani
Modified: 2022-07-05 08:50 UTC
CC: 12 users

Fixed In Version: ceph-ansible-6.0.25.7-1.el8cp
Doc Type: Bug Fix
Doc Text:
.Adoption playbook can now install `cephadm` on OSD nodes
Previously, due to the tools repository being disabled on OSD nodes, you could not install `cephadm` on OSD nodes, resulting in the failure of the adoption playbook. With this fix, the tools repository is enabled on OSD nodes and the adoption playbook can now install `cephadm` on OSD nodes.
Clone Of:
Environment:
Last Closed: 2022-06-30 20:54:48 UTC
Embargoed:




Links
System ID | Status | Summary | Last Updated
GitHub ceph/ceph-ansible pull 7173 | Merged | [skip ci] common: config rhcs tools repo on all nodes | 2022-07-05 08:13:21 UTC
Red Hat Issue Tracker RHCEPH-3955 | None | None | 2022-04-08 15:09:43 UTC
Red Hat Product Errata RHBA-2022:5450 | None | None | 2022-06-30 20:55:15 UTC

Description Gaurav Sitlani 2022-04-08 15:07:56 UTC
Description of problem:

The following failure is observed on dedicated OSD nodes while upgrading RHCS to the 5.0z4 release:

2022-04-05 12:09:54,181 p=3484089 u=root n=ansible | TASK [install cephadm] *****************************************************************************************************************************************************************
2022-04-05 12:09:54,181 p=3484089 u=root n=ansible | Tuesday 05 April 2022  12:09:54 +0000 (0:00:13.865)       0:01:11.768 ********* 
2022-04-05 12:09:59,761 p=3484089 u=root n=ansible | changed: [m1.test.com]
2022-04-05 12:10:00,239 p=3484089 u=root n=ansible | changed: [m2.test.com]
2022-04-05 12:10:00,262 p=3484089 u=root n=ansible | changed: [m3.test.com]
2022-04-05 12:10:00,469 p=3484089 u=root n=ansible | changed: [r1.test.com]
2022-04-05 12:10:00,503 p=3484089 u=root n=ansible | changed: [r2.test.com]
2022-04-05 12:10:00,514 p=3484089 u=root n=ansible | changed: [r3.test.com]
2022-04-05 12:10:00,665 p=3484089 u=root n=ansible | changed: [r4.test.com]
2022-04-05 12:10:00,750 p=3484089 u=root n=ansible | changed: [r5.test.com]
2022-04-05 12:10:38,459 p=3484089 u=root n=ansible | fatal: [o1.test.com]: FAILED! => changed=false 
  attempts: 3
  failures:
  - No package cephadm available.
  msg: Failed to install some of the specified packages
  rc: 1
  results: []
2022-04-05 12:10:38,510 p=3484089 u=root n=ansible | fatal: [o2.test.com]: FAILED! => changed=false 
  attempts: 3
  failures:
  - No package cephadm available.
  msg: Failed to install some of the specified packages
  rc: 1
  results: []
2022-04-05 12:10:39,601 p=3484089 u=root n=ansible | fatal: [o3.test.com]: FAILED! => changed=false 
  attempts: 3
  failures:
  - No package cephadm available.
  msg: Failed to install some of the specified packages
  rc: 1
  results: []
2022-04-05 12:10:39,602 p=3484089 u=root n=ansible | NO MORE HOSTS LEFT **************************************************************************************************************************************************************************************************************************
2022-04-05 12:10:39,602 p=3484089 u=root n=ansible | PLAY RECAP **********************************************************************************************************************************************************************************************************************************
2022-04-05 12:10:39,602 p=3484089 u=root n=ansible | m1.test.com : ok=14   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | m2.test.com : ok=12   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | m3.test.com : ok=12   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | r1.test.com : ok=12   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | r2.test.com : ok=12   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | r3.test.com : ok=12   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | r4.test.com : ok=12   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | r5.test.com : ok=12   changed=3    unreachable=0    failed=0    skipped=19   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | o1.test.com : ok=10   changed=1    unreachable=0    failed=1    skipped=20   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | o2.test.com : ok=10   changed=1    unreachable=0    failed=1    skipped=20   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | o3.test.com : ok=10   changed=1    unreachable=0    failed=1    skipped=20   rescued=0    ignored=0   
2022-04-05 12:10:39,603 p=3484089 u=root n=ansible | localhost                  : ok=1    changed=1    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0   


Version-Release number of selected component (if applicable):
ceph-ansible-6.0.20.2-1

How reproducible:
Observed in a test environment during a multi-site cluster upgrade.

Steps to Reproduce:

1. Have dedicated OSD hosts in the cluster, for example:

# cat hosts 
[mons]
m1.test.com
m2.test.com
m3.test.com

[mgrs]
m1.test.com
m2.test.com
m3.test.com

[osds]
o1.test.com
o2.test.com
o3.test.com

[rgws]
r1.test.com
r2.test.com
r3.test.com
r4.test.com

[grafana-server]
r5.test.com

2. After a successful rolling_update.yml run, run the adoption playbook: ansible-playbook infrastructure-playbooks/cephadm-adopt.yml -i hosts

3. The playbook fails as shown above because the tools repository is not enabled on the OSD nodes, so the cephadm package cannot be installed (see the check below).
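
To confirm the repository state on an affected OSD node, list the enabled repositories (a hypothetical diagnostic for illustration; it is not quoted from the original report):

# subscription-manager repos --list-enabled | grep rhceph-5-tools

Empty output means rhceph-5-tools-for-rhel-8-x86_64-rpms is disabled, which is the state that makes dnf report "No package cephadm available."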

Actual results:
The playbook fails only on the OSD nodes, because it does not enable the tools repository there.

Expected results:
The adoption playbook should enable the tools repository on the OSD nodes as well; a sketch of such a task follows.
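
For illustration only, a minimal sketch of the kind of task that addresses this, assuming the rhsm_repository Ansible module and RHSM-registered hosts (the actual change was merged upstream in ceph-ansible pull request 7173 and may differ in detail):

    - name: enable the Red Hat Ceph Storage tools repository on all nodes
      rhsm_repository:
        name: rhceph-5-tools-for-rhel-8-x86_64-rpms
        state: enabled

Running such a task against all hosts in the inventory, rather than only a subset of groups, ensures that dedicated OSD nodes can also install cephadm.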

Additional info:
After the tools repo (rhceph-5-tools-for-rhel-8-x86_64-rpms) is enabled on the OSD nodes, the playbook succeeds.
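
As a manual workaround on builds without the fix, the repository can be enabled on each OSD node before re-running the playbook (standard subscription-manager and dnf usage; these exact commands are not quoted from the report):

# subscription-manager repos --enable=rhceph-5-tools-for-rhel-8-x86_64-rpms
# dnf install cephadm

After that, re-running cephadm-adopt.yml completes on the OSD nodes.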

Comment 17 errata-xmlrpc 2022-06-30 20:54:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.1 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5450

