Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1545443

Summary: Permission denied when executing rolling_update playbook after following doc
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Coady LaCroix <clacroix>
Component: Documentation
Assignee: Aron Gunn <agunn>
Status: CLOSED CURRENTRELEASE
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.0
CC: adeza, agunn, aschoen, ceph-eng-bugs, clacroix, gmeno, hnallurv, kdreyer, nthomas, sankarshan, vakulkar
Target Milestone: z1
Target Release: 3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-08-26 06:55:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description               Flags
ansible-playbook output   none
cephci-output             none

Description Coady LaCroix 2018-02-14 22:48:29 UTC
Description of problem:
A permission error occurs when running the rolling_update playbook as described in the documentation. The work is done in /usr/share/ceph-ansible, and the playbook is executed as 'cephuser' with the following command:

ansible-playbook -e ireallymeanit=yes -vv -i hosts rolling_update.yml


TASK [ceph-defaults : create a local fetch directory if it does not exist] ****************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-defaults/tasks/facts.yml:53
fatal: [ceph-clacroix-run927-node1-mon -> localhost]: FAILED! => {"changed": false, "msg": "There was an issue creating fetch as requested: [Errno 13] Permission denied: 'fetch'", "path": "fetch/", "state": "absent"}


Version-Release number of selected component (if applicable):
Name        : ceph-ansible
Arch        : noarch
Version     : 3.0.25
Release     : 1.el7cp
Size        : 872 k
Repo        : installed
From repo   : ceph-Tools
Summary     : Ansible playbooks for Ceph
URL         : https://github.com/ceph/ceph-ansible
License     : ASL 2.0 and GPLv3+
Description : Ansible playbooks for Ceph


How reproducible:


Steps to Reproduce:
1. Install ceph 2.x
2. Follow documentation on upgrading ceph stack from 2.x to 3.0 (doc link below)
3. From /usr/share/ceph-ansible, run the rolling_update playbook as the ansible user (in our case this was 'cephuser')

Actual results:
Permission denied on creating local fetch directory

Expected results:
Playbook to execute successfully

Additional info:
RHEL Upgrade guide: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/installation_guide_for_red_hat_enterprise_linux/upgrading-a-red-hat-ceph-storage-cluster

Comment 3 Coady LaCroix 2018-02-14 23:00:29 UTC
Created attachment 1396175
ansible-playbook output

Attached full output from ansible-playbook

Comment 4 Andrew Schoen 2018-02-15 14:32:22 UTC
Coady,

It looks like your value for 'fetch_directory' is still the default value of 'fetch/'. This needs to be changed to a writeable location: with the default, the playbook tries to create a /usr/share/ceph-ansible/fetch directory, which will fail.

Could you try setting `fetch_directory` to a writeable location and trying again?

Thanks,
Andrew
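
The failure above comes down to whether the user running the playbook can create the fetch directory. As a minimal, hypothetical pre-flight sketch (not part of ceph-ansible; the helper name `can_create_dir` is an assumption made for illustration), the following Python checks whether a candidate `fetch_directory` value is already a writable directory or could be created under its nearest existing ancestor:

```python
import os


def can_create_dir(path):
    """Return True if `path` is a writable directory, or could be created
    by the current user under its nearest existing ancestor."""
    path = os.path.abspath(os.path.expanduser(path))
    if os.path.isdir(path):
        # Already present: the user needs write and search permission.
        return os.access(path, os.W_OK | os.X_OK)
    if os.path.exists(path):
        # Present but not a directory (for example, a regular file).
        return False
    # Walk up until an existing ancestor is found; "/" always exists.
    parent = os.path.dirname(path)
    while not os.path.exists(parent):
        parent = os.path.dirname(parent)
    return os.path.isdir(parent) and os.access(parent, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Relative values are checked against the current working directory.
    for candidate in ("fetch/", "~/fetch"):
        print(candidate, can_create_dir(candidate))
```

Run as the ansible user from /usr/share/ceph-ansible, the check for 'fetch/' would be expected to fail if that directory is owned by root and not writable by other users, while a path under the user's home directory would normally pass.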

Comment 5 Coady LaCroix 2018-02-15 18:43:48 UTC
Created attachment 1396720
cephci-output

Comment 6 Coady LaCroix 2018-02-15 18:49:35 UTC
Hi Andrew,

The missing fetch_directory config makes sense, and it looks like adding it gets rolling_update past that issue. However, no part of the install or upgrade doc makes it clear that this setting is necessary. The docs have you do all of your work in /usr/share/ceph-ansible, which is owned by root, yet the playbook is intended to be executed by the ansible user. This means that by default the update will fail unless this value is set, so this may be something we should add to the documentation.

I have also attached the output from my cephci run, in which I hit another issue when running the update playbook. I initially hit this while running the playbook manually on the installer node, and was able to reproduce it through cephci automation.


TASK [ceph-osd : activate osd(s) when device is a disk] ************************
task path: /usr/share/ceph-ansible/roles/ceph-osd/tasks/activate_osds.yml:5

failed: [ceph-clacroix-run174-node5-osd] (item=/dev/vdb) => {"changed": false, "cmd": ["ceph-disk", "activate", "/dev/vdb1"], "delta": "0:00:00.130945", "end": "2018-02-15 12:23:48.815212", "item": "/dev/vdb", "msg": "non-zero return code", "rc": 1, "start": "2018-02-15 12:23:48.684267", "stderr": "mount_activate: Failed to activate\nceph-disk: Error: No cluster conf found in /etc/ceph with fsid 6f5270ff-e3e6-4f26-a1ac-1220a749a0e5", "stderr_lines": ["mount_activate: Failed to activate", "ceph-disk: Error: No cluster conf found in /etc/ceph with fsid 6f5270ff-e3e6-4f26-a1ac-1220a749a0e5"], "stdout": "", "stdout_lines": []}

Comment 7 Andrew Schoen 2018-02-15 19:01:03 UTC
(In reply to Coady LaCroix from comment #6)
> Hi Andrew,
> 
> The missing fetch_directory config makes sense, and it looks like adding
> it gets rolling_update past that issue. However, no part of the install or
> upgrade doc makes it clear that this setting is necessary. The docs have
> you do all of your work in /usr/share/ceph-ansible, which is owned by root,
> yet the playbook is intended to be executed by the ansible user. This means
> that by default the update will fail unless this value is set, so this may
> be something we should add to the documentation.

I agree that there should be a documentation change for this.

> 
> I have also attached the output from my cephci run, in which I hit
> another issue when running the update playbook. I initially hit this while
> running the playbook manually on the installer node, and was able to
> reproduce it through cephci automation.
> 
> 
> TASK [ceph-osd : activate osd(s) when device is a disk]
> ************************
> task path: /usr/share/ceph-ansible/roles/ceph-osd/tasks/activate_osds.yml:5
> 
> failed: [ceph-clacroix-run174-node5-osd] (item=/dev/vdb) => {"changed":
> false, "cmd": ["ceph-disk", "activate", "/dev/vdb1"], "delta":
> "0:00:00.130945", "end": "2018-02-15 12:23:48.815212", "item": "/dev/vdb",
> "msg": "non-zero return code", "rc": 1, "start": "2018-02-15
> 12:23:48.684267", "stderr": "mount_activate: Failed to activate\nceph-disk:
> Error: No cluster conf found in /etc/ceph with fsid
> 6f5270ff-e3e6-4f26-a1ac-1220a749a0e5", "stderr_lines": ["mount_activate:
> Failed to activate", "ceph-disk: Error: No cluster conf found in /etc/ceph
> with fsid 6f5270ff-e3e6-4f26-a1ac-1220a749a0e5"], "stdout": "",
> "stdout_lines": []}

Coady,

I took a look at your log from cephci, and you've set fetch_directory to '/home/cephuser/fetch' for the rolling_update.yml run but left it as the default value for the initial deployment. The value of fetch_directory must be the same for both runs; otherwise ceph-ansible will not know it is working against an existing cluster and will try to generate another fsid. I think this is why you are seeing the error "ceph-disk: Error: No cluster conf found in /etc/ceph with fsid 6f5270ff-e3e6-4f26-a1ac-1220a749a0e5".

Could you ensure that the value of fetch_directory is the same here for all playbook runs and try again?

Thanks,
Andrew
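
To illustrate the mismatch described above, here is a small sketch that resolves two `fetch_directory` values and compares them. It assumes that a relative value such as the default 'fetch/' is resolved against the directory the playbook is run from, which matches the /usr/share/ceph-ansible/fetch path mentioned in comment 4. The function names are hypothetical, and this is not how ceph-ansible itself performs any check.

```python
import os


def resolve_fetch_directory(value, run_dir):
    """Resolve a fetch_directory value to a normalized absolute path.

    Assumption: relative values are interpreted relative to run_dir,
    the directory ansible-playbook is started from.
    """
    expanded = os.path.expanduser(value)
    # os.path.join keeps `expanded` unchanged when it is already absolute.
    return os.path.normpath(os.path.join(run_dir, expanded))


def same_fetch_directory(install_value, upgrade_value, run_dir):
    """Return True if both runs would use the same local fetch directory."""
    return (resolve_fetch_directory(install_value, run_dir)
            == resolve_fetch_directory(upgrade_value, run_dir))


if __name__ == "__main__":
    run_dir = "/usr/share/ceph-ansible"
    # Values from this report: default for the install, custom for the upgrade.
    print(resolve_fetch_directory("fetch/", run_dir))
    print(same_fetch_directory("fetch/", "/home/cephuser/fetch", run_dir))
```

With the two values from this report, the resolved paths differ, which is the situation in which the upgrade run cannot find the files saved by the initial deployment.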

Comment 8 Coady LaCroix 2018-02-16 19:39:03 UTC
Andrew,

It seems that was the issue, and now with that value set for the install as well as the upgrade I'm able to execute the playbooks fully. 

Thanks!

Comment 9 Vasu Kulkarni 2018-02-19 22:53:09 UTC
Moving this for Doc updates.