Bug 1824266 - [OSP13->16.0] ceph-ansible preparation tasks with no permission to read files in /var/lib/mistral
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Giulio Fidente
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-15 16:39 UTC by Jose Luis Franco
Modified: 2023-02-22 23:02 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-0.20200530033438.0dfce4e.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-29 07:51:34 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 717320 0 None MERGED Make /var/lib/mistral traversable by all users 2020-10-23 17:11:17 UTC
OpenStack gerrit 725884 0 None MERGED Make /var/lib/mistral traversable by all users 2020-10-23 17:11:04 UTC
Red Hat Issue Tracker OSP-20518 0 None None None 2022-11-29 09:16:21 UTC
Red Hat Product Errata RHBA-2020:3148 0 None None None 2020-07-29 07:52:08 UTC

Description Jose Luis Franco 2020-04-15 16:39:32 UTC
Description of problem:

When upgrading the controller nodes from OSP13 to OSP16.0, we first need to run an external-upgrade step that switches all Ceph systemd units from running under Docker to running under Podman. The command used to achieve this is the following:

openstack overcloud external-upgrade run \
        --stack qe-Cloud-0 \
        --tags ceph_systemd \
        -e ceph_ansible_limit=controller-0 2>&1

However, because we come from an OSP13 installation, /var/lib/mistral has mode 0750 (drwxr-x---), so it has no execute (traverse) permission for other users:

(qe-Cloud-0) [stack@undercloud-0 ~]$ sudo ls -larth /var/lib/mistral/
total 44K
drwx------.  2 42430 42430   31 Apr 13 16:51 .ssh
-r--r--r--.  1 42430 42430 1001 Apr 13 16:51 undercloud.conf
drwxr-xr-x.  3 42430 42430   78 Apr 13 17:59 .novaclient
drwxr-xr-x. 77 root  root  4.0K Apr 13 18:00 ..
drwxr-xr-x. 12 42430 42430 4.0K Apr 14 10:48 4405e3f5-0ae1-40b3-95c8-a496cde410fb
drwxr-xr-x.  2 42430 42430 4.0K Apr 14 11:01 ansible_fact_cache
drwxr-xr-x. 13 42430 42430 4.0K Apr 14 11:02 6321adf8-5417-4113-bdd4-03206a1d987e
drwxr-xr-x. 12 42430 42430 4.0K Apr 14 12:43 239d393a-fc93-44f5-8000-a5d701eced8f
drwxr-xr-x. 12 42430 42430 4.0K Apr 14 14:03 4003182b-47b6-45ad-8c83-83e675071fb7
drwxr-xr-x. 13 42430 42430 4.0K Apr 14 14:10 33704f84-99ee-442c-b500-e82bca250c42
drwxr-xr-x. 12 42430 42430 4.0K Apr 15 10:29 eae3da95-e0a1-427c-be87-70887b264c78
lrwxrwxrwx.  1 42430 42430   53 Apr 15 10:34 config-download-latest -> /var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144
drwxr-x---. 12 42430 42430 4.0K Apr 15 10:34 .
drwxr-xr-x. 13 42430 42430 4.0K Apr 15 10:35 1f0536f5-fd4b-45d8-b970-6fa69ba22144
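
To make the mode difference concrete, here is a minimal Python sketch that decodes the two directory modes discussed in this report (0750 on the upgraded OSP13 undercloud, 0755 on a fresh OSP16 one). The helper name describe_dir_mode is hypothetical and is not part of TripleO; it only uses the standard stat module.

```python
import stat


def describe_dir_mode(mode: int) -> dict:
    """Render a directory mode as ls would and report the 'others' execute bit."""
    full_mode = stat.S_IFDIR | mode  # add the directory type bit for filemode()
    return {
        "listing": stat.filemode(full_mode),
        "others_can_traverse": bool(mode & stat.S_IXOTH),
    }


# 0750 is what the listing above shows for "."; 0755 is the OSP16 default.
print(describe_dir_mode(0o750))
print(describe_dir_mode(0o755))
```

Without the execute bit for others, a user who is neither the owner nor in the owning group cannot resolve any path below the directory, regardless of the modes of the files inside it.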

This causes the external-upgrade (which connects to the Undercloud as the tripleo-admin user) to fail when accessing any file inside /var/lib/mistral:

2020-04-15 10:35:58 | TASK [tripleo-ceph-common : set calling_ansible_environment_variables] *********
2020-04-15 10:35:58 | Wednesday 15 April 2020  10:35:47 -0400 (0:00:00.863)       0:01:04.770 *******
2020-04-15 10:35:58 | skipping: [undercloud] => {"changed": false, "skip_reason": "Conditional result was False"}
2020-04-15 10:35:58 |
2020-04-15 10:35:58 | TASK [create ceph-ansible working direcotry] ***********************************
2020-04-15 10:35:58 | Wednesday 15 April 2020  10:35:48 -0400 (0:00:00.946)       0:01:05.717 *******
2020-04-15 10:35:58 |
2020-04-15 10:35:58 | TASK [tripleo-ceph-work-dir : create ceph-ansible temp dirs] *******************
2020-04-15 10:35:58 | Wednesday 15 April 2020  10:35:51 -0400 (0:00:03.146)       0:01:08.864 *******
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/group_vars) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/group_vars", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/group_vars", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/host_vars) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/host_vars", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/host_vars", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/fetch_dir) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/fetch_dir", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/fetch_dir", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 |
2020-04-15 10:35:58 | TASK [tripleo-ceph-work-dir : symbolic link to tripleo inventory from ceph-ansible work directory] ***
2020-04-15 10:35:58 | Wednesday 15 April 2020  10:35:55 -0400 (0:00:04.092)       0:01:12.957 *******
2020-04-15 10:35:58 | fatal: [undercloud]: FAILED! => {"changed": false, "msg": "Error while linking: [Errno 13] Permission denied: b'/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/inventory.yaml' -> b'/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/inventory.yml'", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/inventory.yml"}


Creating the ceph-ansible directory succeeds, though, because that task uses become: true:


[root@undercloud-0 stack]# ls -larth /var/lib/mistral/config-download-latest/
total 1.1M
drwxr-xr-x.  7         42430 42430  128 Apr 15 10:34 .git
-rw-r--r--.  1         42430 42430    9 Apr 15 10:34 .gitignore                                                                                                             
drwxr-xr-x.  2         42430 42430 4.0K Apr 15 10:34 BlockStorage                                                                                                           
drwxr-xr-x.  5         42430 42430 4.0K Apr 15 10:34 CephStorage
drwxr-xr-x.  4         42430 42430 4.0K Apr 15 10:34 Compute
drwxr-xr-x.  5         42430 42430 4.0K Apr 15 10:34 Controller
drwxr-xr-x.  2         42430 42430 4.0K Apr 15 10:34 ObjectStorage
-rw-r--r--.  1         42430 42430 3.8K Apr 15 10:34 all_nodes_validation_script.sh
-rw-r--r--.  1         42430 42430 1.9K Apr 15 10:34 common_deploy_steps_playbooks.yaml                                                                                     
-rw-r--r--.  1         42430 42430 8.7K Apr 15 10:34 common_deploy_steps_tasks.yaml
-rw-r--r--.  1         42430 42430  14K Apr 15 10:34 common_deploy_steps_tasks_step_1.yaml                                                                                  
-rw-r--r--.  1         42430 42430 6.8K Apr 15 10:34 container_puppet_script.yaml
-rw-r--r--.  1         42430 42430  573 Apr 15 10:34 container_startup_configs_tasks.yaml                                                                                   
-rw-r--r--.  1         42430 42430  805 Apr 15 10:34 deploy-artifacts.sh
-rw-r--r--.  1         42430 42430  37K Apr 15 10:34 deploy_steps_playbook.yaml
-rw-r--r--.  1         42430 42430 2.3K Apr 15 10:34 deploy_steps_tasks_step_0.yaml
-rw-r--r--.  1         42430 42430 8.5K Apr 15 10:34 deployments.yaml
-rw-r--r--.  1         42430 42430  21K Apr 15 10:34 docker_puppet_script.yaml
-rw-r--r--.  1         42430 42430  40K Apr 15 10:34 external_deploy_steps_tasks.yaml
-rw-r--r--.  1         42430 42430 1.1K Apr 15 10:34 external_post_deploy_steps_tasks.yaml                                                                                  
-rw-r--r--.  1         42430 42430 5.5K Apr 15 10:34 external_update_steps_playbook.yaml
-rw-r--r--.  1         42430 42430  217 Apr 15 10:34 external_update_steps_tasks.yaml
-rw-r--r--.  1         42430 42430 7.6K Apr 15 10:34 external_upgrade_steps_playbook.yaml
-rw-r--r--.  1         42430 42430 8.9K Apr 15 10:34 external_upgrade_steps_tasks.yaml
-rw-r--r--.  1         42430 42430  796 Apr 15 10:34 fast_forward_upgrade_bootstrap_role_tasks.yaml
-rw-r--r--.  1         42430 42430  130 Apr 15 10:34 fast_forward_upgrade_bootstrap_tasks.yaml
-rw-r--r--.  1         42430 42430  515 Apr 15 10:34 fast_forward_upgrade_playbook.yaml
-rw-r--r--.  1         42430 42430  922 Apr 15 10:34 fast_forward_upgrade_post_role_tasks.yaml
-rw-r--r--.  1         42430 42430  621 Apr 15 10:34 fast_forward_upgrade_prep_role_tasks.yaml
-rw-r--r--.  1         42430 42430 3.6K Apr 15 10:34 fast_forward_upgrade_prep_tasks.yaml
-rw-r--r--.  1         42430 42430  113 Apr 15 10:34 fast_forward_upgrade_release_tasks.yaml
-rw-r--r--.  1         42430 42430 4.6K Apr 15 10:34 generate-config-tasks.yaml
-rw-r--r--.  1         42430 42430  13K Apr 15 10:34 global_vars.yaml
-rw-r--r--.  1         42430 42430  679 Apr 15 10:34 hiera_steps_tasks.yaml
-rw-r--r--.  1         42430 42430 3.7K Apr 15 10:34 host-container-puppet-tasks.yaml
drwxr-xr-x.  2         42430 42430  142 Apr 15 10:34 host_vars
-rw-r--r--.  1         42430 42430  575 Apr 15 10:34 post_update_steps_tasks.yaml
-rw-r--r--.  1         42430 42430  611 Apr 15 10:34 post_upgrade_steps_playbook.yaml
-rw-r--r--.  1         42430 42430  581 Apr 15 10:34 post_upgrade_steps_tasks.yaml
-rw-r--r--.  1         42430 42430 2.2K Apr 15 10:34 pre_upgrade_rolling_steps_playbook.yaml
-rw-r--r--.  1         42430 42430  616 Apr 15 10:34 pre_upgrade_rolling_steps_tasks.yaml
-rw-r--r--.  1         42430 42430 703K Apr 15 10:34 qe-Cloud-0-config.tar.gz
-rw-r--r--.  1         42430 42430 2.1K Apr 15 10:34 scale_playbook.yaml
-rw-r--r--.  1         42430 42430 2.2K Apr 15 10:34 scale_steps_tasks.yaml
drwxr-xr-x.  2         42430 42430   28 Apr 15 10:34 templates
-rw-r--r--.  1         42430 42430 6.8K Apr 15 10:34 update_steps_playbook.yaml
-rw-r--r--.  1         42430 42430  551 Apr 15 10:34 update_steps_tasks.yaml
-rw-r--r--.  1         42430 42430 7.3K Apr 15 10:34 upgrade_steps_playbook.yaml
drwxr-x---. 12         42430 42430 4.0K Apr 15 10:34 ..
-rw-r--r--.  1         42430 42430  13K Apr 15 10:34 inventory.yaml
-rw-------.  1         42430 42430 1.7K Apr 15 10:34 ssh_private_key
-rw-r--r--.  1         42430 42430 2.1K Apr 15 10:34 ansible.cfg
-rwxr-x---.  1         42430 42430  758 Apr 15 10:34 ansible-playbook-command.sh
drwxr-xr-x.  2         42430 42430   80 Apr 15 10:35 group_vars
drwxr-xr-x.  5 tripleo-admin root    58 Apr 15 10:35 ceph-ansible
drwxr-xr-x. 13         42430 42430 4.0K Apr 15 10:35 .
drwx------.  2         42430 42430    6 Apr 15 11:05 ansible-ssh


[root@undercloud-0 stack]# su - tripleo-admin
[tripleo-admin@undercloud-0 ~]$ ls -larth /var/lib/mistral/config-download-latest/ceph-ansible
ls: cannot access '/var/lib/mistral/config-download-latest/ceph-ansible': Permission denied

This issue doesn't happen in an OSP16.0/16.1 deployment, because /var/lib/mistral is deployed with mode 0755 there, so tripleo-admin can access anything within /var/lib/mistral.
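
As a rough illustration of the kind of change implied by the linked patch title ("Make /var/lib/mistral traversable by all users"), the Python sketch below adds the execute bit for others to a directory. This is an assumption about the shape of the fix, not the actual patch; the function name make_world_traversable is hypothetical, and the demo runs on a temporary directory instead of the real /var/lib/mistral.

```python
import os
import stat
import tempfile


def make_world_traversable(path: str) -> int:
    """Add the execute (search) bit for 'others'; return the resulting mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    wanted = current | stat.S_IXOTH
    if wanted != current:
        os.chmod(path, wanted)
    return wanted


# Demo on a throwaway directory that starts with the problematic 0750 mode.
with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o750)
    new_mode = make_world_traversable(demo_dir)
    print(oct(new_mode))  # 0o751
```

Traversal only needs the execute bit, so this sketch yields 0751. The 0755 mode used on OSP16 deployments additionally lets other users list the directory.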

Comment 3 Jose Luis Franco 2020-04-20 16:25:57 UTC
We are doing exactly that; the problem here is Mistral. We have the code in place to do what you said, but it only takes effect if we disable Mistral by running the command with the --no-workflow option: https://github.com/openstack/python-tripleoclient/blob/stable/train/tripleoclient/v1/overcloud_external_upgrade.py#L107-L110

The default behavior is to invoke the tripleo.package_update.v1.update_nodes Mistral action (https://github.com/openstack/python-tripleoclient/blob/stable/train/tripleoclient/utils.py#L1225 -> https://github.com/openstack/python-tripleoclient/blob/stable/train/tripleoclient/workflows/package_update.py#L117), which invokes the config-download from the mistral_executor container: https://github.com/openstack/tripleo-common/blob/stable/train/workbooks/package_update.yaml#L147-L153.

Also, in the current scenario, tripleo-admin wouldn't be able to config-download anything into /var/lib/mistral, as it does not belong to the mistral group. In addition, since most of the deployment code also works through Mistral, /var/lib/mistral itself appears to be owned by the mistral user from the container:

[root@undercloud-0 stack]# ls -larth /var/lib/mistral
total 60K
drwx------.  2 42430 42430   31 Apr 16 09:30 .ssh
-r--r--r--.  1 42430 42430 1007 Apr 16 09:30 undercloud.conf
drwxr-xr-x.  3 42430 42430   78 Apr 16 10:00 .novaclient
drwxr-xr-x. 75 root  root  4.0K Apr 16 10:01 ..
drwxr-xr-x.  2 42430 42430 4.0K Apr 16 12:12 ansible_fact_cache
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 12:19 8856c978-1b25-42ec-b90f-2322be8cc6d1
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 12:25 9b9c15cb-2f4e-45bb-9c7d-82810c61fbee
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 12:26 fd29e4ba-f73f-45bf-99f9-83e73c117da8
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 12:26 c3fb5c06-bb97-4ac3-be07-3ea842ddf4f5
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 12:54 b9cb84ff-f599-44e2-a885-5760122fec3b
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 13:30 649c3285-9264-4fa5-ad44-eb39c6d3a1e2
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 17:41 637093ab-8481-4a66-b802-54994a07e865
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 17:42 bba3c24c-1fd3-4656-8c17-70ce1918702e
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 17:43 8e15a3b5-c897-4bd6-bceb-2653ba5223a2
lrwxrwxrwx.  1 42430 42430   53 Apr 16 17:43 config-download-latest -> /var/lib/mistral/399e6c31-ad4f-498e-aa98-aacc7c6ffad8
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 17:43 399e6c31-ad4f-498e-aa98-aacc7c6ffad8
-rw-------.  1 42430 42430  198 Apr 20 10:10 .bash_history
drwxr-x---. 15 42430 42430 4.0K Apr 20 10:10 .

So tripleo-admin has no chance of creating anything inside it unless it is given sudo powers.
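
The ownership reasoning above can be sketched as a simplified model of the classic Unix permission check: exactly one of the owner, group, or other permission sets applies to a given user. This is an illustration only; it ignores ACLs, SELinux, and fine-grained capabilities. The function name can_traverse is hypothetical, uid/gid 42430 comes from the listing, uid 1003 for tripleo-admin comes from the Ansible output in the description, and the group set {1003} is an assumption.

```python
def can_traverse(mode: int, owner_uid: int, owner_gid: int,
                 uid: int, gids: set) -> bool:
    """Return True if the user may search (traverse) a directory with this mode."""
    if uid == 0:
        return True  # simplified: root bypasses the permission bits
    if uid == owner_uid:
        return bool(mode & 0o100)  # owner execute bit
    if owner_gid in gids:
        return bool(mode & 0o010)  # group execute bit
    return bool(mode & 0o001)      # others execute bit


MISTRAL_ID = 42430       # owner uid and gid shown in the listing
TRIPLEO_ADMIN_UID = 1003  # from the Ansible output; its groups are assumed

# 0750: tripleo-admin falls into "others" and is denied.
print(can_traverse(0o750, MISTRAL_ID, MISTRAL_ID, TRIPLEO_ADMIN_UID, {1003}))
# 0750: root (become: true) is still allowed.
print(can_traverse(0o750, MISTRAL_ID, MISTRAL_ID, 0, {0}))
# 0755: tripleo-admin is allowed through the "others" execute bit.
print(can_traverse(0o755, MISTRAL_ID, MISTRAL_ID, TRIPLEO_ADMIN_UID, {1003}))
```

Under this model, the task run with become: true succeeds while the same path is unreachable when accessed as tripleo-admin.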

Comment 12 Jose Luis Franco 2020-06-15 16:26:12 UTC
Hey Giulio,

Yes, it was expected to be verified in the hackfest. However, as the puddle we used was pretty old (the passed_phase2 one that included this patch was causing trouble), the workflow was executed with workarounds. In fact, some folks hit this issue, probably because the workaround didn't apply the patch properly.

Give us a few more days and we'll try to verify it with our CI job.

Comment 13 Alex McLeod 2020-06-16 12:32:15 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.

Comment 16 errata-xmlrpc 2020-07-29 07:51:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148

