Description of problem:

When upgrading the controller nodes from OSP13 to OSP16.0, we first need to run an external-upgrade step that switches all Ceph systemd units from running under Docker to running under Podman. The command used to achieve this is the following:

openstack overcloud external-upgrade run \
    --stack qe-Cloud-0 \
    --tags ceph_systemd \
    -e ceph_ansible_limit=controller-0 2>&1

However, because we come from an OSP13 installation, /var/lib/mistral has mode 0750, so it grants no execute (traverse) permission to other users:

(qe-Cloud-0) [stack@undercloud-0 ~]$ sudo ls -larth /var/lib/mistral/
total 44K
drwx------. 2 42430 42430 31 Apr 13 16:51 .ssh
-r--r--r--. 1 42430 42430 1001 Apr 13 16:51 undercloud.conf
drwxr-xr-x. 3 42430 42430 78 Apr 13 17:59 .novaclient
drwxr-xr-x. 77 root root 4.0K Apr 13 18:00 ..
drwxr-xr-x. 12 42430 42430 4.0K Apr 14 10:48 4405e3f5-0ae1-40b3-95c8-a496cde410fb
drwxr-xr-x. 2 42430 42430 4.0K Apr 14 11:01 ansible_fact_cache
drwxr-xr-x. 13 42430 42430 4.0K Apr 14 11:02 6321adf8-5417-4113-bdd4-03206a1d987e
drwxr-xr-x. 12 42430 42430 4.0K Apr 14 12:43 239d393a-fc93-44f5-8000-a5d701eced8f
drwxr-xr-x. 12 42430 42430 4.0K Apr 14 14:03 4003182b-47b6-45ad-8c83-83e675071fb7
drwxr-xr-x. 13 42430 42430 4.0K Apr 14 14:10 33704f84-99ee-442c-b500-e82bca250c42
drwxr-xr-x. 12 42430 42430 4.0K Apr 15 10:29 eae3da95-e0a1-427c-be87-70887b264c78
lrwxrwxrwx. 1 42430 42430 53 Apr 15 10:34 config-download-latest -> /var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144
drwxr-x---. 12 42430 42430 4.0K Apr 15 10:34 .
drwxr-xr-x. 13 42430 42430 4.0K Apr 15 10:35 1f0536f5-fd4b-45d8-b970-6fa69ba22144

This causes the external-upgrade (which connects to the Undercloud as the tripleo-admin user) to fail when accessing any file inside /var/lib/mistral:

2020-04-15 10:35:58 | TASK [tripleo-ceph-common : set calling_ansible_environment_variables] *********
2020-04-15 10:35:58 | Wednesday 15 April 2020 10:35:47 -0400 (0:00:00.863) 0:01:04.770 *******
2020-04-15 10:35:58 | skipping: [undercloud] => {"changed": false, "skip_reason": "Conditional result was False"}
2020-04-15 10:35:58 |
2020-04-15 10:35:58 | TASK [create ceph-ansible working direcotry] ***********************************
2020-04-15 10:35:58 | Wednesday 15 April 2020 10:35:48 -0400 (0:00:00.946) 0:01:05.717 *******
2020-04-15 10:35:58 |
2020-04-15 10:35:58 | TASK [tripleo-ceph-work-dir : create ceph-ansible temp dirs] *******************
2020-04-15 10:35:58 | Wednesday 15 April 2020 10:35:51 -0400 (0:00:03.146) 0:01:08.864 *******
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/group_vars) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/group_vars", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/group_vars", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/host_vars) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/host_vars", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/host_vars", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 | changed: [undercloud] => (item=/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/fetch_dir) => {"ansible_loop_var": "item", "changed": true, "gid": 0, "group": "root", "item": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/fetch_dir", "mode": "0755", "owner": "tripleo-admin", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/fetch_dir", "secontext": "unconfined_u:object_r:container_file_t:s0", "size": 6, "state": "directory", "uid": 1003}
2020-04-15 10:35:58 |
2020-04-15 10:35:58 | TASK [tripleo-ceph-work-dir : symbolic link to tripleo inventory from ceph-ansible work directory] ***
2020-04-15 10:35:58 | Wednesday 15 April 2020 10:35:55 -0400 (0:00:04.092) 0:01:12.957 *******
2020-04-15 10:35:58 | fatal: [undercloud]: FAILED! => {"changed": false, "msg": "Error while linking: [Errno 13] Permission denied: b'/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/inventory.yaml' -> b'/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/inventory.yml'", "path": "/var/lib/mistral/1f0536f5-fd4b-45d8-b970-6fa69ba22144/ceph-ansible/inventory.yml"}

Creating the ceph-ansible directory succeeds, though, because become: true is used in that task:

[root@undercloud-0 stack]# ls -larth /var/lib/mistral/config-download-latest/
total 1.1M
drwxr-xr-x. 7 42430 42430 128 Apr 15 10:34 .git
-rw-r--r--. 1 42430 42430 9 Apr 15 10:34 .gitignore
drwxr-xr-x. 2 42430 42430 4.0K Apr 15 10:34 BlockStorage
drwxr-xr-x. 5 42430 42430 4.0K Apr 15 10:34 CephStorage
drwxr-xr-x. 4 42430 42430 4.0K Apr 15 10:34 Compute
drwxr-xr-x. 5 42430 42430 4.0K Apr 15 10:34 Controller
drwxr-xr-x. 2 42430 42430 4.0K Apr 15 10:34 ObjectStorage
-rw-r--r--. 1 42430 42430 3.8K Apr 15 10:34 all_nodes_validation_script.sh
-rw-r--r--. 1 42430 42430 1.9K Apr 15 10:34 common_deploy_steps_playbooks.yaml
-rw-r--r--. 1 42430 42430 8.7K Apr 15 10:34 common_deploy_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 14K Apr 15 10:34 common_deploy_steps_tasks_step_1.yaml
-rw-r--r--. 1 42430 42430 6.8K Apr 15 10:34 container_puppet_script.yaml
-rw-r--r--. 1 42430 42430 573 Apr 15 10:34 container_startup_configs_tasks.yaml
-rw-r--r--. 1 42430 42430 805 Apr 15 10:34 deploy-artifacts.sh
-rw-r--r--. 1 42430 42430 37K Apr 15 10:34 deploy_steps_playbook.yaml
-rw-r--r--. 1 42430 42430 2.3K Apr 15 10:34 deploy_steps_tasks_step_0.yaml
-rw-r--r--. 1 42430 42430 8.5K Apr 15 10:34 deployments.yaml
-rw-r--r--. 1 42430 42430 21K Apr 15 10:34 docker_puppet_script.yaml
-rw-r--r--. 1 42430 42430 40K Apr 15 10:34 external_deploy_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 1.1K Apr 15 10:34 external_post_deploy_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 5.5K Apr 15 10:34 external_update_steps_playbook.yaml
-rw-r--r--. 1 42430 42430 217 Apr 15 10:34 external_update_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 7.6K Apr 15 10:34 external_upgrade_steps_playbook.yaml
-rw-r--r--. 1 42430 42430 8.9K Apr 15 10:34 external_upgrade_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 796 Apr 15 10:34 fast_forward_upgrade_bootstrap_role_tasks.yaml
-rw-r--r--. 1 42430 42430 130 Apr 15 10:34 fast_forward_upgrade_bootstrap_tasks.yaml
-rw-r--r--. 1 42430 42430 515 Apr 15 10:34 fast_forward_upgrade_playbook.yaml
-rw-r--r--. 1 42430 42430 922 Apr 15 10:34 fast_forward_upgrade_post_role_tasks.yaml
-rw-r--r--. 1 42430 42430 621 Apr 15 10:34 fast_forward_upgrade_prep_role_tasks.yaml
-rw-r--r--. 1 42430 42430 3.6K Apr 15 10:34 fast_forward_upgrade_prep_tasks.yaml
-rw-r--r--. 1 42430 42430 113 Apr 15 10:34 fast_forward_upgrade_release_tasks.yaml
-rw-r--r--. 1 42430 42430 4.6K Apr 15 10:34 generate-config-tasks.yaml
-rw-r--r--. 1 42430 42430 13K Apr 15 10:34 global_vars.yaml
-rw-r--r--. 1 42430 42430 679 Apr 15 10:34 hiera_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 3.7K Apr 15 10:34 host-container-puppet-tasks.yaml
drwxr-xr-x. 2 42430 42430 142 Apr 15 10:34 host_vars
-rw-r--r--. 1 42430 42430 575 Apr 15 10:34 post_update_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 611 Apr 15 10:34 post_upgrade_steps_playbook.yaml
-rw-r--r--. 1 42430 42430 581 Apr 15 10:34 post_upgrade_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 2.2K Apr 15 10:34 pre_upgrade_rolling_steps_playbook.yaml
-rw-r--r--. 1 42430 42430 616 Apr 15 10:34 pre_upgrade_rolling_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 703K Apr 15 10:34 qe-Cloud-0-config.tar.gz
-rw-r--r--. 1 42430 42430 2.1K Apr 15 10:34 scale_playbook.yaml
-rw-r--r--. 1 42430 42430 2.2K Apr 15 10:34 scale_steps_tasks.yaml
drwxr-xr-x. 2 42430 42430 28 Apr 15 10:34 templates
-rw-r--r--. 1 42430 42430 6.8K Apr 15 10:34 update_steps_playbook.yaml
-rw-r--r--. 1 42430 42430 551 Apr 15 10:34 update_steps_tasks.yaml
-rw-r--r--. 1 42430 42430 7.3K Apr 15 10:34 upgrade_steps_playbook.yaml
drwxr-x---. 12 42430 42430 4.0K Apr 15 10:34 ..
-rw-r--r--. 1 42430 42430 13K Apr 15 10:34 inventory.yaml
-rw-------. 1 42430 42430 1.7K Apr 15 10:34 ssh_private_key
-rw-r--r--. 1 42430 42430 2.1K Apr 15 10:34 ansible.cfg
-rwxr-x---. 1 42430 42430 758 Apr 15 10:34 ansible-playbook-command.sh
drwxr-xr-x. 2 42430 42430 80 Apr 15 10:35 group_vars
drwxr-xr-x. 5 tripleo-admin root 58 Apr 15 10:35 ceph-ansible
drwxr-xr-x. 13 42430 42430 4.0K Apr 15 10:35 .
drwx------. 2 42430 42430 6 Apr 15 11:05 ansible-ssh
[root@undercloud-0 stack]# su - tripleo-admin
[tripleo-admin@undercloud-0 ~]$ ls -larth /var/lib/mistral/config-download-latest/ceph-ansible
ls: cannot access '/var/lib/mistral/config-download-latest/ceph-ansible': Permission denied

This issue does not happen in OSP16.0/16.1 deployments, because /var/lib/mistral is deployed with 0755 permissions there, so tripleo-admin has access to anything within /var/lib/mistral.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
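The permission failure described above can be checked programmatically. The following is a minimal sketch, not part of TripleO: the helper names other_can_traverse and blocking_ancestors are hypothetical. It assumes the connecting user is neither the owner nor in the owning group of the directories (as appears to be the case for tripleo-admin here), so only the "other" permission class matters, and it ignores ACLs and SELinux.

```python
import os
import stat
from pathlib import Path


def other_can_traverse(mode: int) -> bool:
    """Return True if the 'other' class has the execute (search) bit set."""
    return bool(mode & stat.S_IXOTH)


def blocking_ancestors(path: str, stop_at: str = "/") -> list:
    """List ancestor directories of `path`, from `stop_at` downwards, that
    lack the world-execute bit and therefore stop an unrelated user from
    reaching `path`.

    Assumption: the user falls into the 'other' permission class for every
    directory checked. ACLs and SELinux are not considered.
    """
    target = Path(path).resolve()
    root = Path(stop_at).resolve()
    blocked = []
    # Path.parents runs from the nearest parent up to "/", so reverse it
    # to walk from the top of the tree down towards the target.
    for ancestor in list(target.parents)[::-1]:
        # Skip directories that are above the chosen starting point.
        if ancestor != root and root not in ancestor.parents:
            continue
        if not other_can_traverse(ancestor.stat().st_mode):
            blocked.append(str(ancestor))
    return blocked


if __name__ == "__main__":
    # Path taken from the report; run as a user allowed to stat it (e.g. root).
    target = "/var/lib/mistral/config-download-latest/ceph-ansible"
    if os.path.exists("/var/lib/mistral"):
        for directory in blocking_ancestors(target):
            print("not traversable by other users:", directory)
```

On an undercloud upgraded from OSP13, a check along these lines would be expected to report /var/lib/mistral itself, since it is the only directory in the listings above with mode 0750.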
We are doing exactly that; the problem here is Mistral. We have the code in place to do what you said, but it only takes effect if we disable Mistral by running the command with the --no-workflow option:
https://github.com/openstack/python-tripleoclient/blob/stable/train/tripleoclient/v1/overcloud_external_upgrade.py#L107-L110

The default behavior is to invoke the tripleo.package_update.v1.update_nodes Mistral action (https://github.com/openstack/python-tripleoclient/blob/stable/train/tripleoclient/utils.py#L1225 -> https://github.com/openstack/python-tripleoclient/blob/stable/train/tripleoclient/workflows/package_update.py#L117), which invokes the config-download from the mistral_executor container:
https://github.com/openstack/tripleo-common/blob/stable/train/workbooks/package_update.yaml#L147-L153

Also, in the current scenario, tripleo-admin would not be able to config-download anything into /var/lib/mistral, because it does not belong to the mistral group. In addition, since most of the deployment code also works through Mistral, /var/lib/mistral itself appears to be owned by the mistral user from the container:

[root@undercloud-0 stack]# ls -larth /var/lib/mistral
total 60K
drwx------. 2 42430 42430 31 Apr 16 09:30 .ssh
-r--r--r--. 1 42430 42430 1007 Apr 16 09:30 undercloud.conf
drwxr-xr-x. 3 42430 42430 78 Apr 16 10:00 .novaclient
drwxr-xr-x. 75 root root 4.0K Apr 16 10:01 ..
drwxr-xr-x. 2 42430 42430 4.0K Apr 16 12:12 ansible_fact_cache
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 12:19 8856c978-1b25-42ec-b90f-2322be8cc6d1
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 12:25 9b9c15cb-2f4e-45bb-9c7d-82810c61fbee
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 12:26 fd29e4ba-f73f-45bf-99f9-83e73c117da8
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 12:26 c3fb5c06-bb97-4ac3-be07-3ea842ddf4f5
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 12:54 b9cb84ff-f599-44e2-a885-5760122fec3b
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 13:30 649c3285-9264-4fa5-ad44-eb39c6d3a1e2
drwxr-xr-x. 12 42430 42430 4.0K Apr 16 17:41 637093ab-8481-4a66-b802-54994a07e865
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 17:42 bba3c24c-1fd3-4656-8c17-70ce1918702e
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 17:43 8e15a3b5-c897-4bd6-bceb-2653ba5223a2
lrwxrwxrwx. 1 42430 42430 53 Apr 16 17:43 config-download-latest -> /var/lib/mistral/399e6c31-ad4f-498e-aa98-aacc7c6ffad8
drwxr-xr-x. 13 42430 42430 4.0K Apr 16 17:43 399e6c31-ad4f-498e-aa98-aacc7c6ffad8
-rw-------. 1 42430 42430 198 Apr 20 10:10 .bash_history
drwxr-x---. 15 42430 42430 4.0K Apr 20 10:10 .

So there is no chance for tripleo-admin to create anything inside it with the sudo powers it has been given.
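To make the ownership argument concrete, here is a small sketch of how the classic Unix owner/group/other check selects the permission bits that apply to a user. The function effective_perms is hypothetical and written only for illustration. The uid 1003 for tripleo-admin comes from the Ansible output in the description; the group set {1003} is an assumption, since the thread does not show tripleo-admin's groups. Root, and therefore tasks run with become: true, bypasses these checks and is not modeled.

```python
import stat


def effective_perms(mode: int, owner_uid: int, owner_gid: int,
                    uid: int, gids: set) -> str:
    """Return the rwx string of the permission class that applies to `uid`.

    Classic Unix semantics: exactly one class (owner, group, or other) is
    selected, and only that class's bits are consulted. ACLs, SELinux and
    the root bypass are deliberately left out of this sketch.
    """
    if uid == owner_uid:
        bits = (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR)
    elif owner_gid in gids:
        bits = (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP)
    else:
        bits = (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH)
    return "".join(
        letter if mode & bit else "-"
        for letter, bit in zip("rwx", bits)
    )


if __name__ == "__main__":
    # /var/lib/mistral as listed above: mode 0750, owned by 42430:42430.
    # tripleo-admin: uid 1003 (from the log); groups {1003} are assumed.
    print(effective_perms(0o750, 42430, 42430, 1003, {1003}))
    # The same directory with the 0755 mode seen on OSP16.0/16.1.
    print(effective_perms(0o755, 42430, 42430, 1003, {1003}))
```

Under these assumptions, tripleo-admin lands in the "other" class and gets no access at 0750. Even as a member of group 42430 it would only get r-x, which is still not enough to create files in the directory.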
Hey Giulio,

Yes, it was expected to be verified in the hackfest. However, because the puddle we used was pretty old (the passed_phase2 one that included this patch was causing trouble), the workflow was executed with workarounds. In fact, some folks did hit this issue, probably because the workaround didn't apply the patch properly. Give us a few more days and we'll try to verify it with our CI job.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3148