Description of problem:
Undercloud installation fails in step "Start containers for step 1" with "cannot chdir: Permission denied".

Version-Release number of selected component (if applicable):

How reproducible:
Always

Actual results:
Fails with "cannot chdir: Permission denied"

Expected results:
Undercloud install should work

Additional info:
# openstack undercloud install
fails with the error below. The last log output is:

TASK [Debug output for task: Start containers for step 1] *************************************************************************************************
2020-01-08 16:50:47.678 57001 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] Wednesday 08 January 2020 16:50:47 +0100 (0:00:02.275) 0:21:21.291 *****
2020-01-08 16:50:47.726 57001 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] fatal: [hostname]: FAILED! => {
2020-01-08 16:50:47.727 57001 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "failed_when_result": true,
2020-01-08 16:50:47.727 57001 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "outputs.stdout_lines | default([]) | union(outputs.stderr_lines | default([]))": [
2020-01-08 16:50:47.727 57001 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "cannot chdir: Permission denied",

In the messages file, this corresponds to the execution of the following line:

Jan 8 16:50:46 hostname python3[71223]: ansible-paunch Invoked with config=/var/lib/tripleo-config/hashed-container-startup-config-step_1.json config_id=['tripleo_step1'] action=apply container_cli=podman container_log_stdout_path=/var/log/containers/stdouts healthcheck_disabled=False managed_by=tripleo-Undercloud debug=False log_file=/var/log/paunch.log

The paunch.log shows errors:

2020-01-08 16:50:46.834 71223 WARNING paunch [ ] Did not find container with "['podman', 'ps', '-a', '--filter', 'label=container_name=memcached', '--filter', 'label=config_id=tripleo_step1', '--format', '{{.Names}}']" - retrying without config_id
2020-01-08 16:50:46.872 71223 WARNING paunch [ ] Did not find container with "['podman', 'ps', '-a', '--filter', 'label=container_name=memcached', '--format', '{{.Names}}']"
2020-01-08 16:50:46.910 71223 ERROR paunch [ ] Error running ['podman', 'create', '--name', 'memcached', '--label', 'config_id=tripleo_step1', '--label', 'container_name=memcached', '--label', 'managed_by=tripleo-Undercloud', '--label', 'config_data={"command": ["/bin/bash", "-c", "source /etc/sysconfig/memcached; /usr/bin/memcached -p ${PORT} -u ${USER} -m ${CACHESIZE} -c ${MAXCONN} $OPTIONS"], "healthcheck": {"test": "/openstack/healthcheck"}, "image": "hostname.ctlplane.localdomain:8787/rhosp-beta/openstack-memcached:16.0-62", "net": "host", "privileged": false, "restart": "always", "start_order": 0, "volumes": ["/etc/hosts:/etc/hosts:ro", "/etc/localtime:/etc/localtime:ro", "/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro", "/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro", "/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro", "/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro", "/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro", "/dev/log:/dev/log", "/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro", "/etc/puppet:/etc/puppet:ro", "/var/lib/config-data/memcached/etc/sysconfig/memcached:/etc/sysconfig/memcached:ro"]}', '--conmon-pidfile=/var/run/memcached.pid', '--detach=true', '--log-driver', 'k8s-file', '--log-opt', 'path=/var/log/containers/stdouts/memcached.log', '--net=host', '--privileged=false', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/config-data/memcached/etc/sysconfig/memcached:/etc/sysconfig/memcached:ro', '--cpuset-cpus=0,1,2,3', 'hostname.ctlplane.localdomain:8787/rhosp-beta/openstack-memcached:16.0-62', '/bin/bash', '-c', 'source /etc/sysconfig/memcached; /usr/bin/memcached -p ${PORT} -u ${USER} -m ${CACHESIZE} -c ${MAXCONN} $OPTIONS']. [1]
2020-01-08 16:50:46.910 71223 ERROR paunch [ ] stdout:
2020-01-08 16:50:46.911 71223 ERROR paunch [ ] stderr: cannot chdir: Permission denied

Things tried so far:
- Tried setting SELinux to Permissive
- We have tried to set 'become: true' in the playbook which was failing
- We have cleaned up older containers and images
Can you have the customer run 'sudo podman system migrate' and then rerun the install? We're not currently able to reproduce this issue. Can the customer provide any additional details about the initial configuration of the undercloud? Was any specific system hardening or other configuration applied before attempting the install? Can the customer also check the file permissions in /var/lib/containers/ and /var/lib/containers/storage/? Specifically, they can run "sudo find /var/lib/containers/storage/{overlay-layers,overlay-images,overlay-containers,mounts,libpod,tmp} -ls"
Also, see https://bugzilla.redhat.com/show_bug.cgi?id=1768355 which sounds similar to this error.
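For anyone collecting the permission details requested above without the find command at hand, here is a minimal Python sketch that records mode and ownership for the same storage subdirectories. The function name list_permissions and the constants are hypothetical, chosen only for illustration. It reports what is on disk and does not diagnose the chdir failure by itself.

```python
import os
import stat

# Paths taken from the "find ... -ls" request in the comment above.
STORAGE_ROOT = "/var/lib/containers/storage"
STORAGE_SUBDIRS = (
    "overlay-layers",
    "overlay-images",
    "overlay-containers",
    "mounts",
    "libpod",
    "tmp",
)


def list_permissions(root):
    """Return (path, mode string, uid, gid) for root and everything beneath it.

    Hypothetical helper: entries that cannot be inspected are kept in the
    result with an "error: ..." string in place of the mode.
    """
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    entries = []
    for path in paths:
        try:
            info = os.lstat(path)  # lstat: report symlinks themselves
        except OSError as exc:
            entries.append((path, "error: " + (exc.strerror or str(exc)), None, None))
            continue
        entries.append((path, stat.filemode(info.st_mode), info.st_uid, info.st_gid))
    return entries


def report_storage(root=STORAGE_ROOT):
    """Print one line per entry for each storage subdirectory."""
    for name in STORAGE_SUBDIRS:
        for path, mode, uid, gid in list_permissions(os.path.join(root, name)):
            print(mode, uid, gid, path)
```

Calling report_storage() as root prints one line per file or directory. Attaching that output to the bug would serve the same purpose as the find listing.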
I haven't been able to reproduce the issue with a clean install. However, I was given access to a box showing the issue and we attempted an upgrade to podman 1.6, which is what OSP16 GA will use. This appears to have cleared up the issue, but I'm continuing to try to reproduce it. Unfortunately, that means the RC might not work for some folks at this time. I'm still trying to track down what specifically is causing the issue, as I've seen it work on a retry or if you manually pull the containers.
Hi, another customer is also facing a similar issue on RHOSP-15.
Since this is affecting both OSP15 and OSP16 RC, it's likely an issue with podman 1.4, which we currently ship. I believe 1.6 will be coming out in the next few weeks, which may address this issue. I'll raise a bz against podman to see if we can get further details. There doesn't appear to be anything that OSP can do about this at the moment.
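To check which side of that 1.4 / 1.6 split a host is on, a small Python sketch like the one below can parse the version string printed by 'podman --version'. The helper names are hypothetical, and the 1.6 threshold reflects only the suspicion voiced in this thread, not a confirmed fix boundary.

```python
import re


def parse_podman_version(output):
    """Extract (major, minor, patch) from text such as 'podman --version' output.

    Hypothetical helper; a missing patch component is treated as 0.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError("no version number found in: %r" % output)
    return tuple(int(part) for part in match.groups(default="0"))


def predates_1_6(version):
    """True if the version is older than 1.6, the release suspected to help."""
    return version[:2] < (1, 6)
```

For example, feeding it an illustrative string such as "podman version 1.4.2" yields (1, 4, 2), which predates_1_6 flags as older than 1.6.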
Closing this out as we have shipped out 1.6 and I haven't seen any issues like this again. If this is still an issue, please reopen this bug and provide updated logs.