Description of problem:
While trying to deploy RHOS 15 with the topology listed below, the overcloud deployment fails while trying to pull the openstack-neutron-metadata-agent-ovn image.

Topology:
1 controller, 2 compute, 3 ceph, 1 FreeIPA (TLS everywhere)

How reproducible:
Always

Steps to Reproduce:
1. Deploy RHOS 15 with the topology above
2. Overcloud deployment fails

Actual results:
"Version:",
" Config: 1563921940",
" Puppet: 5.5.10",
"stderr: Trying to pull 192.168.24.1:8787/rhosp15/openstack-neutron-metadata-agent-ovn:20190722.1...Getting image source signatures",
"Copying blob sha256:6b8395cc0f9d018f35b66c0205c33a57a81f05ea332c46583532e00145c5c4de",
"Copying blob sha256:16ee413e2b8ddf33036e5c883dc9a93b585b84dd29203951e923d6d89584ffa4",
"Copying config sha256:7df078ba924554aaab0746529efd6499c95277b8015f5da40973309be4e283bf",
"+ STEP=4",
"+ TAGS=file",
"+ CONFIG='include ::tripleo::profile::base::neutron::ovn_metadata_agent_wrappers'",
"+ EXTRA_ARGS=",
"+ '[' -d /tmp/puppet-etc ']'",
"+ cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet",
"+ echo '{\"step\": 4}'",
"+ export FACTER_deployment_type=containers",
"+ FACTER_deployment_type=containers",
"+ set +e",
"+ puppet apply --verbose --detailed-exitcodes --summarize --color=false --modulepath /etc/puppet/modules:/opt/stack/puppet-modules:/usr/share/openstack-puppet/modules --tags file -e 'noop_resource('\\''package'\\''); include ::tripleo::profile::base::neutron::ovn_metadata_agent_wrappers'",
"Warning: ModuleLoader: module 'tripleo' has unresolved dependencies - it will only see those that are resolved. Use 'puppet module list --tree' to see information about modules\\n (file & line not available)",
"Warning: /etc/puppet/hiera.yaml: Use of 'hiera.yaml' version 3 is deprecated. It should be converted to version 5",
" (file: /etc/puppet/hiera.yaml)",
"Warning: Undefined variable '::deploy_config_name'; \\n (file & line not available)",
"Warning: The function 'hiera' is deprecated in favor of using 'lookup'. See https://puppet.com/docs/puppet/5.5/deprecated_language.html\\n (file & line not available)",
"+ rc=2",
"+ set -e",
"+ set +ux"
]
}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
ceph-0 : ok=136 changed=57 unreachable=0 failed=0 skipped=521 rescued=0 ignored=1
ceph-1 : ok=136 changed=57 unreachable=0 failed=0 skipped=521 rescued=0 ignored=1
ceph-2 : ok=136 changed=57 unreachable=0 failed=0 skipped=521 rescued=0 ignored=1
compute-0 : ok=168 changed=82 unreachable=0 failed=0 skipped=489 rescued=0 ignored=1
compute-1 : ok=168 changed=82 unreachable=0 failed=0 skipped=489 rescued=0 ignored=1
controller-0 : ok=222 changed=140 unreachable=0 failed=1 skipped=434 rescued=0 ignored=1
undercloud : ok=43 changed=24 unreachable=0 failed=0 skipped=60 rescued=0 ignored=0

Tuesday 23 July 2019 22:45:41 +0000 (0:00:00.408) 0:23:54.275 **********
===============================================================================
Overcloud configuration failed.
Ansible failed, check log at /var/lib/mistral/overcloud/ansible.log.
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 33170), raddr=('192.168.24.2', 13808)>
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 43336)>
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 45778)>
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 50032), raddr=('192.168.24.2', 13989)>

Expected results:
Overcloud deployed successfully

Additional info:
Hello,

You're not pointing at the right issue. Apparently, the real issue is related to another image:

"$ podman pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest",
"b'Trying to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest...Failed\\nerror pulling image \"192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest\": unable to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: unable to pull image: Error determining manifest MIME type for docker://192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: Error reading manifest pcmklatest in 192.168.24.1:8787/rhosp15/openstack-redis: error parsing HTTP 404 response body: invalid character \\'<\\' looking for beginning of value: \"<!DOCTYPE HTML PUBLIC \\\\\"-//IETF//DTD HTML 2.0//EN\\\\\">\\\\n<html><head>\\\\n<title>404 Not Found</title>\\\\n</head><body>\\\\n<h1>Not Found</h1>\\\\n<p>The requested URL /v2/rhosp15/openstack-redis/manifests/pcmklatest was not found on this server.</p>\\\\n</body></html>\\\\n\"\\n'",
"Error pulling 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest. [125]",
"stdout: ",
"stderr: Trying to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest...Failed",
"error pulling image \"192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest\": unable to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: unable to pull image: Error determining manifest MIME type for docker://192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: Error reading manifest pcmklatest in 192.168.24.1:8787/rhosp15/openstack-redis: error parsing HTTP 404 response body: invalid character '<' looking for beginning of value: \"<!DOCTYPE HTML PUBLIC \\\"-//IETF//DTD HTML 2.0//EN\\\">\\n<html><head>\\n<title>404 Not Found</title>\\n</head><body>\\n<h1>Not Found</h1>\\n<p>The requested URL /v2/rhosp15/openstack-redis/manifests/pcmklatest was not found on this server.</p>\\n</body></html>\\n\""
]
}

This is for PIDONE then :). Will also update the title accordingly.

As for the last error related to the unclosed ssl.SSLSocket, it's non-fatal and, iirc, a patch has been provided lately (maybe by bandini, not sure).

Cheers,

C.
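To find the pull that actually failed in a long Ansible log, it can help to scan for the "Error pulling <image>. [<rc>]" summary line rather than the first "Trying to pull" message. The following is a minimal sketch, assuming that summary line keeps the format seen in this report; `PULL_ERROR` and `failed_pulls` are hypothetical names and not part of TripleO or Ansible.

```python
import re

# Assumed format, taken from the log in this report:
# "Error pulling 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest. [125]"
PULL_ERROR = re.compile(r"Error pulling (?P<image>\S+)\. \[(?P<rc>\d+)\]")


def failed_pulls(log_text):
    """Return (image, exit_code) pairs for failed pulls, in order, without repeats."""
    seen = []
    for match in PULL_ERROR.finditer(log_text):
        entry = (match.group("image"), int(match.group("rc")))
        if entry not in seen:
            seen.append(entry)
    return seen


# Two lines from the report: a pull that was merely in progress, and the real failure.
sample = (
    '"stderr: Trying to pull 192.168.24.1:8787/rhosp15/'
    'openstack-neutron-metadata-agent-ovn:20190722.1...Getting image source signatures", '
    '"Error pulling 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest. [125]", '
)
print(failed_pulls(sample))
```

On the sample, only the openstack-redis:pcmklatest image is reported, which matches the diagnosis above. On a real deployment the same function could be fed the contents of /var/lib/mistral/overcloud/ansible.log.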
Note: the patch correcting the unclosed resources is: https://review.opendev.org/#/q/Ib93a0edb2e789855aa9e5908130a03ffcd9439c2 - Alex did it in master, and Michele backported it to Stein. It should hopefully be present in OSP 15 shortly.
(In reply to Cédric Jeanneret from comment #4)
> Apparently, the real issue is related to another image:
> [...]
> This is for PIDONE then :). Will also update title accordingly.

I'm not sure what the problem is yet. Generally, when one sees podman/docker trying to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest, it means that the image tagging was not executed properly before the docker-puppet step for config generation started. I'll dig into the logs to find out more and try to get my hands on an env.
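The condition described above (the pcmklatest tag must exist locally before anything tries to use it) can be checked by hand on the affected node. This is a minimal diagnostic sketch, not a fix and not how TripleO does it; `ensure_tag` and its `run` parameter are hypothetical, and it only relies on the `podman image exists` and `podman tag` commands that appear in the deployment logs.

```python
import subprocess


def ensure_tag(source, target, run=subprocess.run):
    """Tag `source` as `target` in local storage if `target` is missing.

    Returns True when a tag was created, False when `target` already existed.
    `run` is injectable so the logic can be exercised without podman installed.
    """
    # `podman image exists` exits 0 when the image is in local storage, 1 otherwise.
    if run(["podman", "image", "exists", target]).returncode == 0:
        return False
    # Same form as the command the tripleo-container-tag task runs.
    run(["podman", "tag", source, target], check=True)
    return True
```

On a controller it could be called as `ensure_tag("192.168.24.1:8787/rhosp15/openstack-redis:20190722.1", "192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest")`. A return value of False means the tag was already present, in which case the failed pull has to come from somewhere else.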
Looking at the logs, I can see that the tag seems to be created just fine when the image is downloaded originally:

TASK [tripleo-container-tag : Tag 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest to latest 192.168.24.1:8787/rhosp15/openstack-redis:20190722.1 image] ***
Tuesday 23 July 2019 22:30:29 +0000 (0:00:04.106) 0:08:42.306 **********
skipping: [ceph-0] => {"changed": false, "skip_reason": "Conditional result was False"}
skipping: [ceph-1] => {"changed": false, "skip_reason": "Conditional result was False"}
skipping: [ceph-2] => {"changed": false, "skip_reason": "Conditional result was False"}
skipping: [compute-0] => {"changed": false, "skip_reason": "Conditional result was False"}
changed: [controller-0] => {"changed": true, "cmd": "podman tag 192.168.24.1:8787/rhosp15/openstack-redis:20190722.1 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest", "delta": "0:00:00.062883", "end": "2019-07-23 22:30:30.224687", "rc": 0, "start": "2019-07-23 22:30:30.161804", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
skipping: [compute-1] => {"changed": false, "skip_reason": "Conditional result was False"}

But then later, something is trying to use it at step 2 and fails:

TASK [Debug output for task: Start containers for step 2] **********************
Tuesday 23 July 2019 22:45:41 +0000 (0:00:25.159) 0:23:53.867 **********
fatal: [controller-0]: FAILED! => {
"failed_when_result": true,
[...]
"$ podman image exists 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest",
"$ podman pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest",
"b'Trying to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest...Failed\\nerror pulling image \"192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest\": unable to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: unable to pull image: Error determining manifest MIME type for docker://192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: Error reading manifest pcmklatest in 192.168.24.1:8787/rhosp15/openstack-redis: error parsing HTTP 404 response body: invalid character \\'<\\' looking for beginning of value: \"<!DOCTYPE HTML PUBLIC \\\\\"-//IETF//DTD HTML 2.0//EN\\\\\">\\\\n<html><head>\\\\n<title>404 Not Found</title>\\\\n</head><body>\\\\n<h1>Not Found</h1>\\\\n<p>The requested URL /v2/rhosp15/openstack-redis/manifests/pcmklatest was not found on this server.</p>\\\\n</body></html>\\\\n\"\\n'",
"Error pulling 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest. [125]",
"stdout: ",
"stderr: Trying to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest...Failed",
"error pulling image \"192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest\": unable to pull 192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: unable to pull image: Error determining manifest MIME type for docker://192.168.24.1:8787/rhosp15/openstack-redis:pcmklatest: Error reading manifest pcmklatest in 192.168.24.1:8787/rhosp15/openstack-redis: error parsing HTTP 404 response body: invalid character '<' looking for beginning of value: \"<!DOCTYPE HTML PUBLIC \\\"-//IETF//DTD HTML 2.0//EN\\\">\\n<html><head>\\n<title>404 Not Found</title>\\n</head><body>\\n<h1>Not Found</h1>\\n<p>The requested URL /v2/rhosp15/openstack-redis/manifests/pcmklatest was not found on this server.</p>\\n</body></html>\\n\""
[...]

Interestingly, only pacemaker should start containers tagged with pcmklatest.
So I think the service template is wrong in Stein [1]:

  - redis_tls_proxy:
      start_order: 3
>>>>  image: *redis_image_pcmklatest
      net: host
      user: root

It should read {get_param: DockerRedisImage}, as it did back in Rocky. I'm going to deploy an env locally to verify that statement.

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/stein/deployment/database/redis-pacemaker-puppet.yaml#L269
Fix merged in master [1], started the Stein backport [2].

[1] https://review.opendev.org/675089/
[2] https://review.opendev.org/675394
*** Bug 1740939 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2811