Bug 1809998

Summary: tripleo_memcached_healthcheck.service: Failed with result 'exit-code'
Product: Red Hat OpenStack
Reporter: Filip Hubík <fhubik>
Component: openstack-tripleo-common
Assignee: Adriano Petrich <apetrich>
Status: CLOSED CURRENTRELEASE
QA Contact: Alexander Chuzhoy <sasha>
Severity: high
Docs Contact:
Priority: high
Version: 16.0 (Train)
CC: mburns, michele, ramishra, slinaber, wznoinsk
Target Milestone: beta
Keywords: Reopened, Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-common-12.1.1-0.20200305093509.7e5b011.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-11 18:18:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description             Flags
install-undercloud.log  none
paunch.log              none

Description Filip Hubík 2020-03-04 11:22:14 UTC
Created attachment 1667458
install-undercloud.log

Description of problem:
The undercloud (UC) deployment fails to start some containers. The log shows:

...
TASK [Wait for containers to start for step 2 using paunch] ********************
Wednesday 04 March 2020  01:42:15 -0500 (0:00:00.366)       0:08:22.837 *******
FAILED - RETRYING: Wait for containers to start for step 2 using paunch (1200 retries left).
FAILED - RETRYING: Wait for containers to start for step 2 using paunch (1199 retries left).
FAILED - RETRYING: Wait for containers to start for step 2 using paunch (1198 retries left).
...
fatal: [undercloud-0]: FAILED! => {"ansible_job_id": "482077024820.33737", "attempts": 26, "changed": false, "finished": 1, "msg": "Paunch failed with config_id tripleo_step2", "rc": 1, "stderr": "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=glance_init_logs', '--filter', 'label=config_id=tripleo_step2', '--format', '{{.Names}}']\" - retrying without config_id\nDid not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=glance_init_logs', '--format', '{{.Names}}']\"\nDid not find container with...
...
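
The stderr above suggests that the lookup runs twice per container: first filtered by both container_name and config_id, then by container_name alone. A minimal Python sketch of that fallback, useful for repeating the check by hand, could look like the following. The function names (podman_ps_cmd, find_container) are hypothetical and not taken from paunch; only the podman arguments are copied from the log.

```python
import subprocess


def podman_ps_cmd(container_name, config_id=None):
    """Build the 'podman ps' argument list seen in the stderr above."""
    cmd = ['podman', 'ps', '-a',
           '--filter', 'label=container_name=' + container_name]
    if config_id is not None:
        cmd += ['--filter', 'label=config_id=' + config_id]
    cmd += ['--format', '{{.Names}}']
    return cmd


def _run(cmd):
    """Run a command and return its stdout split into names."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout.split()


def find_container(container_name, config_id, run=_run):
    """Look up a container with config_id first, then without it.

    Returns the first matching name, or None if both lookups are empty.
    'run' is injectable so the logic can be exercised without podman.
    """
    for cid in (config_id, None):
        names = run(podman_ps_cmd(container_name, cid))
        if names:
            return names[0]
    return None
```

On the undercloud, find_container('glance_init_logs', 'tripleo_step2') returning None would correspond to the two "Did not find container" messages in the log.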

Several container logs show errors such as:
+ podman top rabbitmq_init_logs user pid ppid pcpu vsz tty state time etime args
Error: top can only be used on running containers
+ podman exec --user root rabbitmq_init_logs top -bwn1
Error: cannot exec into container that is not running: container state improper

OR

Error: Evaluation Error: Error while evaluating a Function Call, Could not find class ::tripleo::profile::base::neutron::l3_agent_wrappers for undercloud-0.redhat.local (line: 1, column: 27) on node undercloud-0.redhat.local (e.g. /var/log/extra/containers/containers/create_keepalived_wrapper/stdout.log)

Error: Evaluation Error: Error while evaluating a Function Call, Could not find class ::tripleo::profile::base::neutron::dhcp_agent_wrappers for undercloud-0.redhat.local (line: 1, column: 27) on node undercloud-0.redhat.local (containers/stdouts/create_dnsmasq_wrapper.log)

puppet-user: Error: Facter: error while resolving custom fact "rabbitmq_nodename": undefined method `[]' for nil:NilClass (containers/stdouts/container-puppet-rabbitmq.log)

VRRP_Script(haproxy) failed (exited with status 1) (containers/keepalived/keepalived.log)

The memcached healthcheck service also failed:
podman[37582]: 2020-03-04 06:46:08.093091786 +0000 UTC m=+0.278868341 container exec 90fb440d08112b11add2d117c1f2ba8dad0b0d2be2ed9f6c0a0aa4f9cee59ab1 (image=undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-memcached:20200303.1, name=memcached)
healthcheck_memcached[37582]: /openstack/healthcheck: line 3: wrap_ipv6: command not found
healthcheck_memcached[37582]: 2020/03/04 06:46:08 socat[32] E getaddrinfo("", "NULL", {1,0,1,6}, {}): Name or service not known
healthcheck_memcached[37582]: Error: non zero exit code: 1: OCI runtime error
systemd[1]: tripleo_memcached_healthcheck.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: tripleo_memcached_healthcheck.service: Failed with result 'exit-code'.
systemd[1]: Failed to start memcached healthcheck
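
One possible reading of the output above: because wrap_ipv6 is not defined, the host value that the healthcheck passes on ends up empty, which would explain the empty first argument in socat's getaddrinfo("") error. The Python sketch below illustrates that chain. It is an assumption based only on the helper's name (bracketing IPv6 literals so that host:port stays unambiguous), not a copy of the healthcheck script; endpoint is a hypothetical helper added for illustration.

```python
import ipaddress


def wrap_ipv6(host):
    """Return IPv6 literals in brackets; leave anything else unchanged.

    Assumed behaviour, inferred from the helper's name in the log.
    """
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal (for example a hostname or an empty string).
        is_v6 = False
    return '[' + host + ']' if is_v6 else host


def endpoint(host, port):
    """Build a host:port string, refusing an empty host early.

    An empty host is what the socat error above appears to have received.
    """
    if not host:
        raise ValueError('empty host: address helper produced no output')
    return '{}:{}'.format(wrap_ipv6(host), port)
```

Failing early on an empty host, as endpoint does here, would surface the missing helper directly instead of as a name-resolution error.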

Version-Release number of selected component (if applicable):
OSP16, RHOS_TRUNK-16.0-RHEL-8-20200304.n.0

Logs attached.

Comment 1 Filip Hubík 2020-03-04 11:22:46 UTC
Created attachment 1667459
paunch.log

Comment 3 Michele Baldessari 2020-03-04 11:33:12 UTC

*** This bug has been marked as a duplicate of bug 1809939 ***