Bug 1742169
| Summary: | Scale up fails on all controller nodes with error "Job for tripleo_memcached-dmtfeqpf.service failed because the service did not take the steps required by its unit configuration" | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Eliad Cohen <elicohen> | ||||||||
| Component: | python-paunch | Assignee: | Steve Baker <sbaker> | ||||||||
| Status: | CLOSED ERRATA | QA Contact: | nlevinki <nlevinki> | ||||||||
| Severity: | high | Docs Contact: | |||||||||
| Priority: | high | ||||||||||
| Version: | 15.0 (Stein) | CC: | amodi, atonner, elicohen, emacchi, lmiccini, lyarwood, mburns, mcornea, michele, mschuppe, pkomarov, sbaker, sclewis | ||||||||
| Target Milestone: | rc | Keywords: | Regression, Triaged | ||||||||
| Target Release: | 15.0 (Stein) | ||||||||||
| Hardware: | All | ||||||||||
| OS: | Linux | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | python-paunch-4.5.1-0.20190829080435.f9349e0.el8ost | Doc Type: | No Doc Update | ||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | Environment: | ||||||||||
| Last Closed: | 2019-09-21 11:24:25 UTC | Type: | Bug | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
| Embargoed: | |||||||||||
| Bug Depends On: | |||||||||||
| Bug Blocks: | 1690784, 1737456 | ||||||||||
| Attachments: |
Description
Eliad Cohen
2019-08-16 15:08:46 UTC
Created attachment 1604649 [details]
undercloud var folder
Created attachment 1604653 [details]
controller files
Tested using builds 17 and 18 in:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/ceph/view/rhos/job/DFG-ceph-rhos-15_director-rhel-virthost-3cont_1_to_2comp_1_to_3ceph-ipv4-geneve-scale-up/

It seems like there is an issue with the systemd unit file being generated (?). If I look at our lab, we have "tripleo_memcached.service":

    [root@controller-0 ~]# cat /etc/systemd/system/tripleo_memcached.service
    [Unit]
    Description=memcached container
    After=paunch-container-shutdown.service
    Wants=
    [Service]
    Restart=always
    ExecStart=/usr/bin/podman start memcached
    ExecStop=/usr/bin/podman stop -t 10 memcached
    KillMode=none
    Type=forking
    PIDFile=/var/run/memcached.pid
    [Install]
    WantedBy=multi-user.target

while here:

    $ cat tripleo_memcached-a9pap7zv.service
    [Unit]
    Description=memcached-a9pap7zv container
    After=paunch-container-shutdown.service
    Wants=
    [Service]
    Restart=always
    ExecStart=/usr/bin/podman start memcached-a9pap7zv
    ExecStop=/usr/bin/podman stop -t 10 memcached-a9pap7zv
    KillMode=none
    Type=forking
    PIDFile=/var/run/memcached-a9pap7zv.pid
    [Install]
    WantedBy=multi-user.target

The issue is that paunch/podman are creating a container with a bogus name:
"Start container memcached.",
"$ podman create --name memcached-a9pap7zv --label config_id=tripleo_step1 --label container_name=memcached --label ...
will try to reproduce.
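The difference between the two unit files quoted in this comment comes down to which name is substituted into the unit. The following is a minimal sketch, not paunch's actual code: `UNIT_TEMPLATE`, `render_unit`, and `has_random_suffix` are hypothetical names introduced for illustration, and the eight-character suffix pattern is an assumption based only on the names seen in this report. It shows how a suffixed container name propagates into the unit file name, `ExecStart`, `ExecStop`, and `PIDFile`, and how comparing against the `container_name` label could flag the mismatch.

```python
import re

# Hypothetical template mirroring the unit files quoted in this report;
# this is not paunch's real template, only an illustration.
UNIT_TEMPLATE = """[Unit]
Description={name} container
After=paunch-container-shutdown.service
Wants=
[Service]
Restart=always
ExecStart=/usr/bin/podman start {name}
ExecStop=/usr/bin/podman stop -t 10 {name}
KillMode=none
Type=forking
PIDFile=/var/run/{name}.pid
[Install]
WantedBy=multi-user.target
"""


def render_unit(name):
    """Return (unit file name, unit file body) for a container name."""
    return "tripleo_{}.service".format(name), UNIT_TEMPLATE.format(name=name)


def has_random_suffix(actual_name, label_name):
    """True if actual_name is label_name followed by a '-xxxxxxxx' suffix.

    Assumption: the suffix is eight lowercase letters or digits, as in the
    names seen in this report (a9pap7zv, dmtfeqpf).
    """
    pattern = re.escape(label_name) + r"-[a-z0-9]{8}"
    return re.fullmatch(pattern, actual_name) is not None


# Expected name (from the container_name label) versus the name podman was given.
good_file, good_body = render_unit("memcached")
bad_file, bad_body = render_unit("memcached-a9pap7zv")
```

With the suffixed name, every field that embeds the container name changes, including `PIDFile`, so a unit with `Type=forking` ends up pointing at a PID file path derived from the bogus name.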
Reproduced and spent some time with Michele trying to figure it out. This looks like the same issue as in:
https://bugs.launchpad.net/tripleo/+bug/1839929
fixed by (stein):
https://review.opendev.org/#/c/676984/
The latest puddle's paunch version does not include the patch above.

*** Bug 1743402 has been marked as a duplicate of this bug. ***

*** Bug 1744675 has been marked as a duplicate of this bug. ***

    [root@controller-0 heat-admin]# rpm -q python3-paunch
    python3-paunch-4.5.1-0.20190829080435.f9349e0.el8ost.noarch

Scale out completed successfully.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2019:2811
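The verification above relies on matching the `rpm -q` output against the "Fixed In Version" field. As a rough sketch of that check, the hypothetical `split_nvra` helper below splits a name-version-release.arch string and compares version and release for equality only. It assumes the input always ends in an architecture suffix, and it does not implement RPM's real version ordering, so it cannot tell whether a different build is newer or older.

```python
def split_nvra(nvra):
    """Split 'name-version-release.arch' into its four parts.

    Assumption: the string always ends with '.arch', as rpm -q prints by
    default; a string without an arch suffix would be split incorrectly.
    """
    rest, arch = nvra.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


# Values copied from this report.
installed = "python3-paunch-4.5.1-0.20190829080435.f9349e0.el8ost.noarch"
fixed_version = "4.5.1"
fixed_release = "0.20190829080435.f9349e0.el8ost"

name, version, release, arch = split_nvra(installed)
# Equality check only; ordering comparisons need rpm's own comparison logic.
has_exact_fixed_build = (version, release) == (fixed_version, fixed_release)
```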