Bug 1742169

Summary: Scale up fails on all controller nodes with error "Job for tripleo_memcached-dmtfeqpf.service failed because the service did not take the steps required by its unit configuration"
Product: Red Hat OpenStack
Reporter: Eliad Cohen <elicohen>
Component: python-paunch
Assignee: Steve Baker <sbaker>
Status: CLOSED ERRATA
QA Contact: nlevinki <nlevinki>
Severity: high
Priority: high
Version: 15.0 (Stein)
CC: amodi, atonner, elicohen, emacchi, lmiccini, lyarwood, mburns, mcornea, michele, mschuppe, pkomarov, sbaker, sclewis
Target Milestone: rc
Keywords: Regression, Triaged
Target Release: 15.0 (Stein)
Hardware: All
OS: Linux
Fixed In Version: python-paunch-4.5.1-0.20190829080435.f9349e0.el8ost
Doc Type: No Doc Update
Last Closed: 2019-09-21 11:24:25 UTC
Type: Bug
Bug Blocks: 1690784, 1737456
Attachments:
Description                          Flags
Undercloud files minus var folder    none
undercloud var folder                none
controller files                     none

Description Eliad Cohen 2019-08-16 15:08:46 UTC
Created attachment 1604646
Undercloud files minus var folder

Description of problem:
When performing a scale up, the process terminates with error [1] on all controller nodes.

Version-Release number of selected component (if applicable):
OSP15 core_puddle: RHOS_TRUNK-15.0-RHEL-8-20190813.n.0
CEPH compose: ceph-4.0-rhel-8-containers-candidate-64389-20190813102853

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP with Ceph: 3 controller, 1 compute, and 1 Ceph node
2. Scale up to 3 controller, 2 compute, and 3 Ceph nodes respectively
3. The scale up script fails with the error

Actual results:
Scale up fails with error [1]

Expected results:
Scale up should succeed

Additional info:
[1] http://pastebin.test.redhat.com/789335

Comment 1 Eliad Cohen 2019-08-16 15:09:58 UTC
Created attachment 1604649
undercloud var folder

Comment 2 Eliad Cohen 2019-08-16 15:10:40 UTC
Created attachment 1604653
controller files

Comment 4 Luca Miccini 2019-08-19 08:15:30 UTC
There seems to be an issue with the systemd unit file being generated (?). In our lab we have "tripleo_memcached.service":

[root@controller-0 ~]# cat /etc/systemd/system/tripleo_memcached.service
[Unit]
Description=memcached container
After=paunch-container-shutdown.service
Wants=
[Service]
Restart=always
ExecStart=/usr/bin/podman start memcached
ExecStop=/usr/bin/podman stop -t 10 memcached
KillMode=none
Type=forking
PIDFile=/var/run/memcached.pid

[Install]
WantedBy=multi-user.target


while here:

$ cat tripleo_memcached-a9pap7zv.service
[Unit]
Description=memcached-a9pap7zv container
After=paunch-container-shutdown.service
Wants=
[Service]
Restart=always
ExecStart=/usr/bin/podman start memcached-a9pap7zv
ExecStop=/usr/bin/podman stop -t 10 memcached-a9pap7zv
KillMode=none
Type=forking
PIDFile=/var/run/memcached-a9pap7zv.pid

[Install]
WantedBy=multi-user.target
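
The two unit files differ only in the random-looking suffix appended to the container name. As a rough sketch (not part of paunch), a check like the following could flag unit file names that carry such a suffix. The pattern (a hyphen followed by eight lowercase alphanumeric characters, matching the two names seen in this bug) and the helper name `has_random_suffix` are assumptions made for illustration, and the check is only a heuristic.

```python
import re

# Assumption: the unexpected suffix is a hyphen plus eight lowercase
# alphanumeric characters, as in "a9pap7zv" and "dmtfeqpf" from this bug.
SUFFIX_RE = re.compile(
    r"^tripleo_(?P<name>.+)-(?P<suffix>[a-z0-9]{8})\.service$"
)


def has_random_suffix(unit_name):
    """Return the suffix if the unit name looks suffixed, else None."""
    match = SUFFIX_RE.match(unit_name)
    return match.group("suffix") if match else None


# Compare the unit name from the lab with the one from this environment.
for unit in ("tripleo_memcached.service",
             "tripleo_memcached-a9pap7zv.service"):
    print(unit, "->", has_random_suffix(unit))
```

A legitimate service whose name happens to end in a hyphen and eight such characters would also match, so any hit should be confirmed against the expected container name.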

Comment 5 Luca Miccini 2019-08-19 09:48:04 UTC
The issue is that paunch/podman is creating a container with a bogus name:

        "Start container memcached.",
        "$ podman create --name memcached-a9pap7zv --label config_id=tripleo_step1 --label container_name=memcached --label ...


will try to reproduce.
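
The mismatch in the log line above is between the value passed to `--name` and the `container_name` label. A minimal sketch of how that comparison could be automated when scanning deployment logs is shown here. The helper name `name_matches_label` is hypothetical, and the sample command contains only the arguments visible in the log excerpt; the elided remainder is left out.

```python
import shlex


def name_matches_label(command):
    """Compare --name with the container_name label of a podman create line.

    Returns (name, label_name, matches).
    """
    args = shlex.split(command)
    name = None
    label_name = None
    # Walk each flag together with the argument that follows it.
    for i, arg in enumerate(args[:-1]):
        value = args[i + 1]
        if arg == "--name":
            name = value
        elif arg == "--label" and value.startswith("container_name="):
            label_name = value.split("=", 1)[1]
    return name, label_name, name is not None and name == label_name


# Only the visible part of the logged command; the rest is omitted.
logged = ("podman create --name memcached-a9pap7zv "
          "--label config_id=tripleo_step1 "
          "--label container_name=memcached")
print(name_matches_label(logged))
```

For the command in this bug the helper reports a mismatch, since the container is named `memcached-a9pap7zv` while the label says `memcached`.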

Comment 6 Luca Miccini 2019-08-19 12:42:35 UTC
Reproduced and spent some time with Michele trying to figure it out.

This looks like the same issue as:

https://bugs.launchpad.net/tripleo/+bug/1839929

fixed by (stein):

https://review.opendev.org/#/c/676984/

The latest puddle's paunch version does not include the patch above.

Comment 7 Michele Baldessari 2019-08-20 05:40:55 UTC
*** Bug 1743402 has been marked as a duplicate of this bug. ***

Comment 9 Michele Baldessari 2019-08-22 17:02:33 UTC
*** Bug 1744675 has been marked as a duplicate of this bug. ***

Comment 15 Marius Cornea 2019-08-30 23:17:30 UTC
[root@controller-0 heat-admin]# rpm -q python3-paunch
python3-paunch-4.5.1-0.20190829080435.f9349e0.el8ost.noarch

Scale out completed successfully.
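
The check above relies on the installed build being at least the Fixed In Version. As a hedged sketch, the comparison could be scripted as follows. It assumes snapshot builds keep the `<version>-0.<timestamp>.<commit>.el8ost` layout seen in this bug and compares only the upstream version and the timestamp; it is not a general RPM version comparison, and the helper names are hypothetical.

```python
import re

# Fixed In Version as recorded in this bug.
FIXED = "python-paunch-4.5.1-0.20190829080435.f9349e0.el8ost"

# Assumption: snapshot builds are named <version>-0.<timestamp>.<commit>.el8ost
BUILD_RE = re.compile(
    r"paunch-(?P<version>[\d.]+)-0\.(?P<stamp>\d{14})\.[0-9a-f]+\.el8ost"
)


def parse_build(package):
    """Return ((major, minor, ...), timestamp) for a paunch package string."""
    match = BUILD_RE.search(package)
    if match is None:
        raise ValueError("unrecognized package string: %s" % package)
    version = tuple(int(part) for part in match.group("version").split("."))
    return version, int(match.group("stamp"))


def includes_fix(package, fixed=FIXED):
    """True if the build is the fixed build or a newer snapshot."""
    return parse_build(package) >= parse_build(fixed)


installed = "python3-paunch-4.5.1-0.20190829080435.f9349e0.el8ost.noarch"
print(includes_fix(installed))
```

The regular expression matches both the `python-paunch` source package name and the `python3-paunch` binary package name reported by `rpm -q`.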

Comment 19 errata-xmlrpc 2019-09-21 11:24:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811