
Bug 1634810

Summary:	[OSP14] Rebooting a clustered control node without previously stopping pacemaker takes more than 15 minutes
Product:	Red Hat OpenStack	Reporter:	Michele Baldessari <michele>
Component:	puppet-tripleo	Assignee:	Emilien Macchi <emacchi>
Status:	CLOSED ERRATA	QA Contact:	pkomarov
Severity:	urgent	Docs Contact:
Priority:	urgent
Version:	14.0 (Rocky)	CC:	agurenko, chjones, dvd, emacchi, jjoyce, jschluet, mburns, michele, nwahl, pkomarov, sbradley, slinaber, tvignaud
Target Milestone:	beta	Keywords:	Triaged, ZStream
Target Release:	14.0 (Rocky)
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	puppet-tripleo-9.3.1-0.20181001112251.a6eaab1.el7ost python-paunch-3.2.0-0.20180921003258.6d2ec11.el7ost	Doc Type:	No Doc Update
Doc Text:	Cause: A faulty interaction between rhel-push-plugin.service and the docker service during system shutdown. Consequence: Rebooting a controller takes a long time. Fix: Correct shutdown ordering is enforced for these two services. Result: Rebooting a controller takes a more reasonable amount of time (a couple of minutes).	Story Points:	---
Clone Of:	1628705	Environment:
Last Closed:	2019-01-11 11:53:30 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1628705
Bug Blocks:

Comment 4 pkomarov 2018-10-18 07:34:39 UTC

Verified.

[stack@undercloud-0 ~]$ cat core_puddle_version 
2018-10-10.3

[stack@undercloud-0 ~]$ ansible controller -mshell -b -a'ls -l /etc/systemd/system/resource-agents-deps.target.wants'
 [WARNING]: Found both group and host with same name: undercloud

#check fix: 
controller-1 | SUCCESS | rc=0 >>
total 0
lrwxrwxrwx. 1 root root 38 Oct 17 13:02 docker.service -> /usr/lib/systemd/system/docker.service
lrwxrwxrwx. 1 root root 48 Oct 17 13:02 rhel-push-plugin.service -> /usr/lib/systemd/system/rhel-push-plugin.service

controller-0 | SUCCESS | rc=0 >>
total 0
lrwxrwxrwx. 1 root root 38 Oct 17 13:02 docker.service -> /usr/lib/systemd/system/docker.service
lrwxrwxrwx. 1 root root 48 Oct 17 13:02 rhel-push-plugin.service -> /usr/lib/systemd/system/rhel-push-plugin.service

controller-2 | SUCCESS | rc=0 >>
total 0
lrwxrwxrwx. 1 root root 38 Oct 17 13:02 docker.service -> /usr/lib/systemd/system/docker.service
lrwxrwxrwx. 1 root root 48 Oct 17 13:02 rhel-push-plugin.service -> /usr/lib/systemd/system/rhel-push-plugin.service
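
The check above can also be scripted so that it fails loudly when a link is missing. The following is only a sketch: check_wants_dir is a hypothetical helper name (not part of puppet-tripleo or paunch), and it assumes the fix is visible as symlinks in the given directory, as in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that each named unit is present as a symlink
# in a systemd ".wants" directory.
# Usage: check_wants_dir DIR UNIT [UNIT...]
check_wants_dir() {
    local dir="$1"
    shift
    local unit rc=0
    for unit in "$@"; do
        if [ -L "$dir/$unit" ]; then
            echo "OK $unit"
        else
            echo "MISSING $unit"
            rc=1        # remember the failure but keep checking the rest
        fi
    done
    return "$rc"
}
```

On a controller this could be called as check_wants_dir /etc/systemd/system/resource-agents-deps.target.wants docker.service rhel-push-plugin.service; a non-zero exit status means at least one link is absent.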


#disable the fencing as in the reproducer: 

[root@controller-0 ~]# pcs property set stonith-enabled=false
[root@controller-0 ~]# pcs config|grep stonith-enabled
 stonith-enabled: false



[root@controller-0 ~]# date
Thu Oct 18 07:19:49 UTC 2018

[root@controller-0 ~]# reboot

#A simple ssh port test shows that the controller is back online after no more than 3 minutes:

(undercloud) [stack@undercloud-0 ~]$ while true ; do nc -zv 192.168.24.15 22;sleep 10s;done
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.
...
Ncat: Connected to 192.168.24.15:22.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.24.15:22.
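
To get an actual duration instead of reading it off the loop output, the same port probe can be wrapped in a timed wait. This is a sketch under assumptions: probe_port and wait_for_port_state are hypothetical helper names, the probe uses bash's /dev/tcp redirection rather than nc, and the 900-second default limit simply mirrors the 15 minutes from the bug summary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a TCP connection to HOST PORT succeeds
# within 2 seconds (uses bash's built-in /dev/tcp redirection).
probe_port() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Hypothetical helper: poll until the port reaches the wanted state
# ("up" or "down"). Prints the elapsed seconds on success; returns 1 if
# MAX_WAIT seconds pass first.
# Usage: wait_for_port_state HOST PORT up|down [MAX_WAIT] [INTERVAL]
wait_for_port_state() {
    local host="$1" port="$2" want="$3" max_wait="${4:-900}" interval="${5:-10}"
    local start=$SECONDS state
    while true; do
        if probe_port "$host" "$port"; then state=up; else state=down; fi
        if [ "$state" = "$want" ]; then
            echo $((SECONDS - start))
            return 0
        fi
        if [ $((SECONDS - start)) -ge "$max_wait" ]; then
            return 1
        fi
        sleep "$interval"
    done
}
```

Started right after issuing the reboot, wait_for_port_state 192.168.24.15 22 down followed by wait_for_port_state 192.168.24.15 22 up would report how long sshd took to disappear and then to come back. Waiting for "down" first avoids counting the moments before the node actually stops answering.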

Comment 8 errata-xmlrpc 2019-01-11 11:53:30 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045