Description of problem:
During a minor update of OSP 14 on RHEL 7.6, keepalived goes down during the pre-upgrade tasks and does not restart. Docker hangs. The upgrade stalls and times out.

Version-Release number of selected component (if applicable):
Installed repositories (rhel-7.6): 14 ceph-3 ceph-osd-3 rhel-7.6
docker-1.13.1-91.git07f3374.el7.x86_64
containerd version: (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: 290a33602b16ff2d1cc5339bc0297f0e094462ce (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: fec3683b971d9c3ef73f284f176672c44b448662 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
192.168.24.1:8787/rhosp14/openstack-keepalived 2019-03-06.1 d2f2329685cf 8 days ago 535 MB

How reproducible:
Build RHOS14 undercloud:1,controller:3,compute:1,ceph:3,ironic:2 and run a minor update against the stack.

Steps to Reproduce:

Actual results:
Mar 12 10:58:44 undercloud-0.redhat.local dockerd-current[516710]: Unexpected argument(s): | tee /var/log/keepalived.log
Mar 12 10:58:44 undercloud-0.redhat.local dockerd-current[516710]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Mar 12 10:58:44 undercloud-0.redhat.local dockerd-current[516710]: Stopped
Mar 12 10:58:44 undercloud-0.redhat.local Keepalived[38986]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Mar 12 10:58:44 undercloud-0.redhat.local dockerd-current[516710]: Unexpected argument(s): | tee /var/log/keepalived.log
Mar 12 10:58:44 undercloud-0.redhat.local Keepalived_vrrp[39033]: Stopped
Mar 12 10:58:44 undercloud-0.redhat.local dockerd-current[516710]: time="2019-03-12T10:58:44.164609694-04:00" level=debug msg="attach: stderr: begin"

Expected results:
Update completes normally

Additional info:
Judging by the docker version, this looks very similar to the issue described in https://bugzilla.redhat.com/show_bug.cgi?id=1671861 . Could we have more logs, please? Was the docker service restarted? If that can be confirmed, we can be sure this is a duplicate of 1671861.
A good way to verify whether we're facing the same issue is to log in to one of the overcloud controller nodes and run "docker ps". If the command hangs, it's exactly the same issue described in 1671861.
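
The "docker ps" check above can also be run with a time limit, so that the check itself does not block on an affected node. This is only a rough sketch, assuming Python 3.5+ is available on the controller; the helper name command_hangs and the 30-second default are illustrative choices, not part of any OSP or docker tooling.

```python
import subprocess
import sys


def command_hangs(cmd, timeout_s=30.0):
    """Return True if cmd does not finish within timeout_s seconds.

    Hypothetical helper for illustration only. The exit status of cmd is
    ignored; only whether it completes in time is reported.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before re-raising.
        return True
    return False
```

On a controller node, command_hangs(["docker", "ps"]) returning True would suggest the hang described in 1671861. A False result only means the command returned within the limit, and a FileNotFoundError means the docker binary was not found on the PATH.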
*** This bug has been marked as a duplicate of bug 1671861 ***