Update OSP-D from 7.0 to 7.1 failed: systemd stops functioning on the controller node (Failed to get D-Bus connection)

Environment :
--------------
Controller:
-------------
dbus-1.6.12-11.el7.x86_64
dbus-glib-0.100-7.el7.x86_64
dbus-python-1.1.1-9.el7.x86_64
dbus-libs-1.6.12-11.el7.x86_64
python-slip-dbus-0.4.0-2.el7.noarch

Undercloud:
------------
instack-undercloud-2.1.2-29.el7ost.noarch
instack-0.0.7-1.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-45.el7ost.noarch
openstack-heat-api-2015.1.0-4.el7ost.noarch
openstack-heat-api-cfn-2015.1.1-6.el7ost.noarch
heat-cfntools-1.2.8-2.el7.noarch
openstack-heat-common-2015.1.0-4.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.1-6.el7ost.noarch
openstack-heat-api-cfn-2015.1.0-4.el7ost.noarch
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.0-4.el7ost.noarch
openstack-heat-common-2015.1.1-6.el7ost.noarch
openstack-heat-api-2015.1.1-6.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-71.el7ost.noarch
openstack-heat-engine-2015.1.1-6.el7ost.noarch
openstack-heat-engine-2015.1.0-4.el7ost.noarch

Description :
-------------
The failure happened after applying this patch: https://review.openstack.org/#/c/239368/ to work around https://bugzilla.redhat.com/show_bug.cgi?id=1274859, and then attempting to update OSP-D (undercloud and overcloud) from 7.0 to 7.1.

Steps:
-------
(1) Install Undercloud and Overcloud 7.0 (with 7.0 images)
(2) Update the undercloud to 7.1 (using rhos-release)
(3) Make sure you have 7.1 repos on the overcloud nodes
(4) Attempt to run the overcloud update command (more details: http://etherpad.corp.redhat.com/update-ospd-7-0-to-7-1):

    openstack overcloud update stack overcloud -i --templates -e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml -e /home/stack/update.yaml

Results:
---------
(1) It looks like one package failed to update during the yum update that was running on the controller:

59/363
\nFailed to get D-Bus connect /run/systemd/private: No such file or directory\nwarning: %post(glusterfs-3.7.1-16.el7.x86_64) scriptlet failed, exit status 1\n

(2) Then, during the update, systemctl stopped functioning on the controller machine:

[root@overcloud-controller-0 ~]# systemctl
Failed to get D-Bus connection: Failed to connect to socket /run/systemd/private: No such file or directory

(3) Overcloud 'Update failed'

----------------------------------------------------------
[root@overcloud-controller-0 ~]# ps auxf|grep systemd
root 1 0.3 0.0 51260 2340 ? Ss Oct22 28:11 /usr/lib/systemd/systemd --system --deserialize 27
root 346 0.2 0.5 80496 20016 ? Ss Oct22 20:04 /usr/lib/systemd/systemd-journald
root 437 0.0 0.0 0 0 ? Zs Oct22 3:17 [systemd-logind] <defunct>
dbus 438 0.1 0.0 100492 2024 ? Ssl Oct22 7:06 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root 3444 0.0 0.0 112640 928 pts/0 S+ 15:43 0:00 \_ grep --color=auto systemd

/var/log/messages (from controller)
------------------------------------
\n tzdata.noarch 0:2015g-1.el7 \n util-linux.x86_64 0:2.23.2-22.el7_1.1 \n\nComplete!\nyum return code: 0\nStarting cluster node\nStarting Cluster...\nRedirecting to /bin/systemctl start corosync.service\nFailed to get D-Bu to socket /run/systemd/private: No such file or directory\n\nERROR overcloud-controller-0 failed to join cluster in 360 seconds\n", "deploy_stderr": "Non-fata m package glusterfs-3.7.1-16.el7.x86_64\nNon-fatal POSTUN scriptlet failure in rpm package glusterfs-3.6.0.29-2.el7.x86_64\nError: unable to start corosync\nE unning on this node\nError: cluster is not currently running on this node\nError: cluster is not currently running on this node\nError: cluster is not current cluster is not currently running on this node\nError: cluster is not currently running on this node\nError: cluster is not currently running on this node\nErr ning on this node\nError: cluster is not currently running on this node\nError:
cluster is not currently running on this node\nError: cluster is not currently uster is not currently running on this node\nError: cluster is not currently running on this node\nError: cluster is not currently running on this node\nError ng on this node\nError: cluster is not currently running on this node\nError: cluster is not currently running on this node\nError: cluster is not currently r ter is not currently running on this node\nError: cluster is not currently running on this node\nError: cluster is not currently running on this node\nError: on this no
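
When triaging output like the excerpts above, it can help to pull the failed RPM scriptlets out of the captured deployment text. The following is only a rough sketch, not part of the reporter's tooling: the function name `find_scriptlet_failures` is hypothetical, and the two patterns are assumed from the message formats quoted in this report ("warning: %post(...) scriptlet failed, exit status N" and "Non-fatal POSTUN scriptlet failure in rpm package ..."). Truncated lines, such as the cut-off "Non-fata" entry, will not match.

```python
import re

# Patterns assumed from the messages quoted in this report.
WARNING_RE = re.compile(
    r"warning: %(\w+)\(([^)]+)\) scriptlet failed, exit status (\d+)"
)
NONFATAL_RE = re.compile(
    r"Non-fatal (\w+) scriptlet failure in rpm package (\S+)"
)


def find_scriptlet_failures(log_text):
    """Return (phase, package, exit_status) tuples found in log_text.

    exit_status is None for the "Non-fatal" form, which does not
    report one. Hypothetical helper for illustration only.
    """
    # The captured output uses literal "\n" sequences (JSON-escaped),
    # so turn them into real newlines before matching.
    text = log_text.replace("\\n", "\n")
    failures = []
    for phase, package, status in WARNING_RE.findall(text):
        failures.append((phase.lower(), package, int(status)))
    for phase, package in NONFATAL_RE.findall(text):
        failures.append((phase.lower(), package, None))
    return failures


if __name__ == "__main__":
    sample = (
        r"warning: %post(glusterfs-3.7.1-16.el7.x86_64) "
        r"scriptlet failed, exit status 1\n"
    )
    for failure in find_scriptlet_failures(sample):
        print(failure)
```

This only lists which scriptlets failed; it says nothing about why the D-Bus connection was unavailable while they ran.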
I think systemd got into a broken state on the controller node:

[root@overcloud-controller-0 ~]# systemctl
Failed to get D-Bus connection: Failed to connect to socket /run/systemd/private: No such file or directory

Because systemctl doesn't work, no services can be started either.
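
A quick way to confirm this symptom on a node is to check whether the private socket named in the error still exists. The snippet below is a minimal sketch under stated assumptions: the helper names `systemd_private_socket_present` and `is_dbus_connection_failure` are hypothetical, and the only facts taken from this report are the socket path `/run/systemd/private` and the error text printed by systemctl.

```python
import os
import stat

# Path taken from the systemctl error message in this report.
SYSTEMD_PRIVATE_SOCKET = "/run/systemd/private"


def systemd_private_socket_present(path=SYSTEMD_PRIVATE_SOCKET):
    """Return True if `path` exists and is a UNIX socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Covers "No such file or directory" and permission errors.
        return False
    return stat.S_ISSOCK(mode)


def is_dbus_connection_failure(systemctl_output):
    """Return True if the output contains the error seen in this report."""
    return "Failed to get D-Bus connection" in systemctl_output


if __name__ == "__main__":
    if systemd_private_socket_present():
        print("systemd private socket is present")
    else:
        print("systemd private socket is missing:", SYSTEMD_PRIVATE_SOCKET)
```

The check only reports whether the socket file is there; it does not explain why it disappeared or whether PID 1 is otherwise healthy.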
Created attachment 1087011: messages

Adding the messages file from the controller.
I also saw some cluster-related errors during an update attempt: https://bugzilla.redhat.com/show_bug.cgi?id=1278004

The puppet reapply is happening due to: https://bugzilla.redhat.com/show_bug.cgi?id=1278181

It is still unclear, though, why reapplying puppet causes these errors.
Update from 7.0 to 7.2 is working. Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2015:2651