Description of problem:

Notice: Finished catalog run in 89.68 seconds
+ rc=6
+ set -e
+ echo 'puppet apply exited with exit code 6'
puppet apply exited with exit code 6
+ '[' 6 '!=' 2 -a 6 '!=' 0 ']'
+ exit 6
[2016-04-10 04:43:38,601] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 6]

[2016-04-10 04:43:38,602] (os-refresh-config) [ERROR] Aborting...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 815, in install
    _run_orc(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 699, in _run_orc
    _run_live_command(args, instack_env, 'os-refresh-config')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 370, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: os-refresh-config failed. See log for details.
Command 'instack-install-undercloud' returned non-zero exit status 1

Version-Release number of selected component (if applicable):
openstack-heat-engine-5.0.1-5.el7ost.noarch
openstack-nova-scheduler-12.0.2-5.el7ost.noarch
openstack-ironic-api-4.2.2-4.el7ost.noarch
openstack-selinux-0.6.58-1.el7ost.noarch
openstack-swift-container-2.5.0-2.el7ost.noarch
python-django-openstack-auth-2.0.1-1.2.el7ost.noarch
openstack-neutron-common-7.0.1-15.el7ost.noarch
openstack-dashboard-theme-8.0.1-2.el7ost.noarch
openstack-tempest-liberty-20160317.1.el7ost.noarch
openstack-nova-novncproxy-12.0.2-5.el7ost.noarch
openstack-ceilometer-collector-5.0.2-2.el7ost.noarch
openstack-tuskar-ui-extras-0.0.4-2.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
openstack-swift-plugin-swift3-1.9-1.el7ost.noarch
openstack-ceilometer-common-5.0.2-2.el7ost.noarch
openstack-tripleo-common-0.3.1-1.el7ost.noarch
openstack-heat-api-5.0.1-5.el7ost.noarch
openstack-nova-cert-12.0.2-5.el7ost.noarch
openstack-nova-api-12.0.2-5.el7ost.noarch
openstack-neutron-openvswitch-7.0.1-15.el7ost.noarch
openstack-glance-11.0.1-4.el7ost.noarch
openstack-keystone-8.0.1-1.el7ost.noarch
openstack-swift-proxy-2.5.0-2.el7ost.noarch
openstack-swift-object-2.5.0-2.el7ost.noarch
openstack-tuskar-0.4.18-5.el7ost.noarch
openstack-tripleo-image-elements-0.9.9-1.el7ost.noarch
openstack-swift-2.5.0-2.el7ost.noarch
openstack-ironic-common-4.2.2-4.el7ost.noarch
openstack-nova-common-12.0.2-5.el7ost.noarch
openstack-heat-common-5.0.1-5.el7ost.noarch
openstack-heat-api-cfn-5.0.1-5.el7ost.noarch
openstack-nova-compute-12.0.2-5.el7ost.noarch
openstack-nova-conductor-12.0.2-5.el7ost.noarch
openstack-neutron-7.0.1-15.el7ost.noarch
openstack-ceilometer-central-5.0.2-2.el7ost.noarch
openstack-ceilometer-alarm-5.0.2-2.el7ost.noarch
openstack-swift-account-2.5.0-2.el7ost.noarch
openstack-tripleo-0.0.7-1.el7ost.noarch
openstack-dashboard-8.0.1-2.el7ost.noarch
openstack-neutron-ml2-7.0.1-15.el7ost.noarch
openstack-ceilometer-api-5.0.2-2.el7ost.noarch
openstack-heat-templates-0-0.8.20150605git.el7ost.noarch
openstack-tuskar-ui-0.4.0-5.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-7.el7ost.noarch
openstack-ceilometer-polling-5.0.2-2.el7ost.noarch
python-openstackclient-1.7.2-1.el7ost.noarch
openstack-heat-api-cloudwatch-5.0.1-5.el7ost.noarch
openstack-nova-console-12.0.2-5.el7ost.noarch
openstack-ironic-conductor-4.2.2-4.el7ost.noarch
openstack-ironic-inspector-2.2.5-2.el7ost.noarch
openstack-puppet-modules-7.0.17-1.el7ost.noarch
redhat-access-plugin-openstack-7.0.0-0.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-7.el7ost.noarch
openstack-ceilometer-notification-5.0.2-2.el7ost.noarch
instack-0.0.8-2.el7ost.noarch
instack-undercloud-2.2.7-4.el7ost.noarch

How reproducible: always

Steps to Reproduce:
1. deploy 7.3
2. switch to rhos-release 8 puddle or poodle
3. yum -y update
4. openstack undercloud upgrade

Actual results: see the error above

Expected results: upgrade should work

Additional info:
[stack@instack ~]$ dib-run-parts /usr/libexec/os-refresh-config/configure.d
dib-run-parts Sun Apr 10 15:07:44 EDT 2016 Running /usr/libexec/os-refresh-config/configure.d/00-apply-selinux-policy
+ set -o pipefail
+ '[' -x /usr/sbin/semanage ']'
+ semodule -i /opt/stack/selinux-policy/ipxe.pp
semodule: SELinux policy is not managed or store cannot be accessed.

Same result with setenforce 0 and 1. Also tried to reboot the undercloud node, rerun the upgrade command several times, switch between poodles and puddles.
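For anyone decoding the rc=6 in the trace above: when puppet apply is run with --detailed-exitcodes, Puppet documents 0 as no changes, 2 as changes applied, 4 as failures, and 6 as both changes and failures. The sketch below assumes the configure script calls puppet apply with that flag (the trace suggests it through the check against 0 and 2, but does not show the invocation). The function names describe_puppet_rc and puppet_rc_ok are hypothetical and only illustrate the check.

```shell
#!/bin/sh
# Hypothetical helper: describe a `puppet apply --detailed-exitcodes` return code.
describe_puppet_rc() {
    case "$1" in
        0) echo "no changes" ;;
        2) echo "changes applied" ;;
        4) echo "some resources failed" ;;
        6) echo "changes applied, some resources failed" ;;
        *) echo "run failed" ;;
    esac
}

# Hypothetical helper mirroring the check seen in the trace:
# '[' rc '!=' 2 -a rc '!=' 0 ']' treats anything other than 0 or 2 as an error.
puppet_rc_ok() {
    [ "$1" -eq 0 ] || [ "$1" -eq 2 ]
}

# Example use (not executed here):
#   puppet_rc_ok "$rc" || echo "puppet apply: $(describe_puppet_rc "$rc")"
```

So an exit code of 6 means the catalog was partly applied and at least one resource failed, which is why the real cause has to be found earlier in the Puppet output rather than in the final traceback.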
What is the output of "sudo sestatus" on the undercloud?
My suspicion was that I missed the steps in https://bugzilla.redhat.com/show_bug.cgi?id=1312143, since the UC had SSL enabled. To have a clean experiment, I redeployed the entire stack and installed 7.3 again as follows:

Deployment command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ceph-storage-scale 1 --neutron-tunnel-types vxlan,gre --neutron-network-type vxlan,gre --neutron-network-vlan-ranges datacentre:118:143 --neutron-bridge-mappings datacentre:br-ex --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e ~/ssl-heat-templates/environments/enable-tls.yaml -e ~/ssl-heat-templates/environments/inject-trust-anchor.yaml

Populated the OC with 5 tenants/instances/volumes/etc.

Started a clean session, to avoid any remnants of overcloudrc being exported by the population script.

Commands:
sudo rhos-release -P 8-director
sudo yum update -y
sudo cp cacert.pem /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust extract
openstack undercloud upgrade

... this failed. I reran the command, as a potential workaround for another bug where this command fails on the first run, and it failed again:

Notice: /Stage[main]/Heat::Deps/Anchor[heat::service::end]: Dependency Keystone_user[admin] has failures: true
Warning: /Stage[main]/Heat::Deps/Anchor[heat::service::end]: Skipping because of failed dependencies
Notice: Finished catalog run in 193.55 seconds
+ rc=6
+ set -e
+ echo 'puppet apply exited with exit code 6'
puppet apply exited with exit code 6
+ '[' 6 '!=' 2 -a 6 '!=' 0 ']'
+ exit 6
[2016-04-12 02:27:46,784] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 6]

[2016-04-12 02:27:46,784] (os-refresh-config) [ERROR] Aborting...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 815, in install
    _run_orc(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 699, in _run_orc
    _run_live_command(args, instack_env, 'os-refresh-config')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 370, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: os-refresh-config failed. See log for details.
Command 'instack-install-undercloud' returned non-zero exit status 1

[stack@instack ~]$ dib-run-parts /usr/libexec/os-refresh-config/configure.d
dib-run-parts Tue Apr 12 02:44:37 EDT 2016 Running /usr/libexec/os-refresh-config/configure.d/00-apply-selinux-policy
+ set -o pipefail
+ '[' -x /usr/sbin/semanage ']'
+ semodule -i /opt/stack/selinux-policy/ipxe.pp
semodule: SELinux policy is not managed or store cannot be accessed.

[stack@instack ~]$ sudo sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 28

I am keeping the system available for further troubleshooting; please ping me for access.
I checked the environment and the br-ctlplane interface was missing the undercloud_public_vip address. Because the undercloud is SSL enabled, it was unable to reach the public APIs that use undercloud_public_vip. After I manually added the undercloud_public_vip to the br-ctlplane interface and reran the undercloud upgrade command, the upgrade completed successfully:

sudo ip addr add 192.0.2.2/24 dev br-ctlplane
openstack undercloud upgrade
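Before re-adding the address by hand, it can help to confirm that it is really missing. The sketch below is a hypothetical helper (has_ipv4_addr is not part of any tool mentioned in this report) that reads the output of "ip -o -4 addr show dev IFACE" on stdin and succeeds only if the given IPv4 address is configured. The address 192.0.2.2 and the interface br-ctlplane are the values from this environment; other deployments may differ.

```shell
#!/bin/sh
# Hypothetical helper: succeed if IPv4 address $1 appears as an "inet" entry
# in `ip addr` output read from stdin (prefix length is ignored).
has_ipv4_addr() {
    awk -v want="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), parts, "/")   # "192.0.2.2/32" -> "192.0.2.2"
                    if (parts[1] == want) found = 1
                }
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example use (not executed here; values taken from this bug report):
#   ip -o -4 addr show dev br-ctlplane | has_ipv4_addr 192.0.2.2 \
#       || sudo ip addr add 192.0.2.2/24 dev br-ctlplane
```

Only the field right after "inet" is compared, so the broadcast address on the same line is not mistaken for a configured address.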
We just confirmed it: it looks like the undercloud upgrade, or the yum update beforehand, removes the UC VIP address. Maybe the BZ should be renamed... Also, if this is the case, why does the UC upgrade work on most setups?
Indeed, we can see that the br-ctlplane interface is missing the VIPs on all environments, but the undercloud upgrade fails only on SSL deployments, since these use a VIP for the OS_AUTH_URL. The VIPs are set by keepalived on the br-ctlplane interface, and on my upgraded undercloud keepalived was reporting FAULT state for the VIP instances after the upgrade:

Keepalived_vrrp[1256]: Kernel is reporting: interface br-ctlplane DOWN
Keepalived_vrrp[1256]: VRRP_Instance(51) Entering FAULT STATE
Keepalived_vrrp[1256]: VRRP_Instance(51) removing protocol VIPs.
Keepalived_vrrp[1256]: Netlink: error: No such device, type=(21), seq=1460464571, pid=0
Keepalived_vrrp[1256]: VRRP_Instance(51) Now in FAULT state
Keepalived_vrrp[1256]: Kernel is reporting: interface br-ctlplane DOWN
Keepalived_vrrp[1256]: VRRP_Instance(52) Entering FAULT STATE
Keepalived_vrrp[1256]: VRRP_Instance(52) removing protocol VIPs.
Keepalived_vrrp[1256]: Netlink: error: No such device, type=(21), seq=1460464572, pid=0
Keepalived_vrrp[1256]: VRRP_Instance(52) Now in FAULT state

After doing 'systemctl restart keepalived' the VIPs were set on the br-ctlplane interface:

stack@instack:~>>> ip a s dev br-ctlplane
16: br-ctlplane: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:c2:0a:86:bf:66 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.1/24 brd 192.0.2.255 scope global br-ctlplane
       valid_lft forever preferred_lft forever
    inet 192.0.2.3/32 scope global br-ctlplane
       valid_lft forever preferred_lft forever
    inet 192.0.2.2/32 scope global br-ctlplane
       valid_lft forever preferred_lft forever
    inet6 fe80::2c2:aff:fe86:bf66/64 scope link
       valid_lft forever preferred_lft forever
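To spot this condition quickly on other environments, the keepalived log can be scanned for instances that entered FAULT state. The sketch below is a hypothetical helper (vrrp_fault_instances is an illustrative name) that reads log lines on stdin and prints the IDs of VRRP instances with an "Entering FAULT STATE" message. It assumes the message format shown in the log above and, in the usage comment, that keepalived logs to the systemd journal.

```shell
#!/bin/sh
# Hypothetical helper: print the unique IDs of VRRP instances that logged
# "Entering FAULT STATE" in keepalived log lines read from stdin.
vrrp_fault_instances() {
    sed -n 's/.*VRRP_Instance(\([^)]*\)) Entering FAULT STATE.*/\1/p' | sort -u
}

# Example use (not executed here; assumes keepalived logs to the journal):
#   faulted=$(journalctl -u keepalived --no-pager | vrrp_fault_instances)
#   [ -z "$faulted" ] || sudo systemctl restart keepalived
```

This only reports that an instance entered FAULT state at some point in the supplied log. It does not check whether the instance recovered later, so the interface addresses should still be verified after a restart.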
The first failure occurred due to missing steps for an undercloud with SSL.

Note: if the undercloud has SSL enabled, run:
- sudo cp cacert.pem /etc/pki/ca-trust/source/anchors/
- sudo update-ca-trust extract

The second part of this bug is, I believe (need to check on that), handled in: https://bugzilla.redhat.com/show_bug.cgi?id=1326644

Switching QA Contact to Dan Yasny to verify it.
The fix for https://bugzilla.redhat.com/show_bug.cgi?id=1326644 will fix this issue as well
Verified with instack-undercloud-2.2.7-6.el7ost.noarch. The upgrade from 7.3 GA to 8.0 worked: https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/E2E/view/rhos7-upgrade-on-BM/job/BM_rhos18_Upgrade_7.3_to_8.0_UCSSL_OCSSL/7/consoleFull
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1229