Description of problem:
OSP14 - The overcloud update failed with the following error:

UPDATE_FAILED Error in ControllerServiceChain output role_data: Collection length exceeds 1000 elements

This environment is rather small, with 1 controller and 3 HCI nodes.
This might be a red herring, but I was attempting to add octavia and sahara to my current deployment.

Version-Release number of selected component (if applicable):
OSP14

How reproducible:

Steps to Reproduce:
1. Deploy OSP14 and have it running for a few weeks
2. Attempt to update with new additions (octavia and sahara in this case)
3.

Actual results:
Failed with the error above

Expected results:
Succeeds with new functionality enabled

Additional info:
Logs are too big to attach to this BZ, so I am making them available for download here:
http://chrisj.cloud/undercloud-logs.tar.gz
I have tried redeploying with just:

-e /usr/share/openstack-tripleo-heat-templates/environments/services/sahara.yaml

and it completed successfully, but as soon as I add

-e /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml

my deployment fails.

My deployment script is as follows (without octavia):

(undercloud) [stack@undercloud-osp14 ~]$ cat deploy.sh
#!/bin/bash
source ~/stackrc
cd ~/
time openstack overcloud deploy --templates --stack chrisj-osp14 \
  -r /home/stack/templates/roles_data.yaml \
  -n /home/stack/templates/network_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-dns.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ml2-ansible.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/sahara.yaml \
  -e /home/stack/templates/network-environment.yaml \
  -e /home/stack/templates/ceph-custom-config.yaml \
  -e /home/stack/templates/enable-tls.yaml \
  -e /home/stack/templates/enable-ldap.yaml \
  -e /home/stack/templates/ExtraConfig.yaml \
  -e /home/stack/templates/inject-trust-anchor.yaml \
  -e /home/stack/templates/inject-trust-anchor-hiera.yaml \
  -e /home/stack/templates/containers-prepare-parameter.yaml
I was able to work around this by bumping up:

[yaql]
limit_iterators=2000

in /var/lib/config-data/puppet-generated/heat/etc/heat/heat.conf and restarting heat.

It looks like we have changed this default before:
https://bugzilla.redhat.com/show_bug.cgi?id=1395740

It looks like we might need a higher value in OSP14.
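For anyone applying the same workaround by hand, the edit to heat.conf could be scripted roughly as below. This is only a sketch: set_yaql_limit is a hypothetical helper (not part of TripleO or Heat), it assumes GNU sed, and it assumes the file is a plain INI file where the option, if present, is written uncommented at the start of a line.

```shell
# Hypothetical helper: set limit_iterators in the [yaql] section of an
# INI-style file such as heat.conf. Assumes GNU sed (for -i and one-line "a").
set_yaql_limit() {
    local conf="$1"
    local limit="$2"
    if grep -q '^\[yaql\]' "$conf"; then
        if sed -n '/^\[yaql\]/,/^\[/p' "$conf" | grep -q '^limit_iterators'; then
            # Option already set inside [yaql]: rewrite its value in place.
            sed -i '/^\[yaql\]/,/^\[/ s/^limit_iterators[[:space:]]*=.*/limit_iterators='"$limit"'/' "$conf"
        else
            # Section exists but the option does not: add it under the header.
            sed -i '/^\[yaql\]/a limit_iterators='"$limit" "$conf"
        fi
    else
        # No [yaql] section at all: append one at the end of the file.
        printf '\n[yaql]\nlimit_iterators=%s\n' "$limit" >> "$conf"
    fi
}

# Example call (path taken from the comment above):
# set_yaql_limit /var/lib/config-data/puppet-generated/heat/etc/heat/heat.conf 2000
```

Heat still has to be restarted afterwards for the new value to take effect, and a manual edit of a puppet-generated file may be overwritten the next time the undercloud configuration is regenerated.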
VERIFIED

openstack-tripleo-heat-templates-9.3.1-0.20190314162753.d0a6cb1.el7ost.noarch
2019-04-05.1

(overcloud) [stack@undercloud-0 ~]$ openstack endpoint list |grep octavia
| 605a0c45ab8940bab03a62dc8cfd61a8 | regionOne | octavia | load-balancer | True | internal | http://172.17.1.11:9876 |
| c0ebe9f0aeb945a6b6e130c14bc4c90c | regionOne | octavia | load-balancer | True | public | https://10.0.0.101:13876 |
| f3d4e31745b24f8a92baa06ceca96048 | regionOne | octavia | load-balancer | True | admin | http://172.17.1.11:9876 |

(overcloud) [stack@undercloud-0 ~]$ openstack endpoint list |grep sahara
| 3c49c8229d134ac9ac1f99128bc56a2d | regionOne | sahara | data-processing | True | admin | http://172.17.1.11:8386/v1.1/%(tenant_id)s |
| 5c3dcfb5c52c4cd88299668b1427e6f8 | regionOne | sahara | data-processing | True | public | https://10.0.0.101:13386/v1.1/%(tenant_id)s |
| ca3fc26c0f7742af9d962d7e1b4477b1 | regionOne | sahara | data-processing | True | internal | http://172.17.1.11:8386/v1.1/%(tenant_id)s |

(overcloud) [stack@undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash
openstack overcloud deploy \
  --timeout 100 \
  --templates /usr/share/openstack-tripleo-heat-templates \
  --stack overcloud \
  --libvirt-type kvm \
  --ntp-server clock.redhat.com \
  -e /home/stack/virt/internal.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /home/stack/virt/network/network-environment.yaml \
  -e /home/stack/virt/enable-tls.yaml \
  -e /home/stack/virt/inject-trust-anchor.yaml \
  -e /home/stack/virt/public_vip.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
  -e /home/stack/virt/hostnames.yml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /home/stack/virt/nodes_data.yaml \
  -e ~/containers-prepare-parameter.yaml \
  -e /home/stack/virt/extra_templates.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/sahara.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \
  --log-file overcloud_deployment_94.log
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0878