Bug 1632461
| Summary: | [OSP14] Overcloud stack update failed : "UPDATE aborted (Task update from TemplateResource "ControllerServiceChain" time out | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Artem Hrechanychenko <ahrechan> |
| Component: | openstack-tripleo-heat-templates | Assignee: | James Slagle <jslagle> |
| Status: | CLOSED DUPLICATE | QA Contact: | Gurenko Alex <agurenko> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 14.0 (Rocky) | CC: | ahrechan, athomas, jslagle, m.andre, mburns, ohochman, sbaubeau, scollier |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Last Closed: | 2018-10-16 21:22:12 UTC | Type: | Bug |
I also hit this issue when trying to configure fencing for the overcloud nodes:
2018-09-25 09:52:04Z [ControllerServiceChain]: UPDATE_FAILED UPDATE aborted (Task update from TemplateResource "ControllerServiceChain" [98d77a5e-5ac8-4526-be5e-295187ee647c] Stack "overcloud" [7ec2d8f7-d3d8-442e-b0b6-55159753f7d7] Timed out)
2018-09-25 09:52:04Z [overcloud-ControllerServiceChain-fydg2oycsm3d]: UPDATE_FAILED Stack UPDATE cancelled
2018-09-25 09:52:04Z [overcloud]: UPDATE_FAILED Timed out
2018-09-25 09:52:04Z [overcloud-ControllerServiceChain-fydg2oycsm3d-ServiceChain-6efl5grn3t7t]: UPDATE_FAILED Stack UPDATE cancelled
2018-09-25 09:52:06Z [overcloud-ControllerServiceChain-fydg2oycsm3d-ServiceChain-6efl5grn3t7t.1]: UPDATE_FAILED resources[1]: Stack UPDATE cancelled
2018-09-25 09:52:06Z [overcloud-ControllerServiceChain-fydg2oycsm3d-ServiceChain-6efl5grn3t7t]: UPDATE_FAILED Resource UPDATE failed: resources[1]: Stack UPDATE cancelled
Stack overcloud/7ec2d8f7-d3d8-442e-b0b6-55159753f7d7 UPDATE_FAILED
overcloud.ControllerServiceChain:
  resource_type: OS::TripleO::Services
  physical_resource_id: 98d77a5e-5ac8-4526-be5e-295187ee647c
  status: UPDATE_FAILED
  status_reason: |
    UPDATE aborted (Task update from TemplateResource "ControllerServiceChain" [98d77a5e-5ac8-4526-be5e-295187ee647c] Stack "overcloud" [7ec2d8f7-d3d8-442e-b0b6-55159753f7d7] Timed out)
Heat Stack update failed.
Heat Stack update failed.
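When triaging output like the above, it can help to pull only the UPDATE_FAILED events out of the deployment log and order them by timestamp, so the first resource to fail is at the top. The following is a minimal sketch, not part of the original report: `failed_events` is a hypothetical helper name, and it assumes the file passed to it (for example the one given to `--log-file`) contains event lines in the same format as shown here.

```shell
#!/bin/bash
# Hypothetical helper (not from the bug report): print UPDATE_FAILED event
# lines from a deployment log, oldest first.
# Assumption: event lines look like
#   2018-09-25 09:52:04Z [resource]: UPDATE_FAILED reason
failed_events() {
    local logfile=$1
    # Keep only timestamped event lines, then sort on the date and time
    # fields; -s keeps the original order for events with equal timestamps.
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]+Z \[.*\]: UPDATE_FAILED' "$logfile" \
        | sort -s -k1,2
}

# Example usage (file name taken from the deploy script's --log-file option):
#   failed_events overcloud_deployment_14.log
```

Summary lines without a timestamp, such as "Stack overcloud/... UPDATE_FAILED", are skipped on purpose, since they do not identify which resource failed first.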
(undercloud) [stack@undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /home/stack/virt/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/enable-tls.yaml \
-e /home/stack/virt/inject-trust-anchor.yaml \
-e /home/stack/virt/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/extra_templates.yaml \
-e /home/stack/virt/docker-images.yaml \
-e /home/stack/fencing.yaml \
--log-file overcloud_deployment_14.log
(undercloud) [stack@undercloud-0 ~]$ cat fencing.yaml
parameter_defaults:
  EnableFencing: true
  FencingConfig:
    devices:
    - agent: fence_ipmilan
      host_mac: 52:54:00:f1:1b:9c
      params:
        ipaddr: 172.16.0.1
        ipport: '6234'
        lanplus: true
        login: admin
        passwd: password
        pcmk_host_list: compute-0
        privlvl: administrator
    - agent: fence_ipmilan
      host_mac: 52:54:00:4d:07:a9
      params:
        ipaddr: 172.16.0.1
        ipport: '6233'
        lanplus: true
        login: admin
        passwd: password
        pcmk_host_list: controller-2
        privlvl: administrator
    - agent: fence_ipmilan
      host_mac: 52:54:00:c9:c5:6b
      params:
        ipaddr: 172.16.0.1
        ipport: '6232'
        lanplus: true
        login: admin
        passwd: password
        pcmk_host_list: controller-1
        privlvl: administrator
    - agent: fence_ipmilan
      host_mac: 52:54:00:3f:f2:81
      params:
        ipaddr: 172.16.0.1
        ipport: '6230'
        lanplus: true
        login: admin
        passwd: password
        pcmk_host_list: controller-0
        privlvl: administrator
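All four devices above share one `ipaddr` and are told apart only by `ipport` and `pcmk_host_list`, so a quick sanity check before deploying is to confirm that neither key repeats a value. This is a minimal sketch under that assumption; `duplicate_values` is a hypothetical helper, and nothing in this report indicates that duplicate entries caused the timeout.

```shell
#!/bin/bash
# Hypothetical helper (not from the bug report): print every value of a
# given key that occurs more than once in a fencing environment file.
# It matches "key: value" lines textually and does not parse YAML, so it
# only suits simple files like the fencing.yaml shown above.
duplicate_values() {
    local key=$1 file=$2
    # awk splits on whitespace, so indentation does not matter.
    awk -v k="${key}:" '$1 == k { print $2 }' "$file" | sort | uniq -d
}

# Example usage; empty output means every value is unique:
#   duplicate_values ipport fencing.yaml
#   duplicate_values pcmk_host_list fencing.yaml
```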
Please provide Heat logs from the undercloud.

I'm thinking this is probably a symptom of the same cause as https://bugzilla.redhat.com/show_bug.cgi?id=1629062

(In reply to James Slagle from comment #6)
> please provide Heat logs from the undercloud

sosreport: http://rhos-release.virt.bos.redhat.com/log/bz1632461

Based on the error and the data we have, I'm marking this one a duplicate of bug 1629062. If you feel that is incorrect, and you are still able to reproduce the issue after increasing undercloud resources, please reopen it with that additional data.

*** This bug has been marked as a duplicate of bug 1629062 ***
Description of problem:
(undercloud) [stack@undercloud-0 ~]$ openstack stack failures list overcloud
overcloud.ControllerServiceChain:
  resource_type: OS::TripleO::Services
  physical_resource_id: d57f3fcf-ef85-4b04-8839-4f4712af5d80
  status: UPDATE_FAILED
  status_reason: |
    UPDATE aborted (Task update from TemplateResource "ControllerServiceChain" [d57f3fcf-ef85-4b04-8839-4f4712af5d80] Stack "overcloud" [4199f63f-fa82-4814-89c4-e7f0a92298c5] Timed out)

Controller replacement failed after executing overcloud deploy command with replace_controller.yaml
(undercloud) [stack@undercloud-0 ~]$ cat overcloud_replace.sh
#!/bin/bash
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /home/stack/virt/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/enable-tls.yaml \
-e /home/stack/virt/inject-trust-anchor.yaml \
-e /home/stack/virt/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/extra_templates.yaml \
-e /home/stack/virt/docker-images.yaml \
-e /home/stack/remove-controller.yaml \
--log-file overcloud_deployment_14.log
(undercloud) [stack@undercloud-0 ~]$ cat remove-controller.yaml
parameters:
  ControllerRemovalPolicies:
    [{'resource_list': ['1']}]
parameter_defaults:
  CorosyncSettleTries: 5

step 11.4.3. Node Replacement from https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/director_installation_and_usage/#sect-Replacing_Controller_Nodes

Version-Release number of selected component (if applicable):
OSP14 puddle - 2018-09-06.1
openstack-tripleo-heat-templates-9.0.0-0.20180831204457.17bb71e.0rc1.el7ost.noarch
openstack-tripleo-validations-9.3.1-0.20180831205305.fbfd253.el7ost.noarch
python2-tripleo-common-9.3.1-0.20180831204016.bb0582a.el7ost.noarch
python-tripleoclient-heat-installer-10.5.1-0.20180901082351.6d7aa74.el7ost.noarch
openstack-tripleo-image-elements-9.0.0-0.20180831210308.2dc678a.el7ost.noarch
ansible-role-tripleo-modify-image-1.0.1-0.20180903052248.40521ee.el7ost.noarch
openstack-tripleo-heat-templates-9.0.0-0.20180831204457.17bb71e.0rc1.el7ost.noarch
ansible-tripleo-ipsec-9.0.1-0.20180827143021.d2b9234.el7ost.noarch
puppet-tripleo-9.3.1-0.20180831202649.8ec6c86.el7ost.noarch
openstack-tripleo-common-9.3.1-0.20180831204016.bb0582a.el7ost.noarch
python-tripleoclient-10.5.1-0.20180901082351.6d7aa74.el7ost.noarch
openstack-tripleo-puppet-elements-9.0.0-0.20180831205939.0641fdc.el7ost.noarch
openstack-tripleo-common-containers-9.3.1-0.20180831204016.bb0582a.el7ost.noarch

How reproducible:

Steps to Reproduce:
1. Deploy OSP14 overcloud with 3 controllers
2. Configure fencing
3. Corrupt controller node (corrupt disk)
4. Check that overcloud is operable
5. Try to replace controller using official documentation

Actual results:
UPDATE aborted (Task update from TemplateResource "ControllerServiceChain" [d57f3fcf-ef85-4b04-8839-4f4712af5d80] Stack "overcloud" [4199f63f-fa82-4814-89c4-e7f0a92298c5] Timed out)

Expected results:
Replacement failed on Controller.deployment stage

Additional info: