Modification of OS::TripleO::NodeUserData causes the stack update to fail with "Failed to detach interface".

Steps taken:
- Installed the stack.
- Updated once with a no-op change.
- Modified firstboot-config.yaml.

cat firstboot-environment.yaml
~~~
resource_registry:
  OS::TripleO::NodeUserData: ./firstboot-config.yaml
~~~

Making a change to firstboot-config.yaml:
~~~
[stack@va-ps-undercloud ~]$ cp customer/cloud/templates/firstboot-config.yaml .
[stack@va-ps-undercloud ~]$ vim customer/cloud/templates/firstboot-config.yaml
[stack@va-ps-undercloud ~]$ diff customer/cloud/templates/firstboot-config.yaml firstboot-config.yaml
19c19
< echo "test" > /tmp/test
---
>
~~~

This causes the following failure, reproduced three times in a row from a fresh stack after the change to firstboot-config.yaml:
~~~
2016-10-14 23:57:25 [UserData]: CREATE_COMPLETE state changed
2016-10-14 23:57:25 [Controller]: UPDATE_IN_PROGRESS state changed
2016-10-14 23:57:25 [0]: UPDATE_FAILED resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (3de3a0aa-a8ad-4094-a8f9-675327e3238a) from server (bac0fee6-1bd3-411f-a382-aafd376cc660)
2016-10-14 23:57:25 [ps-Compute-35754rfd75fd-0-2wssq7lqgloo]: UPDATE_FAILED InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (3de3a0aa-a8ad-4094-a8f9-675327e3238a) from server (bac0fee6-1bd3-411f-a382-aafd376cc660)
2016-10-14 23:57:26 [ps-Compute-35754rfd75fd]: UPDATE_FAILED resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (3de3a0aa-a8ad-4094-a8f9-675327e3238a) from server (bac0fee6-1bd3-411f-a382-aafd376cc660)
2016-10-14 23:57:27 [Compute]: UPDATE_FAILED resources.Compute: resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (3de3a0aa-a8ad-4094-a8f9-675327e3238a) from server (bac0fee6-1bd3-411f-a382-aafd376cc660)
2016-10-14 23:57:31 [Controller]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (1850b140-aa43-4d5b-b66a-6bdee0af21d4) from server (d6bc7ee1-7eb7-4829-b0e6-04af7e6dbd97)
2016-10-14 23:57:31 [Controller]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (53d3cace-1a4d-4e2e-a6f6-2637e1d5f3e4) from server (c5fa5134-0b21-4ebb-83f6-c5deb141487b)
2016-10-14 23:57:32 [ps-Controller-c63telkbp5ji-1-tahvviu4l5fp]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (1850b140-aa43-4d5b-b66a-6bdee0af21d4) from server (d6bc7ee1-7eb7-4829-b0e6-04af7e6dbd97)
2016-10-14 23:57:32 [Controller]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (d8b188cf-93f2-4857-9621-bc3f9b385ac0) from server (d584a808-9cf6-4bb0-9d04-7da0392ad702)
2016-10-14 23:57:32 [ps-Controller-c63telkbp5ji-2-epxznfgwzaea]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (53d3cace-1a4d-4e2e-a6f6-2637e1d5f3e4) from server (c5fa5134-0b21-4ebb-83f6-c5deb141487b)
2016-10-14 23:57:33 [1]: UPDATE_FAILED resources[1]: InterfaceDetachFailed: resources.Controller: Failed to detach interface (1850b140-aa43-4d5b-b66a-6bdee0af21d4) from server (d6bc7ee1-7eb7-4829-b0e6-04af7e6dbd97)
2016-10-14 23:57:33 [ps-Controller-c63telkbp5ji-0-dikg27uyth5r]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (d8b188cf-93f2-4857-9621-bc3f9b385ac0) from server (d584a808-9cf6-4bb0-9d04-7da0392ad702)
2016-10-14 23:57:34 [0]: UPDATE_FAILED resources[0]: InterfaceDetachFailed: resources.Controller: Failed to detach interface (d8b188cf-93f2-4857-9621-bc3f9b385ac0) from server (d584a808-9cf6-4bb0-9d04-7da0392ad702)

Stack ps UPDATE_FAILED
[stack@va-ps-undercloud ~]$
~~~

Everything needed for bug analysis is attached (database dump and deployment output): nova-interface-replacement.tar.gz
I can easily reproduce this in a lab:

1. Deploy a normal stack:
~~~
openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e ${template_base_dir}/network-environment.yaml \
  --control-flavor control --compute-flavor compute --ceph-storage-flavor ceph-storage \
  --control-scale $control_scale --compute-scale $compute_scale --ceph-storage-scale $ceph_scale \
  --ntp-server $ntpserver \
  --neutron-network-type vxlan --neutron-tunnel-types vxlan
~~~

2. Include a modification to OS::TripleO::NodeUserData:
~~~
openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e ${template_base_dir}/network-environment.yaml \
  -e ${template_base_dir}/firstboot-environment.yaml \
  --control-flavor control --compute-flavor compute --ceph-storage-flavor ceph-storage \
  --control-scale $control_scale --compute-scale $compute_scale --ceph-storage-scale $ceph_scale \
  --ntp-server $ntpserver \
  --neutron-network-type vxlan --neutron-tunnel-types vxlan
~~~

~~~
[stack@undercloud-6 ~]$ cat templates/firstboot-
firstboot-config.yaml       firstboot-environment.yaml
[stack@undercloud-6 ~]$ cat templates/firstboot-config.yaml
heat_template_version: 2014-10-16

parameters:

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: repo_config}

  repo_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        sleep 10
        echo "noop"

outputs:
  OS::stack_id:
    value: {get_resource: userdata}
~~~

3. Kick off the update:
~~~
[stack@undercloud-6 ~]$ templates/deploy.sh
control_scale=3, compute_scale=1, ceph_scale=0
1 nodes with profile compute won't be used for deployment now
Configuration has 1 warnings, fix them before proceeding.
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates 2016-10-17 23:11:32 [overcloud]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:11:46 [overcloud-Networks-w5jwjlbjjlr3]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:11:48 [StorageNetwork]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:11:50 [overcloud-Networks-w5jwjlbjjlr3-StorageNetwork-nc2klovkxpww]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:11:51 [InternalNetwork]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:11:51 [overcloud-Networks-w5jwjlbjjlr3-TenantNetwork-gixvfbxykmuy]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:11:53 [ManagementNetwork]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:11:53 [overcloud-Networks-w5jwjlbjjlr3-InternalNetwork-gkil7uofefnk]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:11:55 [overcloud-VipConfig-xdcr4nbrewjx]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:11:56 [ExternalNetwork]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:11:56 [overcloud-Networks-w5jwjlbjjlr3-ManagementNetwork-aud4ruyessgs]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:11:58 [StorageMgmtNetwork]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:11:58 [overcloud-Networks-w5jwjlbjjlr3-StorageNetwork-nc2klovkxpww]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:11:58 [overcloud-Networks-w5jwjlbjjlr3-ExternalNetwork-dhfo5acolnas]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:00 [overcloud-Networks-w5jwjlbjjlr3-StorageMgmtNetwork-iivpmaywa4nt]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:03 [StorageNetwork]: UPDATE_COMPLETE state changed 2016-10-17 23:12:03 [overcloud-Networks-w5jwjlbjjlr3-TenantNetwork-gixvfbxykmuy]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:04 [TenantNetwork]: UPDATE_COMPLETE state changed 2016-10-17 23:12:04 [overcloud-Networks-w5jwjlbjjlr3-ManagementNetwork-aud4ruyessgs]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:04 [overcloud-Networks-w5jwjlbjjlr3-InternalNetwork-gkil7uofefnk]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:05 [overcloud-Networks-w5jwjlbjjlr3-ExternalNetwork-dhfo5acolnas]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:06 [ManagementNetwork]: UPDATE_COMPLETE state changed 2016-10-17 23:12:07 [InternalNetwork]: UPDATE_COMPLETE state changed 2016-10-17 23:12:07 [ExternalNetwork]: UPDATE_COMPLETE state changed 2016-10-17 23:12:09 [overcloud-Networks-w5jwjlbjjlr3-StorageMgmtNetwork-iivpmaywa4nt]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:10 [StorageMgmtNetwork]: UPDATE_COMPLETE state changed 2016-10-17 23:12:11 [overcloud-Networks-w5jwjlbjjlr3]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:13 [Networks]: UPDATE_COMPLETE state changed 2016-10-17 23:12:13 [ObjectStorage]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:13 [overcloud-ObjectStorage-qit5nnxujhwm]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:14 [RedisVirtualIP]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:15 [InternalApiVirtualIP]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:15 [overcloud-RedisVirtualIP-mki5ej4m4eox]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:15 [overcloud-ObjectStorage-qit5nnxujhwm]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:16 [CephStorage]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:16 [overcloud-InternalApiVirtualIP-suwmludfqkwx]: UPDATE_IN_PROGRESS 
Stack UPDATE started 2016-10-17 23:12:20 [StorageVirtualIP]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:20 [overcloud-CephStorage-k6536r4kfe4r]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:22 [overcloud-StorageVirtualIP-czoyodfcnwvx]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:23 [overcloud-RedisVirtualIP-mki5ej4m4eox]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:23 [overcloud-PublicVirtualIP-rxgoa3bkzi4m]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:26 [overcloud-InternalApiVirtualIP-suwmludfqkwx]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:26 [overcloud-StorageMgmtVirtualIP-j65perdeasjp]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:30 [overcloud-StorageVirtualIP-czoyodfcnwvx]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:30 [overcloud-PublicVirtualIP-rxgoa3bkzi4m]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:32 [overcloud-StorageMgmtVirtualIP-j65perdeasjp]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:35 [overcloud-VipMap-76tyekfr6hke]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:36 [overcloud-VipMap-76tyekfr6hke]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:37 [VipMap]: UPDATE_COMPLETE state changed 2016-10-17 23:12:37 [EndpointMap]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:38 [overcloud-EndpointMap-wy54rkxej5qs]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:39 [overcloud-EndpointMap-wy54rkxej5qs]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:41 [EndpointMap]: UPDATE_COMPLETE state changed 2016-10-17 23:12:42 [Compute]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:45 [Controller]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:45 [overcloud-Compute-g23ja4zamk66]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:46 [0]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:52 [overcloud-Controller-fj2tcifo56dt]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:54 [overcloud-Compute-g23ja4zamk66-0-bi6ku6dsvk63]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:55 [overcloud-BlockStorage-zlco5cgbgj6r]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:12:56 [0]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:57 [overcloud-BlockStorage-zlco5cgbgj6r]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:12:58 [BlockStorage]: UPDATE_COMPLETE state changed 2016-10-17 23:12:59 [2]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:12:59 [overcloud-Controller-fj2tcifo56dt-0-2f345ha7zlez]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:13:02 [1]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:02 [overcloud-Controller-fj2tcifo56dt-2-6fommwyevitv]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:13:05 [overcloud-Controller-fj2tcifo56dt-1-5qz4xqn576qp]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:13:05 [UpdateConfig]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:06 [NodeUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:07 [NodeAdminUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:12 [UpdateConfig]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:12 [NodeUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:13 [NodeAdminUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:14 [NodeAdminUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:14 [UpdateConfig]: UPDATE_COMPLETE state changed 2016-10-17 23:13:15 
[NodeUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:16 [UpdateConfig]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:18 [UpdateConfig]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:22 [NodeAdminUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:22 [UpdateConfig]: UPDATE_COMPLETE state changed 2016-10-17 23:13:22 [NodeAdminUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:23 [NodeUserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:26 [NodeAdminUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:29 [NodeAdminUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:31 [UpdateConfig]: UPDATE_COMPLETE state changed 2016-10-17 23:13:32 [UpdateConfig]: UPDATE_COMPLETE state changed 2016-10-17 23:13:34 [NodeAdminUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:34 [NodeUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:35 [UserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:38 [UserData]: CREATE_IN_PROGRESS state changed 2016-10-17 23:13:39 [UserData]: CREATE_COMPLETE state changed 2016-10-17 23:13:40 [NodeUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:40 [NovaCompute]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:41 [NodeUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:41 [UserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:42 [UserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:43 [UserData]: CREATE_IN_PROGRESS state changed 2016-10-17 23:13:43 [UserData]: CREATE_IN_PROGRESS state changed 2016-10-17 23:13:45 [UserData]: CREATE_COMPLETE state changed 2016-10-17 23:13:45 [UserData]: CREATE_COMPLETE state changed 2016-10-17 23:13:46 [NodeUserData]: UPDATE_COMPLETE state changed 2016-10-17 23:13:46 [Controller]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:46 [Controller]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:47 [UserData]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:13:47 [UserData]: CREATE_IN_PROGRESS state changed 2016-10-17 23:13:49 [UserData]: CREATE_COMPLETE state changed 2016-10-17 23:13:50 [NovaCompute]: UPDATE_FAILED InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (e761c030-8103-419d-b456-d986c3490205) from server (e33c674e-b9b5-4a7d-9276-c5530b51b411) 2016-10-17 23:13:51 [overcloud-Compute-g23ja4zamk66-0-bi6ku6dsvk63]: UPDATE_FAILED InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (e761c030-8103-419d-b456-d986c3490205) from server (e33c674e-b9b5-4a7d-9276-c5530b51b411) 2016-10-17 23:13:52 [0]: UPDATE_FAILED resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (e761c030-8103-419d-b456-d986c3490205) from server (e33c674e-b9b5-4a7d-9276-c5530b51b411) 2016-10-17 23:13:53 [overcloud-Compute-g23ja4zamk66]: UPDATE_FAILED resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (e761c030-8103-419d-b456-d986c3490205) from server (e33c674e-b9b5-4a7d-9276-c5530b51b411) 2016-10-17 23:13:54 [Compute]: UPDATE_FAILED resources.Compute: resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (e761c030-8103-419d-b456-d986c3490205) from server (e33c674e-b9b5-4a7d-9276-c5530b51b411) 2016-10-17 23:13:56 [Controller]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (5234a8af-1e88-4c73-9313-e8740512ee9a) from server (f946f9ec-9750-42f8-b9c4-08d7ef917ca0) 2016-10-17 23:13:56 [Controller]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface 
(6e40122b-823c-4d50-96ab-86d83422a511) from server (97aae28a-7613-49c1-8eaf-b9c9eefcb713) 2016-10-17 23:13:57 [2]: UPDATE_FAILED resources[2]: InterfaceDetachFailed: resources.Controller: Failed to detach interface (6e40122b-823c-4d50-96ab-86d83422a511) from server (97aae28a-7613-49c1-8eaf-b9c9eefcb713) 2016-10-17 23:13:57 [overcloud-Controller-fj2tcifo56dt-0-2f345ha7zlez]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (5234a8af-1e88-4c73-9313-e8740512ee9a) from server (f946f9ec-9750-42f8-b9c4-08d7ef917ca0) 2016-10-17 23:13:57 [overcloud-Controller-fj2tcifo56dt-2-6fommwyevitv]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (6e40122b-823c-4d50-96ab-86d83422a511) from server (97aae28a-7613-49c1-8eaf-b9c9eefcb713) 2016-10-17 23:13:59 [0]: UPDATE_FAILED resources[0]: InterfaceDetachFailed: resources.Controller: Failed to detach interface (5234a8af-1e88-4c73-9313-e8740512ee9a) from server (f946f9ec-9750-42f8-b9c4-08d7ef917ca0) 2016-10-17 23:13:59 [Controller]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (25d90386-5bcf-455b-aea3-15887ed022a7) from server (e565db0a-0995-41c5-8bd6-cdcfeae24d88) 2016-10-17 23:14:00 [overcloud-Controller-fj2tcifo56dt-1-5qz4xqn576qp]: UPDATE_FAILED InterfaceDetachFailed: resources.Controller: Failed to detach interface (25d90386-5bcf-455b-aea3-15887ed022a7) from server (e565db0a-0995-41c5-8bd6-cdcfeae24d88) 2016-10-17 23:14:01 [1]: UPDATE_FAILED resources[1]: InterfaceDetachFailed: resources.Controller: Failed to detach interface (25d90386-5bcf-455b-aea3-15887ed022a7) from server (e565db0a-0995-41c5-8bd6-cdcfeae24d88) 2016-10-17 23:14:02 [overcloud-Controller-fj2tcifo56dt]: UPDATE_FAILED resources[2]: InterfaceDetachFailed: resources.Controller: Failed to detach interface (6e40122b-823c-4d50-96ab-86d83422a511) from server (97aae28a-7613-49c1-8eaf-b9c9eefcb713) 2016-10-17 23:14:03 [Controller]: UPDATE_FAILED resources.Controller: resources[2]: InterfaceDetachFailed: resources.Controller: Failed to detach interface (6e40122b-823c-4d50-96ab-86d83422a511) from server (97aae28a-7613-49c1-8eaf-b9c9eefcb713) Stack overcloud UPDATE_FAILED Deployment failed: Heat Stack update failed. ~~~
Note: So far, I cannot reproduce this with OSP 7.3 or OSP 8.
Note: In OSP 8, it seems that this is still trying to delete the nodes / or replace them, but something is able to stop this operation from happening: Output from OSP 8: ---------+---------------------------------------------------------------------------------------------------------------------------------------------------+ | resource_name | physical_resource_id | resource_type | resource_status | updated_time | stack_name | +-----------------------------------------------+-----------------------------------------------+---------------------------------------------------------------------------------+--------------------+---------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+ | ControllerAllNodesDeployment | cbdd7f4d-2414-4ea8-abe3-7195197772e5 | OS::Heat::StructuredDeployments | UPDATE_IN_PROGRESS | 2016-10-17T23:23:37 | overcloud | | 2 | 50ec3911-7113-4649-8534-12b1e483de5e | OS::Heat::StructuredDeployment | UPDATE_IN_PROGRESS | 2016-10-17T23:23:38 | overcloud-ControllerAllNodesDeployment-wm55pgnpmabm | | 1 | f831366a-b7cb-4c4d-8f4b-7494ad20d442 | OS::Heat::StructuredDeployment | UPDATE_IN_PROGRESS | 2016-10-17T23:23:40 | overcloud-ControllerAllNodesDeployment-wm55pgnpmabm | | 0 | e7f7a31e-46f6-44ae-b669-ca508f450362 | OS::Heat::StructuredDeployment | UPDATE_IN_PROGRESS | 2016-10-17T23:23:41 | overcloud-ControllerAllNodesDeployment-wm55pgnpmabm | +-----------------------------------------------+-----------------------------------------------+---------------------------------------------------------------------------------+--------------------+--- And output from openstack overcloud deploy shows that lots of SIGNAL_COMPLETE happens, which I guess is basically a noop to avoid a node update / replacement: ~~~ 2016-10-17 23:23:49 [ControllerCephDeployment]: UPDATE_COMPLETE state changed 2016-10-17 23:23:49 [overcloud-ObjectStorageAllNodesValidationDeployment-374nfb74inhp]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:23:49 [CephStorageAllNodesValidationDeployment]: UPDATE_COMPLETE state changed 2016-10-17 23:23:49 [ObjectStorageAllNodesValidationDeployment]: UPDATE_COMPLETE state changed 2016-10-17 23:23:49 [ControllerCephDeployment]: UPDATE_COMPLETE state changed 2016-10-17 23:24:15 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:24:42 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:24:43 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:24:45 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:24:48 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:24:57 [UpdateDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:24:58 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:25:00 [0]: SIGNAL_IN_PROGRESS Signal: deployment succeeded 2016-10-17 23:25:01 [0]: UPDATE_COMPLETE state changed 2016-10-17 23:25:01 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:25:02 [NetworkDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:25:03 [overcloud-ComputeAllNodesDeployment-lx4sw5e44cjv]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:25:03 [NovaComputeDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:25:04 [overcloud-ComputeAllNodesValidationDeployment-tq7ayi432hzo]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:25:06 [overcloud-ComputeAllNodesValidationDeployment-tq7ayi432hzo]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:25:08 [ComputeAllNodesValidationDeployment]: UPDATE_COMPLETE state changed 2016-10-17 23:25:17 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:25:17 [0]: 
SIGNAL_COMPLETE Unknown 2016-10-17 23:25:17 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:26:25 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:26:27 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:26:39 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:27:30 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:27:35 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:28:17 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:28:33 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:28:38 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:28:43 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:28:48 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:03 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:05 [UpdateDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:06 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:09 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:10 [UpdateDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:12 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:14 [1]: SIGNAL_IN_PROGRESS Signal: deployment succeeded 2016-10-17 23:29:15 [1]: UPDATE_COMPLETE state changed 2016-10-17 23:29:15 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:16 [1]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:17 [ControllerDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:20 [2]: SIGNAL_IN_PROGRESS Signal: deployment succeeded 2016-10-17 23:29:20 [2]: UPDATE_COMPLETE state changed 2016-10-17 23:29:22 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:23 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:24 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:25 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:25 [ControllerDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:26 [NetworkDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:29:26 [2]: SIGNAL_COMPLETE Unknown 2016-10-17 23:30:13 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:30:23 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:30:54 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:30:56 [UpdateDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:30:57 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:04 [0]: SIGNAL_IN_PROGRESS Signal: deployment succeeded 2016-10-17 23:31:05 [0]: UPDATE_COMPLETE state changed 2016-10-17 23:31:05 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:06 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:07 [ControllerDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:07 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:07 [overcloud-ControllerAllNodesDeployment-wm55pgnpmabm]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:31:08 [ControllerAllNodesDeployment]: UPDATE_COMPLETE state changed 2016-10-17 23:31:08 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:09 [ControllerAllNodesValidationDeployment]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:31:09 [overcloud-ControllerAllNodesValidationDeployment-cuoo7mhrmf4b]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:31:09 [NetworkDeployment]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:09 [0]: SIGNAL_COMPLETE Unknown 2016-10-17 23:31:11 [overcloud-ControllerAllNodesValidationDeployment-cuoo7mhrmf4b]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:31:13 [overcloud-AllNodesExtraConfig-vdfwb45yftaa]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:31:14 [overcloud-AllNodesExtraConfig-vdfwb45yftaa]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:31:15 [overcloud-ObjectStorageNodesPostDeployment-f3obu2om6j6n]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:31:16 [overcloud-ComputeNodesPostDeployment-5bxqyof3jcn5]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:31:17 [overcloud-ControllerNodesPostDeployment-6uto3nn3uqtx]: 
UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:31:19 [StorageArtifactsConfig]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:31:20 [overcloud-ObjectStorageNodesPostDeployment-f3obu2om6j6n-StorageArtifactsConfig-5pgr3q4x646i]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:31:22 [StorageArtifactsConfig]: UPDATE_COMPLETE state changed 2016-10-17 23:31:22 [StorageArtifactsDeploy]: UPDATE_IN_PROGRESS state changed 2016-10-17 23:31:22 [overcloud-ObjectStorageNodesPostDeployment-f3obu2om6j6n-StorageArtifactsConfig-5pgr3q4x646i]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:31:23 [overcloud-ObjectStorageNodesPostDeployment-f3obu2om6j6n-StorageArtifactsDeploy-ujdddlx6v3m2]: UPDATE_IN_PROGRESS Stack UPDATE started 2016-10-17 23:31:23 [overcloud-ObjectStorageNodesPostDeployment-f3obu2om6j6n-StorageArtifactsDeploy-ujdddlx6v3m2]: UPDATE_COMPLETE Stack UPDATE completed successfully 2016-10-17 23:31:25 [StorageArtifactsDeploy]: UPDATE_COMPLETE state changed ~~~
Got bitten by this today on OSP 9. Exact same output.
I hit this issue too. I think it is caused by the user_data_update_policy property, which was added to Heat's OS::Nova::Server resource in OSP 9 ("Add user_data_update_policy property to OS::Nova::Server", https://review.openstack.org/#/c/274149/). The default value of this property is "REPLACE", and the current OSP Director templates do not set it, so Heat tries to replace the overcloud nodes whenever the userdata changes. Once I added "user_data_update_policy" to the appropriate templates[1], the problem went away in my environment.

[1] /usr/share/openstack-tripleo-heat-templates/puppet/controller.yaml
    /usr/share/openstack-tripleo-heat-templates/puppet/compute.yaml

------------------------------------------------------------------
  type: OS::Nova::Server
  properties:
    image: {get_param: Image}
    image_update_policy: {get_param: ImageUpdatePolicy}
    flavor: {get_param: Flavor}
    key_name: {get_param: KeyName}
    networks:
      - network: ctlplane
    user_data_format: SOFTWARE_CONFIG
    user_data: {get_resource: UserData}
    name:
      str_replace:
        template: {get_param: Hostname}
        params: {get_param: HostnameMap}
    software_config_transport: {get_param: SoftwareConfigTransport}
    metadata: {get_param: ServerMetadata}
    scheduler_hints: {get_param: SchedulerHints}
+   user_data_update_policy: IGNORE
------------------------------------------------------------------

But modifying the shipped templates is not the recommended way, so please fix this properly.
The file /usr/lib/heat/undercloud_heat_plugins/server_update_allowed.py is intended to override the behaviour of OS::Nova::Server so that it never replaces the server for any property changes. As a first step, could you please confirm that this file exists and that something like the following is logged to /var/log/heat/heat-engine.log:

187821:2016-12-21 22:08:20.159 23049 WARNING heat.engine.environment [-] Changing OS::Nova::Server from <class 'heat.engine.resources.openstack.nova.server.Server'> to <class 'heat.engine.plugins.undercloud_heat_plugins.server_update_allowed.ServerUpdateAllowed'>
187920:2016-12-21 22:08:20.186 23049 INFO heat.engine.environment [-] Registered: [Plugin](User:False) OS::Nova::Server -> <class 'heat.engine.plugins.undercloud_heat_plugins.server_update_allowed.ServerUpdateAllowed'>
I think these changes should be backported to OSP9:

https://review.openstack.org/#/c/296578/
https://review.openstack.org/#/c/350778/

The method needs_replace_with_prop_diff is new to OSP9, so a modified version of the above would be needed for backporting to OSP8 or OSP7.
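For context, here is a minimal sketch of what this kind of override plugin looks like. It is an illustrative approximation built from the class name, module path, and method mentioned in the comments above, not necessarily the exact contents of the shipped server_update_allowed.py:

~~~
# Illustrative sketch, not the verbatim shipped plugin. It remaps
# OS::Nova::Server to a subclass that never asks for the server to be
# replaced when properties such as user_data change.
from heat.engine.resources.openstack.nova import server


class ServerUpdateAllowed(server.Server):
    """Prevent property changes from replacing an existing server."""

    def needs_replace_with_prop_diff(self, changed_properties_set,
                                     after_props, before_props):
        # Always report that an in-place update is sufficient.
        return False


def resource_mapping():
    # Heat calls this hook to learn which resource types the plugin overrides.
    return {'OS::Nova::Server': ServerUpdateAllowed}
~~~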
Hi Steve, I can confirm that the plugin has been registered:

2016-12-21 20:42:43.129 4459 WARNING heat.engine.environment [-] Changing OS::Nova::Server from <class 'heat.engine.resources.openstack.nova.server.Server'> to <class 'heat.engine.plugins.undercloud_heat_plugins.server_update_allowed.ServerUpdateAllowed'>
2016-12-21 20:42:43.157 4459 INFO heat.engine.environment [-] Registered: [Plugin](User:False) OS::Nova::Server -> <class 'heat.engine.plugins.undercloud_heat_plugins.server_update_allowed.ServerUpdateAllowed'>

I modified server_update_allowed.py and restarted heat-engine, but still encountered the error:

2016-12-22 01:49:55 [0]: UPDATE_FAILED resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (f81f857b-b232-4b9e-a84a-1120220cb553) from server (8ea40b41-4651-4d79-8e3b-91f843ab34ed)

Best Regards,
Chen
Created attachment 1234589 [details] Modified server_update_allowed.py
I'm assigning to heat for now to assist with triage.
FYI: the following worked in my test environment:

* Step 1: Deploy the overcloud (initial deployment).
  RESULT: CREATE_COMPLETE

* Step 2: Re-deploy the overcloud with changed user_data.
  RESULT: UPDATE_FAILED
  UPDATE_FAILED resources.Compute: resources[0]: InterfaceDetachFailed: resources.NovaCompute: Failed to detach interface (8f9d37e6-7f98-4219-826a-9aafed8fee25) from server (fabdb31e-862d-4e61-8772-0a24a499f7c0)

* Step 3: Manually set the FAILED resources to COMPLETE in the Heat database (see the sketch after this comment for one way to list the affected resources first):
  mysql heat
  update resource set status="COMPLETE" where status="FAILED";

* Step 4: Re-deploy with changed user_data plus the "user_data_update_policy: IGNORE" workaround.
  RESULT: UPDATE_COMPLETE
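If you want to see exactly which nested resources are stuck in a FAILED state before (and after) the Step 3 database fix-up, here is a minimal sketch using python-heatclient and keystoneauth1 on the undercloud. The auth URL, credentials, and stack name below are placeholders you would replace with the values from your stackrc:

~~~
# Sketch only: list overcloud resources still in a FAILED state, so you know
# what the "update resource set status=..." statement is about to touch.
# Assumes python-heatclient and keystoneauth1; auth values are placeholders.
from keystoneauth1 import loading, session
from heatclient import client as heat_client

loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(
    auth_url='http://192.0.2.1:5000/v2.0',   # undercloud Keystone (placeholder)
    username='admin',
    password='REPLACE_ME',
    project_name='admin')
sess = session.Session(auth=auth)
heat = heat_client.Client('1', session=sess)

# nested_depth follows the overcloud's nested stacks (Compute, Controller, ...)
for res in heat.resources.list('overcloud', nested_depth=5):
    if res.resource_status.endswith('FAILED'):
        print(res.resource_name, res.resource_type, res.resource_status)
~~~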
(In reply to Steve Baker from comment #11)
> I think these changes should be backported to OSP9:
>
> https://review.openstack.org/#/c/296578/

Yes, it looks like this is what we're missing. I believe the analysis in comment #7 is correct: because Heat added a supported way to ignore the user_data change through the template, the override in tripleo-common was no longer sufficient for templates that did not explicitly request to ignore changes.

> The method needs_replace_with_prop_diff is new to OSP9, so a modified
> version of the above would be needed for backporting to OSP8 or OSP7

I don't believe that's necessary, because the change that caused the regression (https://review.openstack.org/#/c/274149/) occurred in Mitaka.

> https://review.openstack.org/#/c/350778/

That's not directly related to the problem here, but it would prevent a similar thing happening when an image is updated (i.e. same image name but new UUID). This was caused by a patch to Heat, https://review.openstack.org/#/c/257904/ (https://bugs.launchpad.net/tripleo-common/+bug/1609020) - that was the Liberty backport of a Mitaka patch, but the problem was not corrected in tripleo-common until Newton: https://review.openstack.org/#/c/350778/ That change was never backported to Mitaka or Liberty upstream AFAICT. However, it *is* present in OSP 9 (see bug 1354627), so it needs only to be backported to OSP 8. I cloned it to OSP 8 as bug 1409851.
How does needing to use the --force-update due to https://bugzilla.redhat.com/show_bug.cgi?id=1371580 affect the OSP 8 to OSP 9 upgrade?
A couple of questions we need to understand:

1. For a fully deployed OSP 8 solution, what is the correct patch to upgrade to OSP 9?
2. For a new deployment of OSP 8, what is the correct patch?
1. For an OSP 9 undercloud (as you'd use to upgrade an OSP 8 overcloud to OSP 9), you want the patch in this bug, https://review.openstack.org/#/c/296578/ (the patch for bug 1409851, which you'll also want, is already present; in fact you'll get a conflict applying the patch directly from upstream because they're out of order).

2. For an OSP 8 undercloud, you just want the patch from bug 1409851, https://review.openstack.org/#/c/350778/
How does needing to use the --force-update due to the mentioned bug affect this patch?

(In reply to Randy Perryman from comment #18)
> How does needing to use the --force-update due to
> https://bugzilla.redhat.com/show_bug.cgi?id=1371580 affect the OSP 8 to OSP 9 upgrade?
(In reply to Randy Perryman from comment #18)
> How does needing to use the --force-update due to
> https://bugzilla.redhat.com/show_bug.cgi?id=1371580 affect the OSP 8 to OSP 9 upgrade?

You mean --force-postconfig? That bug appears to be completely unrelated to this one as far as I can tell.
I am trying to understand the interactions that will occur when this bug is fixed and that bug's change is implemented. Are you saying there are absolutely none?
(In reply to Randy Perryman from comment #24)
> I am trying to understand the interactions that will occur when this bug is
> fixed and that bug's change is implemented. Are you saying there are
> absolutely none?

Correct. AFAICT OS::TripleO::NodeUserData is only used to create an admin user on the nodes, and for operator-specific customisations. --force-postconfig presumably updates the PostConfig resources even if their inputs haven't changed, but that's not used as an input to NodeUserData. So they're operating on independent resources. You probably want both patches, but they don't appear to interact.
Created attachment 1247724 [details] Output from running the deploy command
Environment: openstack-tripleo-common-2.0.0-9.el7ost.noarch

The reported issue does not reproduce. Verified with the following steps:

Deployed the overcloud with:

openstack overcloud deploy --debug --templates --libvirt-type kvm --ntp-server clock.redhat.com --neutron-network-type vxlan --neutron-tunnel-types vxlan --control-scale 3 --control-flavor controller-d75f3dec-c770-5f88-9d4c-3fea1bf9c484 --compute-scale 2 --compute-flavor compute-b634c10a-570f-59ba-bdbf-0c313d745a10 --ceph-storage-scale 3 --ceph-storage-flavor ceph-cf1f074b-dadb-5eb8-9eb0-55828273fab7 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e virt/ceph.yaml -e virt/hostnames.yml -e virt/network/network-environment.yaml --log-file overcloud_deployment_48.log

Created firstboot-config.yaml:

[stack@undercloud-0 ~]$ cat firstboot-config.yaml
heat_template_version: 2014-10-16

parameters:

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: repo_config}

  repo_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        sleep 10
        echo "noop"

outputs:
  OS::stack_id:
    value: {get_resource: userdata}

Created firstboot-environment.yaml:

[stack@undercloud-0 ~]$ cat firstboot-environment.yaml
resource_registry:
  OS::TripleO::NodeUserData: /home/stack/firstboot-config.yaml

Ran the new command:

openstack overcloud deploy --templates --libvirt-type kvm --ntp-server clock.redhat.com --neutron-network-type vxlan --neutron-tunnel-types vxlan --control-scale 3 --control-flavor controller-d75f3dec-c770-5f88-9d4c-3fea1bf9c484 --compute-scale 2 --compute-flavor compute-b634c10a-570f-59ba-bdbf-0c313d745a10 --ceph-storage-scale 3 --ceph-storage-flavor ceph-cf1f074b-dadb-5eb8-9eb0-55828273fab7 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e virt/ceph.yaml -e virt/hostnames.yml -e virt/network/network-environment.yaml -e firstboot-environment.yaml

This completed with:

Stack overcloud UPDATE_COMPLETE
Overcloud Endpoint: http://10.0.0.101:5000/v2.0
Overcloud Deployed

The deployment output is in the attached file "Output from running the deploy command".
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0470.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days