Bug 1361877

Summary: upgrade failed: Failed to detach interface
Product: Red Hat OpenStack Reporter: Ronnie Rasouli <rrasouli>
Component: rhosp-director Assignee: Dan Sneddon <dsneddon>
Status: CLOSED WORKSFORME QA Contact: Omri Hochman <ohochman>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 9.0 (Mitaka) CC: dbecker, dmaley, eglynn, jason.dobies, jjoyce, jraju, mburns, morazi, rhel-osp-director-maint, rrasouli, sbaker, tvignaud
Target Milestone: ga   
Target Release: 9.0 (Mitaka)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-03 20:55:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
error logs where the error found none

Description Ronnie Rasouli 2016-07-31 12:28:20 UTC
Created attachment 1186053
error logs where the error found

Description of problem:

Overcloud update failed with the error "Failed to detach interface".

Version-Release number of selected component (if applicable):
openstack-heat-api-6.0.0-8.el7ost.noarch
openstack-heat-api-cfn-6.0.0-8.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-24.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-16.el7ost.noarch

How reproducible:


Steps to Reproduce:
1. Deploy rhos8
2. Launch an instance attached to the tenant network
3. Update the undercloud
4. Update the overcloud

Actual results:

overcloud_update failed

Expected results:

Upgrade succeeds

Additional info:
based on

Deployment rhos8 overcloud
openstack overcloud deploy --templates ~/templates/my-overcloud --control-scale 3 --compute-scale 1 --ntp-server clock.redhat.com --libvirt-type qemu -e ~/templates/my-overcloud/environments/network-isolation.yaml -e ~/templates/network-environment.yaml -e ~/templates/firstboot-environment.yaml

update 
spawn instance
tnet=$(neutron net-list | grep tenant-net | awk '{print $2};')
nova boot --image cirros --flavor m1.tiny firstInstance --nic net-id=$tnet
openstack overcloud deploy --templates ~/templates/my-overcloud --control-scale 3 --compute-scale 1 --ntp-server clock.redhat.com --libvirt-type qemu -e ~/templates/my-overcloud/environments/network-isolation.yaml -e ~/templates/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-aodh.yaml  --force-postconfig

Comment 6 Steve Baker 2016-08-02 23:45:19 UTC
According to the attached log, multiple compute and controller nodes are getting the detach error.

This suggests that some change is causing port resources to be replaced. Chances are this is because other network resources are being replaced unexpectedly - this should be the first thing to check.

Attaching an event list for the stack update should show what network resources are being replaced:

  heat event-list --nested-depth 3 --format log overcloud

If those replacements are unexpected, then you'll need to figure out which property changes are causing the replacement.

If those replacements are expected and these nodes really do need new ports attached to them, then you'll need to diagnose nova for the cause of the detach failures.
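
One way to narrow the event list down to network-related resources is a plain text filter. This is only a sketch: filter_network_events is a hypothetical helper, and the keywords (port, network, subnet) are an assumption about how the relevant resources are named, not something confirmed by the attached log. It makes no assumption about the column layout of the event output.

```shell
# Hypothetical helper: keep only event lines that mention ports, networks
# or subnets (case-insensitive). Reads from stdin, writes matches to stdout.
filter_network_events() {
  grep -iE 'port|network|subnet'
}

# Usage on the undercloud, with credentials already sourced:
#   heat event-list --nested-depth 3 --format log overcloud | filter_network_events
```

Any line that survives the filter still has to be read by hand to decide whether the resource was replaced or only updated in place.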

I'll have a look in upstream logstash to see if https://bugs.launchpad.net/heat/+bug/1585858 might be related

Comment 7 Steve Baker 2016-08-02 23:57:55 UTC
I'm not seeing any detach errors in upstream gate jobs

Comment 9 Mike Burns 2016-08-03 20:55:18 UTC
In an IRC conversation, it was pointed out that there was a mistake in the process: the templates were copied to $HOME but were not updated with the newer templates. Given that others have not seen this issue so far, we're going to close the bug.

If it reproduces, please reopen the bug.
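
The process mistake described above (a stale template copy under $HOME) could be checked for before re-running the update. The following is a minimal sketch, not a documented procedure: templates_differ is a hypothetical helper, and it assumes the packaged templates live in /usr/share/openstack-tripleo-heat-templates and the local copy in ~/templates/my-overcloud, the two paths that appear in the deploy commands in the description.

```shell
# Hypothetical helper: succeeds (exit 0) when the two template trees differ,
# fails otherwise (identical trees, or diff itself hit an error).
templates_differ() {
  status=0
  # diff -r exits 0 when the trees are identical, 1 when they differ,
  # and greater than 1 on error (for example a missing directory).
  diff -r "$1" "$2" >/dev/null 2>&1 || status=$?
  [ "$status" -eq 1 ]
}

# Usage:
#   if templates_differ /usr/share/openstack-tripleo-heat-templates ~/templates/my-overcloud; then
#     echo "local template copy differs from the packaged templates"
#   fi
```

A reported difference does not by itself prove the copy is stale, because intentional local customizations show up the same way. Running diff -r on the two directories by hand shows which files actually changed.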