Bug 1425894

Summary: OSP10 -> OSP11 upgrade fails at ControllerSwiftRingUpdate during major-upgrade-composable-steps.yaml
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-common
Assignee: Dougal Matthews <dmatthew>
Status: CLOSED ERRATA
QA Contact: Marius Cornea <mcornea>
Severity: urgent
Priority: urgent
Version: 11.0 (Ocata)
CC: aschultz, cschwede, dbecker, dtrainor, egafford, jschluet, mandreou, mburns, mcornea, morazi, rhel-osp-director-maint, scohen, slinaber
Target Milestone: rc
Target Release: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-common-6.0.0-2.el7ost
Last Closed: 2017-05-17 20:01:46 UTC
Type: Bug

Description Marius Cornea 2017-02-22 17:14:28 UTC
Description of problem:
OSP10 -> OSP11 upgrade fails during ControllerSwiftRingUpdate:

cmd: source ~/stackrc; openstack stack failures list overcloud

start: 2017-02-22 08:54:11.970754

end: 2017-02-22 08:54:28.055764

delta: 0:00:16.085010

stdout: overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment:
  resource_type: OS::Heat::SoftwareDeployments
  physical_resource_id: 2bf68848-1764-427e-ae4c-ca821bde9250
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.ControllerSwiftRingUpdate.SwiftRingUpdate.1:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: d024791b-d794-4d3a-8343-f7f5d5e94bb2
  status: CREATE_FAILED
  status_reason: |
    Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    ...
    /etc/swift/backups/1487766251.container.builder
    /etc/swift/backups/1487766251.object.builder
    /etc/swift/backups/1487766252.account.builder
    /etc/swift/backups/1487766255.account.builder
    /etc/swift/backups/1487766255.account.ring.gz
    /etc/swift/backups/1487766256.container.builder
    /etc/swift/backups/1487766256.container.ring.gz
    /etc/swift/backups/1487766256.object.builder
    /etc/swift/backups/1487766256.object.ring.gz
    /var/lib/heat-config/heat-config-script
    (truncated, view all with --long)
  deploy_stderr: |
    tar: Removing leading `/' from member names
overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.ControllerSwiftRingUpdate.SwiftRingUpdate.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: bd9b1bfc-2ebd-45bb-ab66-4dc8e93f6eb6
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    ...
    /etc/swift/backups/1487766256.container.builder
    /etc/swift/backups/1487766256.object.builder
    /etc/swift/backups/1487766258.account.builder
    /etc/swift/backups/1487766265.account.builder
    /etc/swift/backups/1487766265.account.ring.gz
    /etc/swift/backups/1487766265.container.builder
    /etc/swift/backups/1487766265.container.ring.gz
    /etc/swift/backups/1487766266.object.builder
    /etc/swift/backups/1487766266.object.ring.gz
    /var/lib/heat-config/heat-config-script
    (truncated, view all with --long)
  deploy_stderr: |
    tar: Removing leading `/' from member names
overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.ControllerSwiftRingUpdate.SwiftRingUpdate.2:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 24a52114-2f47-4bff-af6e-03d2574e4069
  status: CREATE_FAILED
  status_reason: |
    Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    ...
    /etc/swift/backups/1487766250.account.builder
    /etc/swift/backups/1487766250.container.builder
    /etc/swift/backups/1487766250.object.builder
    /etc/swift/backups/1487766254.account.builder
    /etc/swift/backups/1487766254.account.ring.gz
    /etc/swift/backups/1487766254.container.builder
    /etc/swift/backups/1487766254.container.ring.gz
    /etc/swift/backups/1487766255.object.builder
    /etc/swift/backups/1487766255.object.ring.gz
    /var/lib/heat-config/heat-config-script
    (truncated, view all with --long)
  deploy_stderr: |
    tar: Removing leading `/' from member names
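
For triage, a failure report like the one above can be reduced to a map of resource path to status. The following is a minimal sketch and not part of the original report: `failed_resources` is a hypothetical helper, and it assumes the text layout shown above (an unindented resource path ending in a colon, followed by indented fields). Any wrapper prefix such as `stdout: ` would need to be stripped first.

```python
def failed_resources(report):
    """Map each resource path in a failures report to its status value.

    Assumes the layout shown in this bug: an unindented line ending in ':'
    names a resource, and an indented 'status:' line belongs to it.
    """
    statuses = {}
    current = None
    for line in report.splitlines():
        if line and not line[0].isspace() and line.rstrip().endswith(":"):
            # Unindented "name:" line starts a new resource entry.
            current = line.rstrip()[:-1]
        elif current and line.strip().startswith("status:"):
            # "status_reason:" does not match because of the trailing colon.
            statuses[current] = line.split(":", 1)[1].strip()
    return statuses


SAMPLE = """\
overcloud.AllNodesDeploySteps.AllNodesPostUpgradeSteps.ControllerSwiftRingUpdate.SwiftRingUpdate.1:
  resource_type: OS::Heat::SoftwareDeployment
  status: CREATE_FAILED
  status_reason: |
    Error: resources[1]: Deployment to server failed
"""

for name, status in failed_resources(SAMPLE).items():
    print(status, name)
```

Note that the `deploy_stderr` shown in the report contains only a harmless tar warning, so a summary like this narrows down which resources failed but not why.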

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-6.0.0-0.20170218023452.edbaaa9.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Upgrade overcloud from OSP10 to OSP11

Actual results:
Upgrade fails during ControllerSwiftRingUpdate:

https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/swift-ring-update.yaml#L29

Expected results:
Upgrade is successful.

Additional info:

Comment 1 Marius Cornea 2017-02-22 17:18:41 UTC
Debugging shows that ${swift_ring_put_tempurl} is set to an empty value in https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/swift-ring-update.yaml#L29, so the script exits in the if statement.
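
An empty temporary URL could be caught before any upload is attempted, which would make this failure mode easier to diagnose. The sketch below is illustrative only and does not reproduce the logic in swift-ring-update.yaml: `validate_tempurl` is a hypothetical helper, and the example URL is made up.

```python
from urllib.parse import urlparse


def validate_tempurl(url):
    """Return the stripped URL if it is a usable HTTP(S) URL, else raise ValueError."""
    cleaned = (url or "").strip()
    if not cleaned:
        # This is the condition reported in this bug: the value is empty.
        raise ValueError("swift_ring_put_tempurl is empty")
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("swift_ring_put_tempurl is not an HTTP(S) URL: %r" % cleaned)
    return cleaned


try:
    validate_tempurl("")
except ValueError as exc:
    print("refusing to upload rings:", exc)
```

Failing with an explicit message, rather than a bare non-zero exit code, would surface the cause directly in `deploy_stderr`.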

Comment 2 Marios Andreou 2017-02-23 11:33:13 UTC
Looking through the git log, I spotted https://review.openstack.org/#/c/414460/ in tripleo-heat-templates, which I suspect is in the templates you're using (it is in the latest puddle). Possibly you don't have its dependency from tripleo-common, https://review.openstack.org/#/c/413229/, although that merged three weeks ago, so I'm not sure. I have a reset environment deploying at the moment, so I'll be able to check against the latest puddle.

Comment 3 Christian Schwede (cschwede) 2017-02-27 12:02:29 UTC
Marius, which version of openstack-tripleo-common is installed on the undercloud? At least version openstack-tripleo-common-5.7.1-0.20170213225151.b2ca2e3.el7ost is required to generate the temporary URLs.

Comment 4 Marius Cornea 2017-02-27 16:03:02 UTC
(In reply to Christian Schwede (cschwede) from comment #3)
> Marius, which version of openstack-tripleo-common is installed on the
> undercloud? At least version
> openstack-tripleo-common-5.7.1-0.20170213225151.b2ca2e3.el7ost is required
> to generate the temporary URLs.

After the undercloud upgrade the installed version is openstack-tripleo-common-5.8.1-0.20170220145134.8ff75ce.el7ost.noarch
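
The installed version can be checked against the minimum from comment #3 with a simplified comparison. This is a sketch under stated assumptions: `numeric_version` and `meets_minimum` are hypothetical helpers that compare only the dotted upstream version (for example 5.8.1) and ignore the release field, so it is not a full RPM version comparison.

```python
import re

REQUIRED = "openstack-tripleo-common-5.7.1-0.20170213225151.b2ca2e3.el7ost"
INSTALLED = "openstack-tripleo-common-5.8.1-0.20170220145134.8ff75ce.el7ost.noarch"


def numeric_version(package):
    """Extract the dotted numeric version between the name and the release."""
    match = re.search(r"-(\d+(?:\.\d+)*)-", package)
    if match is None:
        raise ValueError("no version found in %r" % package)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed, required):
    """Compare upstream versions only; the release field is ignored (assumption)."""
    return numeric_version(installed) >= numeric_version(required)


print(numeric_version(INSTALLED), meets_minimum(INSTALLED, REQUIRED))
```

By this simplified check the installed package satisfies the stated minimum, which suggests the failure is not explained by an outdated openstack-tripleo-common alone.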

Comment 5 Christian Schwede (cschwede) 2017-02-28 13:10:17 UTC
Did the upgrade work with that version?

I just did an update from Newton to Ocata, and this worked fine for me. I have these versions on my undercloud after upgrading from Newton to Ocata.

openstack-tripleo-heat-templates-6.0.0-0.20170228075057.ef0ce3e.el7.centos.noarch
openstack-tripleo-common-5.8.1-0.20170228071932.225951b.el7.centos.noarch

Comment 6 Marius Cornea 2017-02-28 15:14:50 UTC
(In reply to Christian Schwede (cschwede) from comment #5)
> Did the upgrade work with that version?
> 
> I just did an update from Newton to Ocata, and this worked fine for me. I
> have these versions on my undercloud after upgrading from Newton to Ocata.
> 
> openstack-tripleo-heat-templates-6.0.0-0.20170228075057.ef0ce3e.el7.centos.noarch
> openstack-tripleo-common-5.8.1-0.20170228071932.225951b.el7.centos.noarch

No, it failed. I can consistently reproduce this error in my environment.

Comment 7 Christian Schwede (cschwede) 2017-03-01 18:11:38 UTC
I was able to reproduce & fix this.

Upstream bug: https://bugs.launchpad.net/tripleo/+bug/1669068
Upstream patch: https://review.openstack.org/#/c/439753/

Comment 22 errata-xmlrpc 2017-05-17 20:01:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245