Bug 1815518

Summary: S3 Bucket versioning blocks cluster uninstall
Product: OpenShift Container Platform
Reporter: James Harrington <jaharrin>
Component: Installer
Assignee: John Hixson <jhixson>
Installer sub component: openshift-installer
QA Contact: Yunfei Jiang <yunjiang>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: adahiya, dkulkarn, yunjiang
Version: 4.3.0
Target Milestone: ---
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-08-04 18:06:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description James Harrington 2020-03-20 13:48:41 UTC
Description of problem:
Uninstall fails because the uninstaller cannot delete an S3 bucket that has versioning enabled.

Version-Release number of selected component (if applicable):
4.3 release - SHA 637eaddb8031a33c8b95b667bc28bb0457007c2f54ab9aaeb0a7fe36d1eb4ea9

How reproducible:
Every time versioning is enabled for an S3 bucket

Steps to Reproduce:
1. Enable S3 bucket versioning
  $ aws s3api put-bucket-versioning --bucket jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh --versioning-configuration Status=Enabled

2. Verify it's enabled
  $ aws s3api get-bucket-versioning --bucket jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh
{
    "Status": "Enabled"
}


3. Uninstall cluster
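
Before running step 3, a pre-check along these lines could flag buckets that are likely to hit this problem. This is only a sketch: `versioning_state` and `may_hold_old_versions` are hypothetical helper names, and `client` is assumed to behave like boto3's S3 client (only its `get_bucket_versioning` call is used).

```python
def versioning_state(client, bucket):
    """Return 'Enabled', 'Suspended', or 'Unversioned' for a bucket.

    `client` is assumed to expose boto3's S3 `get_bucket_versioning` call.
    """
    response = client.get_bucket_versioning(Bucket=bucket)
    # S3 omits the "Status" key when versioning was never configured.
    return response.get("Status", "Unversioned")


def may_hold_old_versions(client, bucket):
    """True when emptying the bucket may leave versions or delete markers behind."""
    # A suspended bucket can still contain versions written while it was enabled.
    return versioning_state(client, bucket) in ("Enabled", "Suspended")
```

With boto3 installed, passing `boto3.client("s3")` as `client` should work; the bucket from step 1 would report "Enabled".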


Actual results:
The uninstaller is stuck, repeating the following log:
time="2020-03-20T03:57:50Z" level=debug msg="search for and delete matching resources by tag in us-east-1 matching aws.Filter{\"kubernetes.io/cluster/jamesh-s3-test-lrlmx\":\"owned\"}"
time="2020-03-20T03:57:50Z" level=debug msg=Emptied arn="arn:aws:s3:::jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh"
time="2020-03-20T03:57:50Z" level=debug msg="BucketNotEmpty: The bucket you tried to delete is not empty. You must delete all versions in the bucket.\n\tstatus code: 409, request id: 8EF8C97012330ED9, host id: luF20s0e7FchclBEkox5nWq+g810BUByadoTIgitTHIm4M4lbpdxwQvocBRTPBZTs9tS35ZW+cw=" arn="arn:aws:s3:::jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh"

Expected results:
Uninstall completes successfully

time="2020-03-20T13:16:20Z" level=debug msg="search for and delete matching resources by tag in us-east-1 matching aws.Filter{\"kubernetes.io/cluster/jamesh-s3-test-lrlmx\":\"owned\"}"                                                 
time="2020-03-20T13:16:20Z" level=debug msg=Emptied arn="arn:aws:s3:::jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh"
time="2020-03-20T13:16:21Z" level=info msg=Deleted arn="arn:aws:s3:::jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh"


Additional info:

AWS S3 bucket versioning requires that all versions of keys in the bucket are removed before you can delete it.

The bucket was emptied successfully, but the versions of its keys were not deleted, which blocks deletion of the bucket.
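
To illustrate why an "Emptied" bucket can still be rejected with BucketNotEmpty, here is a toy in-memory model of a versioning-enabled bucket. It is a simplification for illustration only, not how S3 is implemented; the class and method names are made up for this sketch.

```python
import itertools


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (not AWS's implementation)."""

    def __init__(self):
        self._ids = itertools.count(1)
        # Each entry is (key, version_id, is_delete_marker), oldest first.
        self.entries = []

    def put(self, key):
        version_id = f"v{next(self._ids)}"
        self.entries.append((key, version_id, False))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # A plain delete only adds a delete marker; nothing is removed.
            marker_id = f"v{next(self._ids)}"
            self.entries.append((key, marker_id, True))
            return marker_id
        # A delete that names a version removes that entry permanently.
        self.entries = [e for e in self.entries if (e[0], e[1]) != (key, version_id)]
        return version_id

    def visible_keys(self):
        """Keys a normal listing shows: the latest entry is not a delete marker."""
        latest = {}
        for key, _, is_marker in self.entries:
            latest[key] = is_marker
        return sorted(k for k, is_marker in latest.items() if not is_marker)

    def can_delete_bucket(self):
        # The bucket is deletable only when no versions or markers remain.
        return not self.entries


if __name__ == "__main__":
    bucket = VersionedBucket()
    bucket.put("test.txt")
    bucket.delete("test.txt")          # roughly what "Emptied" amounts to
    print(bucket.visible_keys())       # []
    print(bucket.can_delete_bucket())  # False
```

In this model the bucket looks empty to a normal listing, yet it still holds one object version and one delete marker, so the bucket delete is refused until both are removed by version ID.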

Bucket policies allow you to enable bucket versioning by default on any newly created bucket. This might happen in an AWS account where AWS Organizations is enabled and versioning is required by policy.

If we list and delete the versioned objects, the uninstall works; there is no need to disable bucket versioning.

aws s3api list-object-versions --bucket=jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh
{
    "DeleteMarkers": [
        {
            "Owner": {
                "DisplayName": "osd-creds-mgmt+t5674x", 
                "ID": "8ab6b1c67dbef2d9829816e829fb5e1c987ceb6c55787ba5a60ea7b8859ef983"
            }, 
            "IsLatest": true, 
            "VersionId": "8kdjikBdsmRbDmquG9MfDeE3qKoBU4QT", 
            "Key": "test.txt", 
            "LastModified": "2020-03-20T03:57:07.000Z"
        }
    ]
}


aws s3api delete-object --version-id 8kdjikBdsmRbDmquG9MfDeE3qKoBU4QT --key test.txt --bucket jamesh-s3-test-lrlmx-image-registry-us-east-1-vsqehbsnypygumfh
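
The manual workaround above could be automated roughly as follows. This is a sketch under assumptions, not the code from the installer: `purge_all_versions` is a hypothetical name, and `client` is assumed to behave like boto3's S3 client (`get_paginator("list_object_versions")` and `delete_objects`).

```python
def purge_all_versions(client, bucket, batch_size=1000):
    """Delete every object version and delete marker in `bucket`.

    `client` is assumed to behave like boto3's S3 client. Returns the number
    of entries submitted for deletion.
    """
    deleted = 0
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        # Both lists must be cleared; either one keeps the bucket non-empty.
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        targets = [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]
        # DeleteObjects accepts at most 1000 keys per request.
        for start in range(0, len(targets), batch_size):
            batch = targets[start:start + batch_size]
            response = client.delete_objects(
                Bucket=bucket,
                Delete={"Objects": batch, "Quiet": True},
            )
            errors = response.get("Errors", [])
            if errors:
                raise RuntimeError(
                    f"failed to delete {len(errors)} entries from {bucket}"
                )
            deleted += len(batch)
    return deleted
```

With boto3 installed, passing `boto3.client("s3")` as `client` and then calling `client.delete_bucket(Bucket=...)` should remove the bucket without touching its versioning configuration.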

Comment 1 Abhinav Dahiya 2020-03-20 15:34:25 UTC
A previous bug for this issue was closed for insufficient data: https://bugzilla.redhat.com/show_bug.cgi?id=1809764

Comment 3 John Hixson 2020-04-01 04:19:55 UTC
I've written a small driver program that reproduces this problem. I should have a fix soon.

Comment 4 John Hixson 2020-04-02 01:18:20 UTC
PR: https://github.com/openshift/installer/pull/3393

Comment 7 Yunfei Jiang 2020-04-07 04:11:26 UTC
Verified on 4.5.0-0.nightly-2020-04-06-234745

time="2020-04-06T23:45:52-04:00" level=debug msg=Emptied arn="arn:aws:s3:::yunjiang-0407-4q2fr-image-registry-us-east-2-budwascofsuolwxyg"
time="2020-04-06T23:45:52-04:00" level=debug msg="Versions Deleted" arn="arn:aws:s3:::yunjiang-0407-4q2fr-image-registry-us-east-2-budwascofsuolwxyg"
time="2020-04-06T23:45:52-04:00" level=info msg=Deleted arn="arn:aws:s3:::yunjiang-0407-4q2fr-image-registry-us-east-2-budwascofsuolwxyg"

Cluster destroyed successfully.

Comment 9 errata-xmlrpc 2020-08-04 18:06:15 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409