Bug 1717604

Summary: buildcop: failures in e2e-aws container setup related to \"aws_lb\" \"api_external\"
Product: OpenShift Container Platform
Reporter: Kirsten Garrison <kgarriso>
Component: Installer
Assignee: W. Trevor King <wking>
Installer sub component: openshift-installer
QA Contact: sheng.lao <shlao>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: high
CC: adahiya, aos-bugs, bparees, jokerman, krizza, mmccomas, nhale, openshift-bugzilla-robot, rmarasch, sdodson, wking
Version: 4.2.0
Target Milestone: ---
Target Release: 4.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1749624 (view as bug list)
Environment:
Last Closed: 2019-10-16 06:31:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1749624
Attachments:
  List slow CI ELBs from the past 24 hours (flags: none)

Description Kirsten Garrison 2019-06-05 19:15:15 UTC
Description of problem:

Seeing failures similar to:
Installing from release registry.svc.ci.openshift.org/ci-op-n9cc84cd/release@sha256:1a22bc4527c3d0d33811bbcacd5565643321f49f30c3a3fabbb1381a9fe2e30a
level=warning msg="Found override for ReleaseImage. Please be warned, this is not advised"
level=info msg="Consuming \"Install Config\" from target directory"
level=info msg="Creating infrastructure resources..."
level=error
level=error msg="Error: timeout while waiting for state to become 'active' (last state: 'provisioning', timeout: 10m0s)"
level=error
level=error msg="  on ../tmp/openshift-install-375369954/vpc/master-elb.tf line 19, in resource \"aws_lb\" \"api_external\":"
level=error msg="  19: resource \"aws_lb\" \"api_external\" {"
level=error
level=error
level=fatal msg="failed to fetch Cluster: failed to generate asset \"Cluster\": failed to create cluster: failed to apply using Terraform"
Version-Release number of selected component (if applicable):

in a few PRs:
https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/240/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws/1145

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_console/1674/pull-ci-openshift-console-master-e2e-aws/2032

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_console/1674/pull-ci-openshift-console-master-e2e-aws/2027

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/23026/pull-ci-openshift-origin-master-e2e-aws/9465

Comment 1 Kirsten Garrison 2019-06-05 20:44:11 UTC
Note: This failure accounts for 71 out of 79 failures that contain: level=error.*timeout while waiting for state.*

The occurrences started appearing consistently within the last 6 hours.

Comment 3 Kirsten Garrison 2019-06-05 23:19:29 UTC
Also hitting e2e-aws-upgrade tests
https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.2/40
https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cluster-kube-scheduler-operator/140/pull-ci-openshift-cluster-kube-scheduler-operator-master-e2e-aws/529

We were investigating a bit and found that these errors started appearing before we were anywhere near our peak AWS usage, and we aren't seeing anything in our account-specific alerts on AWS either.

Comment 4 Nick Hale 2019-07-12 03:10:01 UTC
Was seeing this again during a usage spike today:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/3831

Comment 5 OpenShift BugZilla Robot 2019-07-12 18:09:02 UTC
Abhinav, can we explicitly define the number of clusters we support concurrently in AWS?

Comment 6 Ben Parees 2019-07-12 20:49:13 UTC
This needs attention.  It may be pointing to a product bug in which something is exhausting our API budget.

Comment 7 Ben Parees 2019-07-12 20:53:56 UTC
*** Bug 1729525 has been marked as a duplicate of this bug. ***

Comment 10 Steve Kuznetsov 2019-07-24 15:31:59 UTC
Seeing this again today during EDT morning, at 00:00 when we had ~40 concurrent clusters. Unlikely it was load.

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.2/2381#0:build-log.txt%3A26

Comment 11 W. Trevor King 2019-08-27 23:11:03 UTC
So for load balancers this is not necessarily API throttling, although it may be correlated.  Checking a recent instance [1] for a slow ci-op-8zpt2ntt-257f3-pj6mj-int/2b37a3db41c66e55:

time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: 2019/08/27 21:11:20 [DEBUG] [aws-sdk-go] DEBUG: Response elasticloadbalancing/CreateLoadBalancer Detai
ls:"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: ---[ RESPONSE ]--------------------------------------"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: HTTP/1.1 200 OK"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: Connection: close"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: Content-Length: 1533"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: Content-Type: text/xml"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: Date: Tue, 27 Aug 2019 21:11:19 GMT"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: X-Amzn-Requestid: 371cfdea-c90f-11e9-a722-bfeb2f807c8c"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: "
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: "
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: -----------------------------------------------------"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws: 2019/08/27 21:11:20 [DEBUG] [aws-sdk-go] <CreateLoadBalancerResponse xmlns=\"http://elasticloadbalanci
ng.amazonaws.com/doc/2015-12-01/\">"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:   <CreateLoadBalancerResult>"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:     <LoadBalancers>"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:       <member>"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:         <LoadBalancerArn>arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-8zpt2ntt-257f3-pj6mj-int/2b37a3db41c66e55</LoadBalancerArn>"
...
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:         <CreatedTime>2019-08-27T21:11:20.120Z</CreatedTime>"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:         <State>"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:           <Code>provisioning</Code>"
time="2019-08-27T21:11:20Z" level=debug msg="2019-08-27T21:11:20.166Z [DEBUG] plugin.terraform-provider-aws:         </State>"

For the next 10 minutes we keep polling:

time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: 2019/08/27 21:21:16 [DEBUG] [aws-sdk-go] DEBUG: Response elasticloadbalancing/DescribeLoadBalancers De
tails:"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: ---[ RESPONSE ]--------------------------------------"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: HTTP/1.1 200 OK"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: Connection: close"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: Content-Length: 1545"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: Content-Type: text/xml"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: Date: Tue, 27 Aug 2019 21:21:15 GMT"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: X-Amzn-Requestid: 9a97ccad-c910-11e9-9bbb-e3805984d5f1"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: "
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: "
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: -----------------------------------------------------"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws: 2019/08/27 21:21:16 [DEBUG] [aws-sdk-go] <DescribeLoadBalancersResponse xmlns=\"http://elasticloadbala
ncing.amazonaws.com/doc/2015-12-01/\">"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws:   <DescribeLoadBalancersResult>"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws:     <LoadBalancers>"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws:       <member>"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws:         <LoadBalancerArn>arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-8zpt2ntt-257f3-pj6mj-int/2b37a3db41c66e55</LoadBalancerArn>"
...
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws:         <CreatedTime>2019-08-27T21:11:20.120Z</CreatedTime>"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws:         <State>"
time="2019-08-27T21:21:16Z" level=debug msg="2019-08-27T21:21:16.223Z [DEBUG] plugin.terraform-provider-aws:           <Code>provisioning</Code>"

before giving up:

time="2019-08-27T21:21:20Z" level=error msg="1 error occurred:"
time="2019-08-27T21:21:20Z" level=error msg="\t* module.vpc.aws_lb.api_internal: 1 error occurred:"
time="2019-08-27T21:21:20Z" level=error msg="\t* aws_lb.api_internal: timeout while waiting for state to become 'active' (last state: 'provisioning', timeout: 10m0s)"

We can monitor the flow of provisioning network load balancers with:

$ AWS_PROFILE=ci aws elbv2 describe-load-balancers --output json | jq -r '[.LoadBalancers[] | .age = now - (.CreatedTime | sub("[.][0-9]*"; "") | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime)] | sort_by(.age)[] | select(.age < 600) | (.age / 60 * 100 | floor | . / 100 | tostring) + "m " + .State.Code'
5.29m active

And maybe polling that to see how long things are taking to provision?
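
Something like this loop would do for ad-hoc polling (just a sketch, reusing the jq filter and `ci` profile from above):

$ while sleep 60; do
>   date -u  # timestamp each sample
>   AWS_PROFILE=ci aws elbv2 describe-load-balancers --output json | jq -r '[.LoadBalancers[] | .age = now - (.CreatedTime | sub("[.][0-9]*"; "") | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime)] | sort_by(.age)[] | select(.age < 600) | (.age / 60 * 100 | floor | . / 100 | tostring) + "m " + .State.Code'
> done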

[1]: https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-nightly/297/artifacts/e2e-aws-upgrade/installer/.openshift_install.log

Comment 12 W. Trevor King 2019-08-27 23:14:44 UTC
Linking the upstream issue (about application load balancers, but it's probably the same backing code as we use for network load balancers, since both are v2 load balancers).  And here's a more interesting current state:

$ AWS_PROFILE=ci aws elbv2 describe-load-balancers --output json | jq -r '[.LoadBalancers[] | .age = now - (.CreatedTime | sub("[.][0-9]*"; "") | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime)] | sort_by(.age)[] | select(.age < 600) | (.age / 60 * 100 | floor | . / 100 | tostring) + "m " + .State.Code + " " + .LoadBalancerArn'
0.91m provisioning arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-wrpzgtib-ab887-6xxbz-int/141889bbcf589db2
0.91m provisioning arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-wrpzgtib-ab887-6xxbz-ext/48ed5585bbe2e456
3.79m active arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-j07q0cw1-1d3f3-r5l88-int/880ec8994f6d2ad1
3.79m active arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-j07q0cw1-1d3f3-r5l88-ext/c8fd1d4a6c514df6
4.33m active arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-j07q0cw1-76f5e-9qw6x-ext/aa9943e4845c0eb7
4.34m active arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-j07q0cw1-76f5e-9qw6x-int/66a16b033e055941
9.88m active arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/a0b88f8ddc91f11e9a0f112059dc191d/b6c50ecd9f73f930

Comment 13 W. Trevor King 2019-08-27 23:56:27 UTC
Grabbing the day's installer logs with:

--- a/deck-build-log/deck-build-log-pull
+++ b/deck-build-log/deck-build-log-pull
@@ -30 +30 @@ sync() {
-                       for DETAIL in #clusteroperators.json events.json nodes.json pods.json
+                       for DETAIL in installer/.openshift_install.log #clusteroperators.json events.json nodes.json pods.json
@@ -31,0 +32 @@ sync() {
+                               DETAIL_BASE="$(basename "${DETAIL}")" &&
@@ -33 +34 @@ sync() {
-                               DETAIL_PATH="${BUILD_LOG_PATH/build-log.txt/}${DETAIL}" &&
+                               DETAIL_PATH="${BUILD_LOG_PATH/build-log.txt/}${DETAIL_BASE}" &&

applied to [1].  It's still sucking down data, but looking at the content I've already downloaded and binning creation times with floor(minute) granularity:

$ grep -r '\(aws_lb\..*Creation complete after \|debug.*aws_lb\..*timeout while waiting for state\)' ~/.cache/openshift-deck-build-logs | sort | uniq | sed 's/.*Creation complete after \([0-9]*\)m.*/\1/;s/.*timeout: 10m0s.*/10/' | sort -n | uniq -c
    705 2
    206 3
      2 4
     60 5
     64 6
     33 7
     26 8
     15 9
     59 10

I'm not sure how long the tail would have been without the current 10m cap.

[1]: https://github.com/wking/openshift-release/tree/debug-scripts/deck-build-log

Comment 15 W. Trevor King 2019-08-28 01:16:35 UTC
Log-slurping complete.  Final stats are:

  $ grep -r '\(aws_lb\..*Creation complete after \|debug.*aws_lb\..*timeout while waiting for state\)' ~/.cache/openshift-deck-build-logs | sort | uniq | sed 's/.*Creation complete after \([0-9]*\)m.*/\1/;s/.*timeout: 10m0s.*/10/' | sort -n | uniq -c
   1584 2
    465 3
      5 4
    106 5
     86 6
     46 7
     37 8
     19 9
     83 10
      1 19

19m case is from [1]:

  $ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_openshift-controller-manager/18/pull-ci-openshift-openshift-controller-manager-master-e2e-aws-builds/16/artifacts/e2e-aws-builds/installer/.openshift_install.log | grep '\.aws_lb\..*after 19m'
time="2019-08-20T16:34:29Z" level=debug msg="module.vpc.aws_lb.api_external: Creation complete after 19m6s [id=arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-xdts8ybz-5894a-7j67p-ext/e379e829580fd4cd]"

Dunno what was going on there, but it suggests that if we do end up bumping the timeout, we should bump it to 20+m (although by that point we're pushing our 30m cap for all infrastructure creation).  Total number of creation attempts:

  $ grep -r '\(aws_lb\..*Creation complete after \|debug.*aws_lb\..*timeout while waiting for state\)' ~/.cache/openshift-deck-build-logs | sort | uniq | sed 's/.*Creation complete after \([0-9]*\)m.*/\1/;s/.*timeout: 10m0s.*/10/' | wc -l
  2432

So the rate of 10+m timeouts is 83 / 2432 = 3.4% for the ~24 hours covered by my deck-build-log-pull slurp.  I'm following up with AWS to see if they think that's a reasonable percentage.
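
For the record, that rate falls out of the same pipeline in one pass (a sketch against the same log cache; it counts the "10" bin, i.e. the lines the sed rewrite above maps from the 10m0s timeouts, against the total):

  $ grep -r '\(aws_lb\..*Creation complete after \|debug.*aws_lb\..*timeout while waiting for state\)' ~/.cache/openshift-deck-build-logs | sort | uniq | sed 's/.*Creation complete after \([0-9]*\)m.*/\1/;s/.*timeout: 10m0s.*/10/' | awk '{total++; if ($1 == 10) timeouts++} END {printf "%d/%d = %.1f%%\n", timeouts, total, 100 * timeouts / total}'
  83/2432 = 3.4%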

[1]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_openshift-controller-manager/18/pull-ci-openshift-openshift-controller-manager-master-e2e-aws-builds/16

Comment 20 W. Trevor King 2019-08-29 04:24:04 UTC
Heard back from AWS:

On Wed, Aug 28, 2019 at 8:40 PM Jack S. wrote:
> While we do not have SLA's regarding the time to provision a Network Load Balancer, in my experience NLB's take less than 10 minutes to provision. Our SLA's are around Monthly Uptime Percentage [1].
> ...
> [1] https://aws.amazon.com/elasticloadbalancing/sla/

They're continuing to dig into our higher incidence of 10+m provisioning times, so hopefully we'll soon have more details on why this suddenly flared up again.

Comment 22 W. Trevor King 2019-08-29 17:16:10 UTC
*** Bug 1746958 has been marked as a duplicate of this bug. ***

Comment 23 W. Trevor King 2019-08-29 18:01:05 UTC
From AWS:

On Wed, Aug 28, 2019 at 10:13 PM Jack S. wrote:
> Our investigation into this issue has indicated an internal issue which impacted the provisioning times for NLB in the us-east-1 Region. The issue had been resolved as of  2019-08-28 11:32:10AM GMT-0700.

But we've seen three 20+m timeouts today since then [1].  I'm continuing to follow up with them.

On the bright side, overall incidence of these timeouts is down in CI.

[1]: https://ci-search-ci-search-next.svc.ci.openshift.org/chart?name=aws&maxAge=12h&search=timeout%20while%20waiting%20for%20state.*20m

Comment 24 W. Trevor King 2019-08-29 22:05:57 UTC
> Abhinav, can we explicitly define the number of clusters we support concurrently in AWS?

Clearing the NEEDINFO, at least this most recent flare-up seems unrelated to concurrent cluster counts.

On the CI-health front:

$ curl -s 'https://ci-search-ci-search-next.svc.ci.openshift.org/search?maxAge=24h&context=3&search=timeout+while+waiting+for+state' | jq -r '. | to_entries[].value | to_entries[].value[].context[]' | sed -n 's|.*in resource \\\"\([^\]*\).*|\1|p' | sort | uniq -c | sort -n
      4 aws_ami_copy
      4 aws_route_table_association
      8 aws_lb_listener
     38 aws_lb

is a lot less than:

$ curl -s 'https://ci-search-ci-search-next.svc.ci.openshift.org/search?maxAge=48h&context=3&search=timeout+while+waiting+for+state' | jq -r '. | to_entries[].value | to_entries[].value[].context[]' | sed -n 's|.*in resource \\\"\([^\]*\).*|\1|p' | sort | uniq -c | sort -n
      1 aws_route
      1 aws_security_group
      2 aws_iam_instance_profile
      6 aws_subnet
      7 aws_internet_gateway
      8 aws_ami_copy
      8 aws_lb_listener
     20 aws_route_table_association
    288 aws_lb

on the aws_lb front, so we're mostly out of the fire.  The AWS ticket is still ongoing, as mentioned in the previous comment.  aws_lb_listener issues are new in the past 24 hours, so we might want to spin off a new bug for them if we want to dig into them.  But 8 failures out of the ~1.2k AWS jobs we've run today isn't a major issue.
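
If we do spin one off, the jobs behind those 8 hits can be pulled from the same search endpoint (a sketch; the jq path just mirrors the query above and keys the output on the job entry):

$ curl -s 'https://ci-search-ci-search-next.svc.ci.openshift.org/search?maxAge=24h&context=3&search=timeout+while+waiting+for+state' | jq -r 'to_entries[] | .key as $job | .value | to_entries[].value[].context[] | select(test("aws_lb_listener")) | $job' | sort | uniq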

Comment 25 sheng.lao 2019-09-03 05:41:04 UTC
It has worked for days, so I am moving it to Status: `VERIFIED`.

Comment 26 W. Trevor King 2019-09-10 19:27:00 UTC
Created attachment 1613717 [details]
List slow CI ELBs from the past 24 hours

Got a run of these today passing the 20m timeout.  AWS wants me to forward the ARNs to help with their internal debugging, and there were enough that it seemed worth scripting it out.  Script attached.  You need a bit of post-processing to dedup things like:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-promote-openshift-machine-os-content-e2e-aws-4.1/11717
  artifacts: https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-promote-openshift-machine-os-content-e2e-aws-4.1/11717/
  job type: e2e-aws
  install log: https://storage.googleapis.com/origin-ci-test/logs/release-promote-openshift-machine-os-content-e2e-aws-4.1/11717/artifacts/e2e-aws/installer/.openshift_install.log
  install log: 4
  created: arn:aws:elasticloadbalancing:us-east-1:...h7771-342ef-d9dnp-int/648d83c2fa6e4644
  slow: arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-wiwh7771-342ef-d9dnp-ext/48e24dedb81b6d47
  slow: arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-wiwh7771-342ef-d9dnp-int/648d83c2fa6e4644
  slow: arn:aws:elasticloadbalancing:us-east-1:460538899914:loadbalancer/net/ci-op-wiwh7771-342ef-d9dnp-ext/48e24dedb81b6d47

where the internal LB wasn't actually slow (Terraform just elided its ARN when announcing creation).  And because of our tag processing we attempted to delete the external LB twice (once via the VPC and once directly).
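
For forwarding to AWS, a rough dedup pass over the script's output is enough (a sketch; the filename and the indented "slow:" prefix are just taken from the example above, not anything the attachment guarantees):

$ ./list-slow-elbs.sh | sed -n 's/^ *slow: //p' | sort -u  # one line per unique slow-LB ARN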

Comment 28 errata-xmlrpc 2019-10-16 06:31:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922