Bug 1786675 - IPI on upshift openstack failed due to 'Security group rule already exists' error
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Adolfo Duarte
QA Contact: David Sanz
URL:
Whiteboard:
Depends On:
Blocks: 1788062 1788585
Reported: 2019-12-27 03:33 UTC by Johnny Liu
Modified: 2020-04-22 21:54 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1788062 1788585
Environment:
Last Closed: 2020-04-22 21:54:12 UTC
Target Upstream Version:



Links
System ID Priority Status Summary Last Updated
Github openshift installer pull 2878 None closed Bug 1786675: OpenStack: create security rules sequentially 2020-07-21 02:44:41 UTC

Description Johnny Liu 2019-12-27 03:33:16 UTC
Description of problem:

Version-Release number of the following components:
4.3.0-0.nightly-2019-12-25-124912

How reproducible:
Always

Steps to Reproduce:
1. Trigger an IPI install on upshift OSP

Actual results:
Installation failed with the following terraform error log:
<--snip-->
level=debug msg="module.masters.openstack_compute_instance_v2.master_conf[1]: Creation complete after 49s [id=37039f09-bd05-4815-8c63-989cf0c0eeb0]"
level=debug msg="module.bootstrap.openstack_compute_instance_v2.bootstrap: Creation complete after 49s [id=ec459c7d-fca2-4713-9b8d-30593aaa45fc]"

level=debug msg="module.masters.openstack_compute_instance_v2.master_conf[0]: Still creating... [50s elapsed]"

level=debug msg="module.masters.openstack_compute_instance_v2.master_conf[0]: Creation complete after 59s [id=0790db4d-5277-4186-ae08-9194e9d060b1]"
level=error
level=error msg="Error: Error creating openstack_networking_secgroup_rule_v2: Expected HTTP response code [] when accessing [POST https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13696/v2.0/security-group-rules], but got 409 instead"
level=error msg="{\"NeutronError\": {\"message\": \"Security group rule already exists. Rule id is 566d9f7c-a8d5-476c-8073-2ff193f0fe25.\", \"type\": \"SecurityGroupRuleExists\", \"detail\": \"\"}}"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-677071673/topology/sg-master.tf line 231, in resource \"openstack_networking_secgroup_rule_v2\" \"master_ingress_kubelet_secure_from_worker\":"
level=error msg=" 231: resource \"openstack_networking_secgroup_rule_v2\" \"master_ingress_kubelet_secure_from_worker\" {"
level=error
level=error
level=error
level=error msg="Error: Error creating openstack_networking_secgroup_rule_v2: Expected HTTP response code [] when accessing [POST https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13696/v2.0/security-group-rules], but got 409 instead"
level=error msg="{\"NeutronError\": {\"message\": \"Security group rule already exists. Rule id is b0efd6d6-e5e6-4a70-afd8-5d64fb08009b.\", \"type\": \"SecurityGroupRuleExists\", \"detail\": \"\"}}"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-677071673/topology/sg-master.tf line 261, in resource \"openstack_networking_secgroup_rule_v2\" \"master_ingress_services_udp\":"
level=error msg=" 261: resource \"openstack_networking_secgroup_rule_v2\" \"master_ingress_services_udp\" {"
level=error
level=error
level=error
level=error msg="Error: Error creating openstack_networking_secgroup_rule_v2: Expected HTTP response code [] when accessing [POST https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13696/v2.0/security-group-rules], but got 409 instead"
level=error msg="{\"NeutronError\": {\"message\": \"Security group rule already exists. Rule id is 54b43ba0-a394-4591-86b9-bc8592a7ba70.\", \"type\": \"SecurityGroupRuleExists\", \"detail\": \"\"}}"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-677071673/topology/sg-master.tf line 271, in resource \"openstack_networking_secgroup_rule_v2\" \"master_ingress_vrrp\":"
level=error msg=" 271: resource \"openstack_networking_secgroup_rule_v2\" \"master_ingress_vrrp\" {"
level=error
level=error
level=error
level=error msg="Error: Error creating openstack_networking_secgroup_rule_v2: Expected HTTP response code [] when accessing [POST https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13696/v2.0/security-group-rules], but got 409 instead"
level=error msg="{\"NeutronError\": {\"message\": \"Security group rule already exists. Rule id is bdcb485d-763e-41ea-9afa-2258284824d7.\", \"type\": \"SecurityGroupRuleExists\", \"detail\": \"\"}}"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-677071673/topology/sg-worker.tf line 19, in resource \"openstack_networking_secgroup_rule_v2\" \"worker_ingress_ssh\":"
level=error msg="  19: resource \"openstack_networking_secgroup_rule_v2\" \"worker_ingress_ssh\" {"
level=error
level=error
level=error
level=error msg="Error: Error creating openstack_networking_secgroup_rule_v2: Expected HTTP response code [] when accessing [POST https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13696/v2.0/security-group-rules], but got 409 instead"
level=error msg="{\"NeutronError\": {\"message\": \"Security group rule already exists. Rule id is 0ffcba9c-a3f1-4a8d-99ec-4d9817671ecd.\", \"type\": \"SecurityGroupRuleExists\", \"detail\": \"\"}}"
level=error
level=error msg="  on ../../../../../tmp/openshift-install-677071673/topology/sg-worker.tf line 150, in resource \"openstack_networking_secgroup_rule_v2\" \"worker_ingress_kubelet_insecure\":"
level=error msg=" 150: resource \"openstack_networking_secgroup_rule_v2\" \"worker_ingress_kubelet_insecure\" {"
level=error
level=error
level=fatal msg="failed to fetch Cluster: failed to generate asset \"Cluster\": failed to create cluster: failed to apply using Terraform"

Expected results:
Installation succeeds.

Additional info:
After the failure, I checked with the neutron client against upshift OpenStack, and the security group rules had indeed already been created. I am not sure whether this is caused by a performance issue in upshift OpenStack.
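
For anyone triaging similar failures, the 409 bodies in the log above carry enough information to tell a "rule already exists" conflict apart from other errors. The following is a minimal Python sketch, not the fix from the linked pull request (which, per its title, creates the rules sequentially). The function name `existing_rule_id` is hypothetical, and the message format is assumed to match the Neutron responses quoted in this report.

```python
import json
import re

# Matches the UUID in messages such as
# "Security group rule already exists. Rule id is <uuid>."
# (format assumed from the log lines in this report).
RULE_ID_RE = re.compile(r"Rule id is ([0-9a-f-]{36})")


def existing_rule_id(status_code, body):
    """Return the id of the pre-existing rule if the response is a
    SecurityGroupRuleExists conflict, otherwise None.

    Hypothetical helper for illustration only.
    """
    if status_code != 409:
        return None
    try:
        error = json.loads(body).get("NeutronError", {})
    except (ValueError, AttributeError):
        # Body is not JSON, or the top level is not an object.
        return None
    if not isinstance(error, dict):
        return None
    if error.get("type") != "SecurityGroupRuleExists":
        return None
    match = RULE_ID_RE.search(str(error.get("message", "")))
    return match.group(1) if match else None
```

A caller could treat a non-None result as "the rule is already in place" and look the rule up by the returned id instead of failing, whereas any other 409 or error code would still be reported.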

Comment 9 David Sanz 2020-01-08 09:50:05 UTC
Verified on 4.4.0-0.nightly-2020-01-08-072157

Comment 13 Johnny Liu 2020-01-08 11:25:57 UTC
Hmm, it seems upshift made some enhancements to improve performance, and I can no longer reproduce this bug. I tried the following builds, and all of them succeeded.
4.2.1
4.4.0-0.nightly-2020-01-07-172830
4.4.0-0.nightly-2020-01-05-221122
4.3.0-0.nightly-2020-01-08-005052
4.3.0-0.nightly-2020-01-01-081457
4.3.0-0.nightly-2020-01-06-005750


Of the above builds, only 4.4.0-0.nightly-2020-01-07-172830 includes the fix PR.

What QE can do now is ensure that the fix PR does not introduce any regressions.

