Bug 1689857

Summary: The instance security group does not have an 'echo request' rule for ICMP in Inbound
Product: OpenShift Container Platform Reporter: zhaozhanqi <zzhao>
Component: Installer Assignee: Casey Callendrello <cdc>
Installer sub component: openshift-installer QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: high CC: aos-bugs, bbennett, bleanhar, cdc, erich, gpei, jokerman, mmccomas, nstielau, sdodson
Version: 4.1.0   
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 10:46:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description zhaozhanqi 2019-03-18 09:54:07 UTC
Description of problem:
When setting up an OpenShift 4.0 cluster on AWS, the masters and workers cannot ping each other using their internal IPs.

The security group has only an 'echo reply' rule under 'Inbound'. After I manually added an 'echo request' rule, the nodes were able to ping each other.


Version-Release number of selected component (if applicable):
4.0.0-0.nightly-2019-03-15-063749

How reproducible:
always

Steps to Reproduce:
1. see the Description

Actual results:

Masters and workers cannot ping each other using their internal IPs, which makes debugging some issues harder.

Expected results:
master and workers can ping each other.


Additional info:
I checked this blog post: https://charity.wtf/2016/04/14/scrapbag-of-useful-terraform-tips
For an ICMP rule, from_port holds the ICMP type (0 = echo reply, 8 = echo request), so from_port should be 8 rather than 0 in openshift-installer/data/data/aws/vpc/sg-worker.tf:
  
  resource "aws_security_group_rule" "worker_ingress_icmp" {
    type              = "ingress"
    security_group_id = "${aws_security_group.worker.id}"

    protocol    = "icmp"
    cidr_blocks = ["0.0.0.0/0"]
    from_port   = 0 # ICMP type 0 (echo reply)
    to_port     = 0 # ICMP code 0
  }
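
A minimal sketch of the change suggested above. It assumes the Terraform AWS provider convention that, for protocol "icmp", from_port carries the ICMP type and to_port the ICMP code. The resource name worker_ingress_icmp_echo_request is hypothetical and does not come from the installer repository, and this is not necessarily the change that was eventually merged.

```hcl
# Hypothetical rule: admit ICMP echo request (type 8, code 0) so that
# instances in the worker security group respond to ping.
resource "aws_security_group_rule" "worker_ingress_icmp_echo_request" {
  type              = "ingress"
  security_group_id = "${aws_security_group.worker.id}"

  protocol    = "icmp"
  cidr_blocks = ["0.0.0.0/0"]
  from_port   = 8 # ICMP type 8 = echo request
  to_port     = 0 # ICMP code 0
}
```

Adding a second rule rather than editing the existing one keeps echo reply allowed as well. Whether that is desirable depends on the intended policy.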

Comment 1 Nick Stielau 2019-03-18 17:27:29 UTC
Can you describe the impact of this?  Is the cluster functional?  Is it harder to debug?  Perhaps use the 'As a TYPE_OF_USER, I want to ping between masters and workers, so that I can USER_GOAL' format.

Comment 2 Meng Bo 2019-03-19 03:08:15 UTC
Hi Nick,

We (QE) sometimes need to debug the cluster network, and node-to-node connectivity is one of the checkpoints.

Besides the above, I'd like to know why we set only the `ICMP reply` rule, which may not be enough for ping to work.
And since the nodes will not have public IPs, why do we set the CIDR block to 0.0.0.0/0 instead of a VPC-internal subnet or another security group?

Thanks
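
For illustration, the narrower scoping asked about above could look roughly like the sketch below. The variable vpc_cidr and the resource name worker_ingress_icmp_vpc are hypothetical and are not taken from the installer. The sketch relies on the AWS convention that -1 in both port fields of an ICMP rule matches every ICMP type and code. It is not a description of what the installer does.

```hcl
# Hypothetical input: the CIDR range of the cluster VPC, e.g. "10.0.0.0/16".
variable "vpc_cidr" {
  description = "CIDR block of the cluster VPC (assumed name)"
}

# Hypothetical rule: allow all ICMP types and codes, but only from
# addresses inside the VPC rather than from 0.0.0.0/0.
resource "aws_security_group_rule" "worker_ingress_icmp_vpc" {
  type              = "ingress"
  security_group_id = "${aws_security_group.worker.id}"

  protocol    = "icmp"
  cidr_blocks = ["${var.vpc_cidr}"]
  from_port   = -1 # -1 = all ICMP types
  to_port     = -1 # -1 = all ICMP codes
}
```

Referencing another security group instead of a CIDR block would use source_security_group_id in place of cidr_blocks. The two arguments cannot be combined in a single aws_security_group_rule.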

Comment 3 Casey Callendrello 2019-04-08 14:39:08 UTC
Hang on - if we really block ICMP between nodes, then it's definitely a bug. We 100% need ICMP internal to the VPC to be completely unblocked.

I'm checking now.

Comment 4 Scott Dodson 2019-04-08 19:41:21 UTC
Please let us know what you determine.

Comment 5 Casey Callendrello 2019-04-09 09:05:12 UTC
Filed PR https://github.com/openshift/installer/pull/1550

Comment 7 zhaozhanqi 2019-04-22 07:26:58 UTC
Tested this bug on 4.1.0-0.nightly-2019-04-22-005054; the issue has been fixed.

Comment 10 errata-xmlrpc 2019-06-04 10:46:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758