Bug 1877880 - Access attempt to api.* intermittently blocked or wrongly reported not working
Summary: Access attempt to api.* intermittently blocked or wrongly reported not working
Keywords:
Status: CLOSED DUPLICATE of bug 1878794
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-apiserver
Version: 4.6.z
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Luis Sanchez
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-10 16:20 UTC by Peter Larsen
Modified: 2020-10-02 14:11 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-02 14:11:32 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
Red Hat Bugzilla 1878794 (priority: high, status: CLOSED): Azure: pod on the pod network cannot dial internal/external kube-apiserver load balancer (last updated 2021-02-22 00:41:40 UTC)

Description Peter Larsen 2020-09-10 16:20:53 UTC
Description of problem:
Not sure whether this is an installation problem or an apiserver problem (the errors show up in the apiserver events).

Using the install-config.yaml snippet below, a fresh Azure MAG (Microsoft Azure Government) installation reports a steady stream of connectivity failures:
openshift-apiserver                                0s          Warning   ConnectivityOutageDetected                   deployment/apiserver                                              Connectivity outage detected: load-balancer-api-external: failed to establish a TCP connection to api.sep102.rhocp.us:6443: dial tcp 10.1.10.4:6443: i/o timeout
openshift-apiserver                                0s          Normal    ConnectivityRestored                         deployment/apiserver                                              Connectivity restored after 1.999024264s: load-balancer-api-external: tcp connection to api.sep102.rhocp.us:6443 succeeded
openshift-apiserver                                0s          Normal    ConnectivityRestored                         deployment/apiserver                                              Connectivity restored after 1.968294869s: load-balancer-api-internal: tcp connection to api-int.sep102.rhocp.us:6443 succeeded
openshift-apiserver                                0s          Warning   ConnectivityOutageDetected                   deployment/apiserver                                              Connectivity outage detected: load-balancer-api-internal: failed to establish a TCP connection to api-int.sep102.rhocp.us:6443: dial tcp 10.1.10.4:6443: i/o timeout
openshift-apiserver                                0s          Warning   ConnectivityOutageDetected                   deployment/apiserver                                              Connectivity outage detected: load-balancer-api-internal: failed to establish a TCP connection to api-int.sep102.rhocp.us:6443: dial tcp 10.1.10.4:6443: i/o timeout
openshift-apiserver                                0s          Normal    ConnectivityRestored                         deployment/apiserver                                              Connectivity restored after 967.559682ms: load-balancer-api-internal: tcp connection to api-int.sep102.rhocp.us:6443 succeeded
openshift-apiserver                                0s          Warning   ConnectivityOutageDetected                   deployment/apiserver                                              Connectivity outage detected: load-balancer-api-external: failed to establish a TCP connection to api.sep102.rhocp.us:6443: dial tcp 10.1.10.4:6443: i/o timeout
openshift-apiserver                                0s          Normal    ConnectivityRestored                         deployment/apiserver                                              Connectivity restored after 998.429181ms: load-balancer-api-external: tcp connection to api.sep102.rhocp.us:6443 succeeded
openshift-apiserver                                0s          Warning   ConnectivityOutageDetected                   deployment/apiserver                                              Connectivity outage detected: load-balancer-api-external: failed to establish a TCP connection to api.sep102.rhocp.us:6443: dial tcp 10.1.10.4:6443: i/o timeout
openshift-apiserver                                0s          Normal    ConnectivityRestored                         deployment/apiserver                                              Connectivity restored after 4.044180138s: load-balancer-api-external: tcp connection to api.sep102.rhocp.us:6443 succeeded
openshift-apiserver                                0s          Warning   ConnectivityOutageDetected                   deployment/apiserver                                              Connectivity outage detected: load-balancer-api-internal: failed to establish a TCP connection to api-int.sep102.rhocp.us:6443: dial tcp 10.1.10.4:6443: i/o timeout
openshift-apiserver                                0s          Normal    ConnectivityRestored                         deployment/apiserver                                              Connectivity restored after 1.997010696s: load-balancer-api-internal: tcp connection to api-int.sep102.rhocp.us:6443 succeeded
openshift-apiserver                                0s          Warning   ConnectivityOutageDetected                   deployment/apiserver                                              Connectivity outage detected: load-balancer-api-internal: failed to establish a TCP connection to api-int.sep102.rhocp.us:6443: dial tcp 10.1.10.4:6443: i/o timeout
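
The ConnectivityOutageDetected/ConnectivityRestored pairs above are emitted by the apiserver's connectivity checks, which boil down to plain TCP dials against the load-balancer endpoints. As a rough sketch of what the failing probe amounts to (not the actual checker code; the host names are taken from the events above and the 5-second timeout is an assumption):

package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Endpoints taken from the events above; the 5s timeout is an assumption.
	endpoints := []string{
		"api.sep102.rhocp.us:6443",     // load-balancer-api-external
		"api-int.sep102.rhocp.us:6443", // load-balancer-api-internal
	}
	for _, ep := range endpoints {
		start := time.Now()
		conn, err := net.DialTimeout("tcp", ep, 5*time.Second)
		if err != nil {
			// Corresponds to the "dial tcp ...: i/o timeout" in the events.
			fmt.Printf("outage: %s: %v\n", ep, err)
			continue
		}
		conn.Close()
		fmt.Printf("ok: %s after %s\n", ep, time.Since(start))
	}
}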


install-config.yaml snippet:
platform:
  azure:
    baseDomainResourceGroupName: MAG-GlobalDNS
    region: usgovvirginia
    cloudName: AzureUSGovernmentCloud
    virtualNetwork: vnet-bastion
    networkResourceGroupName: MAG-isobastion
    controlPlaneSubnet: isoControl
    computeSubnet: isoCompute
publish: Internal
=== EOF ===

Given "publish: Internal", the external load balancer does not allow ingress traffic, so all access through it is expected to be blocked. Access via api-int should succeed, but it appears to be blocked as well. A manual probe can help narrow this down (see the command below).
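
For manual triage from a VM inside the VNet, an unauthenticated HTTPS probe of the internal endpoint helps separate DNS, load-balancer, and backend failures (host name taken from the events above; /readyz is normally served to anonymous clients, and -k skips certificate verification):

curl -kv https://api-int.sep102.rhocp.us:6443/readyz

A hang ending in a timeout points at the load balancer or network path, matching the i/o timeouts in the events; an immediate HTTP response would exonerate them.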

Version-Release number of selected component (if applicable):


How reproducible:
Every time with the latest 4.6 nightly builds

Steps to Reproduce:
1. Install on MAG using the install-config.yaml snippet above (example command below)
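
For reference, with the install-config.yaml in place the installation is driven by openshift-install (directory name illustrative):

openshift-install create cluster --dir=<install-dir> --log-level=info

The events quoted in the description can then be watched with:

oc get events -n openshift-apiserver -w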

Comment 1 Peter Larsen 2020-09-16 21:40:56 UTC
h/t @mary Newby for finding the BZ https://bugzilla.redhat.com/show_bug.cgi?id=1878794 - this seems very much related.

Comment 2 Peter Larsen 2020-09-16 21:47:24 UTC
Tested with "publish: External" and the resulting cluster has the same problem. Consistent with BZ 1878794.

Comment 3 Luis Sanchez 2020-10-02 14:11:32 UTC

*** This bug has been marked as a duplicate of bug 1878794 ***

