Bug 1956653 - Removal of CoreDNS pod from remote worker causes api-int resolution error
Summary: Removal of CoreDNS pod from remote worker causes api-int resolution error
Keywords:
Status: NEW
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Beth White
QA Contact: Rei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-05-04 07:40 UTC by Rei
Modified: 2021-05-12 07:03 UTC
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Rei 2021-05-04 07:40:02 UTC
Description of problem:
The epic that removes CoreDNS from the workers causes the api-int record to disappear.

Version-Release number of selected component (if applicable):


How reproducible:
As can be seen in the epic below, CoreDNS was removed from the workers and the workers can no longer resolve api-int:
https://issues.redhat.com/browse/KNIDEPLOY-4329 

Steps to Reproduce:
1. $ ssh kni@provisionhost-0-0
2. $ cd ~/clusterconfigs/manifests
3. $ vi cluster-network-avoid-workers-99-config.yaml

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 50-worker-fix-ipi-rwn
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.1.0
    systemd:
      units:
      - name: nodeip-configuration.service
        enabled: true
        contents: |
          [Unit]
          Description=Writes IP address configuration so that kubelet and crio services select a valid node IP
          Wants=network-online.target
          After=network-online.target ignition-firstboot-complete.service
          Before=kubelet.service crio.service
          [Service]
          Type=oneshot
          ExecStart=/bin/bash -c "exit 0 "
          [Install]
          WantedBy=multi-user.target
    storage:
      files:
        - contents:
            source: data:,
            verification: {}
          filesystem: root
          mode: 420
          path: /etc/kubernetes/manifests/keepalived.yaml
        - contents:
            source: data:,
            verification: {}
          filesystem: root
          mode: 420
          path: /etc/kubernetes/manifests/mdns-publisher.yaml
        - contents:
            source: data:,
            verification: {}
          filesystem: root
          mode: 420
          path: /etc/kubernetes/manifests/coredns.yaml
---
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  nodePlacement:
    nodeSelector:
      matchLabels:
        node-role.kubernetes.io/master: ""
4. $ cp install-config.yaml ~/clusterconfigs 
5. $ ./openshift-baremetal-install --dir ~/clusterconfigs create manifests
6. Configure master nodes to be schedulable by setting the mastersSchedulable field to true, which allows new pods to be placed on master nodes (by default, master nodes are not schedulable):
   $ sed -i "s;mastersSchedulable: false;mastersSchedulable: true;g" clusterconfigs/manifests/cluster-scheduler-02-config.yml
7. Run the OpenShift Installer:
   $ ./openshift-baremetal-install --dir ~/clusterconfigs --log-level debug create cluster
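The mastersSchedulable toggle in the steps above can be sketched self-contained; the manifest contents and the /tmp path here are illustrative stand-ins, not the real cluster-scheduler-02-config.yml:

```shell
# Create a throwaway copy of the scheduler manifest (illustrative contents)
mkdir -p /tmp/clusterconfigs/manifests
cat > /tmp/clusterconfigs/manifests/cluster-scheduler-02-config.yml <<'EOF'
apiVersion: config.openshift.io/v1
kind: Scheduler
spec:
  mastersSchedulable: false
EOF

# Same in-place substitution as the reproduction steps
sed -i "s;mastersSchedulable: false;mastersSchedulable: true;g" \
  /tmp/clusterconfigs/manifests/cluster-scheduler-02-config.yml

# Show the result
grep "mastersSchedulable" /tmp/clusterconfigs/manifests/cluster-scheduler-02-config.yml
```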

Actual results:
Workers do not deploy; they get stuck resolving api-int.

Expected results:
Workers should deploy. We need to understand where and how the api-int record is provided to the system.

Additional info:
You can bypass this issue by adding the api-int record to the external DNS server.
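As a sketch of that workaround, if the external DNS is dnsmasq-based, a single address= record would suffice; the cluster domain and VIP address below are hypothetical placeholders, not values from this bug:

```
# /etc/dnsmasq.d/ocp-api-int.conf (hypothetical cluster name and API VIP)
address=/api-int.mycluster.example.com/192.168.111.5
```

After reloading dnsmasq, the remote workers should resolve api-int via the external server instead of the removed CoreDNS pod.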

Comment 1 Victor Voronkov 2021-05-04 08:09:34 UTC
With Ingress_VIP removal as the default deployment mode, no worker will be deployed successfully; blocker flag set.

