Bug 1812409
| Summary: | cannot install on BM or with dev-scripts due to missing etcd mdns records | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Yuval Kashtan <ykashtan> | 
| Component: | Installer | Assignee: | Ben Nemec <bnemec> | 
| Installer sub component: | OpenShift on Bare Metal IPI | QA Contact: | Amit Ugol <augol> | 
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | high | ||
| Priority: | unspecified | CC: | asegurap, deads, kboumedh, m.andre | 
| Version: | 4.4 | ||
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-03-11 13:50:51 UTC | Type: | Bug | 
| Regression: | --- | Mount Type: | --- | 
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||

Description  Yuval Kashtan  2020-03-11 09:37:03 UTC

    If kube-apiserver in 4.4 (not 4.5) is hardcoding etcd-0, etcd-1, etcd-2 somewhere, we should fix it to check the SRV record instead. A workaround is to do the following on each master:
- Append the following snippet to /etc/mdns/config.hcl, replacing $NUMBER with the number of that master.
service {
    name = "ostest EtcdWorkstation"
    host_name = "etcd-$NUMBER.local."
    type = "_workstation._tcp"
    domain = "local."
    port = 42424
    ttl = 300
}
- Restart the mdns publisher static pod with:
crictl stop $(crictl ps | grep mdns | cut -f1 -d" ")
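
The substitution in the workaround above can be scripted so the same step is repeatable on each master. This is a minimal sketch, not part of any OpenShift tooling: the function name render_etcd_mdns_service is hypothetical, and it only prints the stanza from the description with $NUMBER filled in. It writes nothing to disk.

```shell
#!/usr/bin/env bash
# Sketch only: render_etcd_mdns_service is a hypothetical helper, not an
# OpenShift or mdns-publisher command. It prints the service stanza from
# the workaround with $NUMBER replaced by the given master number.
render_etcd_mdns_service() {
    local number="$1"
    # Accept only a plain non-negative integer such as 0, 1, or 2.
    case "$number" in
        ''|*[!0-9]*)
            echo "usage: render_etcd_mdns_service <master-number>" >&2
            return 1
            ;;
    esac
    cat <<EOF
service {
    name = "ostest EtcdWorkstation"
    host_name = "etcd-${number}.local."
    type = "_workstation._tcp"
    domain = "local."
    port = 42424
    ttl = 300
}
EOF
}

# Preview the stanza for master 0 on stdout.
render_etcd_mdns_service 0
```

To apply it, the output would be appended as root on the matching master, for example `render_etcd_mdns_service 1 >> /etc/mdns/config.hcl`, followed by the crictl command above. The cluster name "ostest" is taken from the original snippet and may differ in other environments.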
    They are not hardcoded. Since the beginning of 4.x, we harvest the names from the endpoints and the DNS entries are present: https://github.com/openshift/cluster-kube-apiserver-operator/blob/release-4.1/pkg/operator/configobservation/etcd/observe_etcd.go#L58-L68. In fact, prior to 4.4, it was actually impossible to even start etcd without having DNS entries working, because we used the etcd DNS discovery mechanism. If you're having trouble with this, you probably want to figure out what is wrong with mDNS.

    Separately, in 4.4 we developed an operator that was able to remove the long-standing etcd DNS dependency and we merged the backport of a 4.5 change to the kube-apiserver-operator to use IP addresses (https://github.com/openshift/cluster-kube-apiserver-operator/pull/792), but you should sort out why your DNS is broken. I'm not sure what else in the stack is going to break for you.

    Note that this affected more than just BM. A similar error was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1811530.

    Closing as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1812071 tracking the backport to 4.4.

    *** This bug has been marked as a duplicate of bug 1812071 ***