Description of problem: When an mDNS host attempts to publish a SRV record, the mdns plugin for coredns overwrites the specified value before returning it in queries. This behavior was added to address a use case that no longer exists (and that was solved in a different way in any case), so it is no longer needed. In addition, it breaks some new functionality that requires SRV records to be published verbatim.
To reproduce, have a host publish a SRV mDNS record for coredns-mdns to consume, then query for that record. If the original record is named foo.example.com, the value returned from the query will be something like etcd-foo.example.com instead.
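The reproduction check can be scripted. The following is a minimal sketch, not part of the original report: the helper names srv_targets and srv_published_verbatim are hypothetical, and it assumes the query output has the "has SRV record" line format printed by host -t SRV.

```shell
# Print the target host of every SRV line in `host -t SRV` output (stdin),
# with the trailing dot removed. (Hypothetical helper for illustration.)
srv_targets() {
    awk '/has SRV record/ { t = $NF; sub(/\.$/, "", t); print t }'
}

# Succeed only if the published name ($1) appears verbatim among the targets.
# A rewritten target such as etcd-foo.example.com does not match foo.example.com.
srv_published_verbatim() {
    srv_targets | grep -Fxq -- "$1"
}
```

For example, host -t SRV <service name> | srv_published_verbatim foo.example.com would be expected to fail while the overwrite is in place and to succeed once records are returned verbatim.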
Is https://github.com/openshift/coredns/pull/25 a fix for this? It's not associated with the bz.
Toni, can you handle this one please?
Yes, 25 is the fix for this, although I guess I need a copy of the bug targeted against 4.5 as well. I'll get it cloned.
Oh, there is no 4.5 version available on bz yet. That's why I hadn't opened anything for it. This one was to allow backporting since we'll need it in 4.4, which is why I didn't reference it in the 4.5 PR.
[kni@provisionhost-0-0 ~]$ oc version
Client Version: 4.5.0-0.nightly-2020-05-20-053050
Server Version: 4.5.0-0.nightly-2020-05-20-053050
Kubernetes Version: v1.18.2
According to Ben Nemec:
For the purposes of this bug, the main thing is to verify that the SRV record points at what you specify, not the hard-coded CNAME from coredns-mdns.
In order to verify:
1. Log in to master-0 from kni@provisionhost: ssh core@master-0
2. Gain root access: sudo -s
3. Copy the mdns configuration: cp /etc/mdns/config.hcl /etc/mdns/config1.hcl
4. Change the host_name in /etc/mdns/config1.hcl to another name (for example: "master-0-0-0")
5. Run: sudo crictl ps | grep mdns
6bee170740088 93b7d3550406466da140ce16bdbe635240993eaf6bd91df0bf8eec9bf3605ab2 12 minutes ago Running mdns-publisher 1 7e88cff203314
6. Run: sudo crictl stop 6bee170740088
7. Run: host -t SRV _etcd-server-ssl._tcp.ocp-edge-cluster-0.qe.lab.redhat.com
_etcd-server-ssl._tcp.ocp-edge-cluster-0.qe.lab.redhat.com has SRV record 0 10 2380 master-0-2.ocp-edge-cluster-0.qe.lab.redhat.com.
_etcd-server-ssl._tcp.ocp-edge-cluster-0.qe.lab.redhat.com has SRV record 0 10 2380 master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com.
_etcd-server-ssl._tcp.ocp-edge-cluster-0.qe.lab.redhat.com has SRV record 0 10 2380 master-0-0-0.ocp-edge-cluster-0.qe.lab.redhat.com.
Note that the last record now shows the new name set in step 4.
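Step 4 can also be done without an editor. The sketch below is an illustration only: set_host_name is a hypothetical helper, and it assumes the configuration contains a line of the form host_name = "some-name", which may not match the real file exactly.

```shell
# Replace the quoted host_name value in a config file ($1) with a new name ($2).
# Assumption: the file has a line like:   host_name = "master-0"
# The new name is expected to be a plain host name (no '/', '&' or quotes).
set_host_name() {
    file=$1
    new_name=$2
    tmp=$(mktemp) || return 1
    # Keep everything up to the opening quote, swap only the quoted value.
    sed 's/^\([[:space:]]*host_name[[:space:]]*=[[:space:]]*\)"[^"]*"/\1"'"$new_name"'"/' "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # write back in place so file permissions are preserved
    status=$?
    rm -f "$tmp"
    return $status
}
```

With the copy from step 3 in place, this would be called as set_host_name /etc/mdns/config1.hcl master-0-0-0.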
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.