Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1506418

Summary: Use FQDN for OPENSHIFT_DEFAULT_REGISTRY to avoid extra DNS query
Product: OpenShift Container Platform
Component: RFE
Version: 3.6.0
Status: CLOSED WONTFIX
Severity: high
Priority: low
Reporter: Takayoshi Kimura <tkimura>
Assignee: Dirk Herrmann <dherrman>
QA Contact: Xiaoli Tian <xtian>
CC: aos-bugs, bparees, dherrman, dsundqvi, jokerman, ksuzumur, mfojtik, mmccomas, suchaudh
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2019-06-11 13:54:16 UTC

Description Takayoshi Kimura 2017-10-26 01:07:06 UTC
Description of problem:

Currently the installer sets the default registry to the short name docker-registry.default.svc:

OPENSHIFT_DEFAULT_REGISTRY=docker-registry.default.svc:5000

On the Docker host, the Docker daemon's DNS query for this name goes to dnsmasq. The host's /etc/resolv.conf has no ndots option, so the resolver tries the raw name first and only then applies the search list. Because the raw name does not end in ".cluster.local", dnsmasq forwards it to the upstream DNS. The next query is docker-registry.default.svc.your.domain.example.com. Only after that is ".cluster.local" appended from the search list, and that query finally reaches OpenShift DNS.

From a container, the query goes through the search list, so a name such as docker-registry.default.svc.your.domain.example.com goes to the upstream DNS before the query reaches OpenShift DNS.

We can specify FQDN and avoid these extra DNS queries.
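
The lookup order described above can be sketched as follows. This is a minimal model, assuming glibc-style stub resolver rules (a name with at least ndots dots is tried as-is first, otherwise the search list is tried first, and a trailing dot disables the search list). The function name candidate_queries and the search list are illustrative assumptions based on the description, not taken from any OpenShift component.

```python
def candidate_queries(name, search, ndots=1):
    """Return the names a glibc-style stub resolver would try, in order.

    Simplified model: ignores timeouts, retries and multiple nameservers.
    """
    # A trailing dot marks the name as absolute: the search list is skipped.
    if name.endswith("."):
        return [name]

    searched = [f"{name}.{domain}" for domain in search]

    # Enough dots: try the name as given first, then the search list.
    if name.count(".") >= ndots:
        return [name] + searched

    # Too few dots: walk the search list first, then try the name as given.
    return searched + [name]


# Hypothetical host search list, in the order the description implies.
HOST_SEARCH = ["your.domain.example.com", "cluster.local"]

if __name__ == "__main__":
    for query in candidate_queries("docker-registry.default.svc", HOST_SEARCH):
        print(query)
```

With the default ndots of 1, the short name produces three candidates and only the last one ends in ".cluster.local". In this model, passing docker-registry.default.svc.cluster.local makes the first candidate the cluster name itself.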


Version-Release number of the following components:

$ rpm -q openshift-ansible
openshift-ansible-3.6.173.0.21-2.git.0.44a4038.el7.noarch
$ rpm -q ansible
ansible-2.3.2.0-2.el7.noarch
$ ansible --version
ansible 2.3.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.5 (default, May  3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]


How reproducible:

Always

Actual results:

Query for OpenShift DNS goes to upstream DNS multiple times.

Expected results:

Query for OpenShift DNS goes to OpenShift DNS directly.

Comment 1 Scott Dodson 2017-10-26 13:25:17 UTC
Is there a measurable impact of this?

It was a deliberate decision not to use the FQDN, because that allowed the flexibility to change the cluster DNS after installation, and I believe there were other reasons too.

We could potentially add an option that lets the admin use the FQDN, but I'm not sure we want to do that just to save one DNS resolution.

NEEDINFO on Michal, who was involved in the decision to rely on the search path.

Comment 4 David Sundqvist 2017-11-30 08:24:03 UTC
For a situation with DNS blackholing, where the upstream DNS responds with an address rather than NXDOMAIN for docker-registry.default.svc, I added a configuration file to /etc/dnsmasq.d containing:

server=/default.svc/

which will cause the first lookup to always immediately fail.

That should also apply in other cases where you don't want the initial lookup to be sent upstream.
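
The effect of that dnsmasq line can be modelled roughly as below. This is a simplified sketch of dnsmasq's documented behaviour, where a server= entry that names a domain but no address marks the domain as local-only (never forwarded) and the most specific matching domain wins. The names match_rule and disposition, the "cluster.local" rule and the "cluster-dns" and "upstream" labels are hypothetical and only serve the illustration.

```python
def match_rule(name, rules):
    """Return (suffix_labels, target) for the most specific matching rule, or None.

    rules maps a domain to a target server; a target of None models
    `server=/domain/` with no address (local-only, never forwarded).
    """
    labels = name.rstrip(".").lower().split(".")
    best = None
    for domain, target in rules.items():
        suffix = domain.strip(".").lower().split(".")
        # Match only on whole-label boundaries at the end of the name.
        if len(suffix) <= len(labels) and labels[-len(suffix):] == suffix:
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, target)
    return best


def disposition(name, rules, default="upstream"):
    """Describe what this simplified model does with a query."""
    rule = match_rule(name, rules)
    if rule is None:
        return f"forward to {default}"
    if rule[1] is None:
        return "local only"
    return f"forward to {rule[1]}"


if __name__ == "__main__":
    # "default.svc": None models the line from this comment; the
    # "cluster.local" rule is an assumed stand-in for the cluster DNS.
    rules = {"default.svc": None, "cluster.local": "cluster-dns"}
    for query in (
        "docker-registry.default.svc",
        "docker-registry.default.svc.your.domain.example.com",
        "docker-registry.default.svc.cluster.local",
    ):
        print(query, "->", disposition(query, rules))
```

In this model only the raw short name is held back from the upstream server. The name with the site domain appended is still forwarded upstream, so the rule removes the first upstream query but not the second.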

I think another reason for not using the FQDN was that it would break the secrets.

Comment 5 Scott Dodson 2017-11-30 16:54:41 UTC
Right, secrets are not generated for the FQDN. I'm going to move this over to the registry component so that they can consider the next steps here. We can't implement this change in the installer until they ensure secrets are generated for the FQDN too.

Comment 9 Sudarshan Chaudhari 2018-11-27 16:50:14 UTC
Hello, 

We are facing an issue similar to the one described in this Bugzilla.

Our DNS server has an entry for *.svc that points to an IP address which is not part of the OCP cluster and is used for a different application.

The DNS settings in the network configuration and dnsmasq are configured properly, but docker-registry.default.svc does not resolve to the service's ClusterIP. Instead it resolves to the IP returned by the upstream DNS. Because of this, the upgrade fails while checking component health. As a workaround we have to manually add the entry "server=/.svc/127.0.0.1/" to the dnsmasq configuration. Using the complete name, docker-registry.default.svc.cluster.local, also resolves to the service's ClusterIP.

Our question: why does the query go to the upstream server instead of being resolved through the search parameter in resolv.conf?

Is there any alternative fix other than adding this line to the dnsmasq configuration?

Thanks.
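
A quick way to tell the two outcomes in this comment apart is to check whether the resolved address lies inside the cluster's service network. The sketch below assumes the service network is 172.30.0.0/16; that value and the function name in_service_network are assumptions for illustration, so substitute the CIDR actually configured for the cluster.

```python
import ipaddress


def in_service_network(address, service_cidr="172.30.0.0/16"):
    """Return True if address falls inside the (assumed) service network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(service_cidr)


if __name__ == "__main__":
    import socket

    # Resolve the short name the way an application on this host would.
    # This needs a working resolver, so failures are reported, not raised.
    name = "docker-registry.default.svc"
    try:
        resolved = socket.gethostbyname(name)
    except OSError as exc:
        print(f"{name}: lookup failed ({exc})")
    else:
        where = "service network" if in_service_network(resolved) else "outside the cluster"
        print(f"{name} -> {resolved} ({where})")
```

An address outside the service network for the short name would match the situation described here, where the upstream wildcard entry answers before the search list is applied.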

Comment 13 Dirk Herrmann 2019-06-11 13:54:16 UTC
You can change your registry URL to avoid this problem, but since this RFE is relatively unique to your environment, it does not justify reworking the installer. Closing as WONTFIX.