Bug 1842741 - DNS operator performs spurious updates in response to API's defaulting of service's session affinity & type or daemonset's volumes' default modes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Miciah Dashiel Butler Masters
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-06-02 02:33 UTC by Miciah Dashiel Butler Masters
Modified: 2022-08-04 22:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: When the DNS operator reconciles a DNS or Service object, the operator determines whether it needs to update the object by constructing an expected object in memory, getting the actual object from the API, and comparing the two. The operator leaves some values unspecified in its expected DNS and Service objects. When the API set default values for these unspecified values, the comparison would return a false positive.
Consequence: The operator was repeatedly trying to update DNS and Service objects in response to the API's setting default values.
Fix: The operator now considers unspecified values and default values to be equal when comparing DNS and Service objects.
Result: The operator should no longer update a DNS or Service object in response to API defaulting.
Clone Of:
Environment:
Last Closed: 2020-10-27 16:03:33 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-dns-operator pull 174 | 0 | None | closed | Bug 1842741: Fix serviceChanged and daemonsetConfigChanged | 2020-07-14 06:04:07 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:03:35 UTC

Description Miciah Dashiel Butler Masters 2020-06-02 02:33:09 UTC
Description of problem:

When the DNS operator reconciles the DNS, it gets the DNS's daemonset and service (if they exist) from the API to determine whether it needs to create or update them.  If the daemonset or service does not exist, the operator creates it, leaving some API fields empty, such as the service's spec.sessionAffinity and spec.type fields.  If the daemonset or service does exist, the operator compares it with the object it expects in order to determine whether an update is needed.  In this comparison, if the API has set the default value for the daemonset's volumes' default mode fields or the service's spec.sessionAffinity and spec.type fields, the operator treats the defaulted value as a change and tries to set the fields back to the empty value.  The operator should not update the daemonset or service in response to API defaulting.
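
The false positive described above can be illustrated with a minimal Go sketch. It is not the operator's actual code: ServiceSpec, withDefaults, naiveChanged, and serviceChangedSketch are hypothetical, simplified stand-ins, and the only facts assumed about the API are that it defaults an empty spec.sessionAffinity to "None" and an empty spec.type to "ClusterIP".

```go
package main

import "fmt"

// ServiceSpec is a hypothetical, simplified stand-in for the two service
// fields named in this bug. The operator itself uses the Kubernetes API types.
type ServiceSpec struct {
	SessionAffinity string // the API defaults "" to "None"
	Type            string // the API defaults "" to "ClusterIP"
}

// withDefaults returns a copy of spec in which empty fields are replaced by
// the values the API would fill in.
func withDefaults(spec ServiceSpec) ServiceSpec {
	if spec.SessionAffinity == "" {
		spec.SessionAffinity = "None"
	}
	if spec.Type == "" {
		spec.Type = "ClusterIP"
	}
	return spec
}

// naiveChanged compares the raw values, so an unspecified field and its
// API default look different (the behavior reported in this bug).
func naiveChanged(current, expected ServiceSpec) bool {
	return current != expected
}

// serviceChangedSketch (hypothetical name) treats unspecified and default
// values as equal by normalizing both sides before comparing.
func serviceChangedSketch(current, expected ServiceSpec) bool {
	return withDefaults(current) != withDefaults(expected)
}

func main() {
	expected := ServiceSpec{} // what the operator builds in memory
	defaulted := ServiceSpec{SessionAffinity: "None", Type: "ClusterIP"}
	patched := ServiceSpec{SessionAffinity: "ClientIP", Type: "ClusterIP"}

	fmt.Println(naiveChanged(defaulted, expected))         // true: spurious update
	fmt.Println(serviceChangedSketch(defaulted, expected)) // false: defaulting ignored
	fmt.Println(serviceChangedSketch(patched, expected))   // true: real change detected
}
```

Normalizing both sides before comparing keeps a genuine change, such as the ClientIP patch in the reproduction steps, detectable while defaulting alone no longer counts as a difference.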


Steps to Reproduce:

1. Launch a new cluster.

2. Modify the default DNS service's session affinity:

    oc -n openshift-dns patch svc/dns-default --type=strategic --patch='{"spec":{"sessionAffinity":"ClientIP"}}'

3. Check the DNS operator's logs:

    oc -n openshift-dns-operator logs deploy/dns-operator -c dns-operator


Actual results:

The DNS operator's logs repeat "updated dns service" and "updated dns daemonset" multiple times.


Expected results:

The DNS operator should ignore when the API sets default values and should not log "updated dns daemonset" or "updated dns service" unless the daemonset or service is updated outside of API defaulting.
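
For the daemonset side, ignoring API defaulting of the volumes' default mode could look roughly like the following Go sketch. ConfigMapVolume, effectiveMode, and volumesEqual are hypothetical names invented for illustration, the volume name is made up, and the sketch assumes only that the API sets an unspecified default mode to 0644.

```go
package main

import "fmt"

// ConfigMapVolume is a hypothetical, simplified stand-in for a daemonset
// volume. DefaultMode is nil when the operator leaves it unspecified.
type ConfigMapVolume struct {
	Name        string
	DefaultMode *int32
}

// apiDefaultMode is the mode the API sets when DefaultMode is unspecified.
const apiDefaultMode int32 = 0644

// effectiveMode returns the explicit mode, or the API default if none is set.
func effectiveMode(v ConfigMapVolume) int32 {
	if v.DefaultMode == nil {
		return apiDefaultMode
	}
	return *v.DefaultMode
}

// volumesEqual compares two volume lists, treating an unspecified default
// mode and the API default as the same value.
func volumesEqual(a, b []ConfigMapVolume) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i].Name != b[i].Name || effectiveMode(a[i]) != effectiveMode(b[i]) {
			return false
		}
	}
	return true
}

func main() {
	defaulted := int32(0644)
	restricted := int32(0600)

	expected := []ConfigMapVolume{{Name: "example-volume"}}
	fromAPI := []ConfigMapVolume{{Name: "example-volume", DefaultMode: &defaulted}}
	modified := []ConfigMapVolume{{Name: "example-volume", DefaultMode: &restricted}}

	fmt.Println(volumesEqual(fromAPI, expected))  // true: no update needed
	fmt.Println(volumesEqual(modified, expected)) // false: update needed
}
```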

Comment 1 Miciah Dashiel Butler Masters 2020-06-18 19:20:55 UTC
A PR is posted and awaiting review.  We'll merge it next sprint.

Comment 2 Andrew McDermott 2020-07-09 12:03:36 UTC
I'm adding UpcomingSprint, because I was occupied by fixing bugs with
higher priority/severity, developing new features with higher
priority, or developing new features to improve stability at a macro
level. I will revisit this bug next sprint.

Comment 6 Hongan Li 2020-07-16 06:45:09 UTC
Verified with 4.6.0-0.nightly-2020-07-15-170241; the issue has been fixed.
Followed the reproduction steps and saw only one log entry, "updated dns service: openshift-dns/dns-default".

In another 4.5 cluster without the fix, multiple log entries appear, as below:
time="2020-07-16T06:36:05Z" level=info msg="updated dns daemonset: openshift-dns/dns-default"
time="2020-07-16T06:36:05Z" level=info msg="updated dns service: openshift-dns/dns-default"
time="2020-07-16T06:36:06Z" level=info msg="updated dns daemonset: openshift-dns/dns-default"
time="2020-07-16T06:36:06Z" level=info msg="updated dns service: openshift-dns/dns-default"
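
To check for repeated updates without reading the log by eye, a small Go sketch such as the following could count the relevant messages. It assumes only the message text shown in the log lines above; countUpdates is a hypothetical helper, not part of the operator or of oc.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// countUpdates counts log lines that report an update to the DNS service
// or daemonset, keyed by "service" and "daemonset".
func countUpdates(lines []string) map[string]int {
	counts := map[string]int{}
	for _, line := range lines {
		for _, kind := range []string{"service", "daemonset"} {
			if strings.Contains(line, "updated dns "+kind) {
				counts[kind]++
			}
		}
	}
	return counts
}

func main() {
	// Read the operator log from standard input, one entry per line.
	var lines []string
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	counts := countUpdates(lines)
	fmt.Printf("service updates: %d, daemonset updates: %d\n",
		counts["service"], counts["daemonset"])
}
```

The output of the log command from step 3 can be piped into this program. After the single patch in step 2, a fixed cluster would be expected to show one service update, while counts that keep growing point to the spurious updates described in this bug.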

Comment 8 errata-xmlrpc 2020-10-27 16:03:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

