1657769 – sync-pod tried to override the hostname

Summary: sync-pod tried to override the hostname

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Installer
Sub Component:
Version:	3.11.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	low
Target Milestone:	---
Target Release:	3.11.z
Assignee:	Scott Dodson
QA Contact:	Weihua Meng
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1657768
Depends On:
Blocks:	1687803

Reported:	2018-12-10 12:11 UTC by Fatima
Modified:	2019-04-11 05:38 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	If an override hostname hadn't been set the sync script generated an error in the sync logs. That error message is no longer present in situations where /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE is not present.
Clone Of:
Clones:	1687803
Environment:
Last Closed:	2019-04-11 05:38:23 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0636	0	None	None	None	2019-04-11 05:38:38 UTC

Description Fatima 2018-12-10 12:11:38 UTC

Description of problem:

In /var/log/messages I found the following messages:
Dec  7 14:28:56 node.example.com journal: cat: /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE: No such file or directory
Dec  7 14:28:57 node.example.com journal: node/node.example.com annotated

I checked sync.yaml and found that the script does not check whether the file exists before reading it (line 115):
https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_node_group/files/sync.yaml#L115


Version-Release number of selected component (if applicable):

OCP v3.11


Actual results: 
Dec  7 14:28:56 node.example.com journal: cat: /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE: No such file or directory

Expected results:
This should not be checked as the file is present.

Comment 1 Fatima 2018-12-10 12:29:00 UTC

Correction:

Expected results:
The script should check whether the file is present.

Comment 2 Eric Paris 2018-12-10 20:13:21 UTC

*** Bug 1657768 has been marked as a duplicate of this bug. ***

Comment 3 Fatima 2018-12-31 11:59:02 UTC

Hi team,

Any updates?

Comment 4 Fatima 2018-12-31 11:59:31 UTC

Hi team,

Any updates?

Comment 6 Scott Dodson 2019-01-03 18:37:50 UTC

Can you detail what undesired behavior is actually being experienced? Reading this bug it appears that it's just a log entry emitted periodically to the journal.

Comment 9 cjonagam 2019-01-20 15:10:55 UTC

I'm seeing the same issue.

Comment 11 Alberto Gonzalez de Dios 2019-02-13 13:10:46 UTC

@Scott, Michael: Could you confirm this solution? I already tested in a lab, and I can write a KCS based on this solution. 

- For install/upgrade:
Change line 115 of roles/openshift_node_group/files/sync.yaml:

Before:
KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE) || :

Now:
KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE 2>/dev/null) || :

- For a working cluster:
$ oc edit daemonset sync -n openshift-node

Search for line KUBELET_HOSTNAME_OVERRIDE:

Before:
KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE) || :

Now:
KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE 2>/dev/null) || :


Note: This "issue" was introduced in PR 10343 [1]. Although it's just an error log line, it can generate too many error lines that are not useful.

[1] https://github.com/openshift/openshift-ansible/pull/10343
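
For anyone who wants to see the effect of the one-line change without touching a cluster, here is a minimal sketch of the "Before" and "Now" lines run against a file that does not exist. The path /nonexistent/KUBELET_HOSTNAME_OVERRIDE and the variables before_err and after_err are assumptions made up for this illustration; they are not part of the sync script.

```shell
#!/bin/sh
# Hypothetical stand-in for /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE on a node
# where no override has been configured.
missing_file="/nonexistent/KUBELET_HOSTNAME_OVERRIDE"

# "Before": cat writes its error message to stderr, which we capture here
# instead of letting it reach the pod log.
before_err="$( { KUBELET_HOSTNAME_OVERRIDE=$(cat "$missing_file") || : ; } 2>&1 )"

# "Now": cat's stderr is discarded, so nothing is left to capture.
after_err="$( { KUBELET_HOSTNAME_OVERRIDE=$(cat "$missing_file" 2>/dev/null) || : ; } 2>&1 )"

echo "before: ${before_err:-<no output>}"
echo "after:  ${after_err:-<no output>}"

# In both variants the variable ends up empty and "|| :" keeps the exit
# status at zero, so the rest of the script continues either way.
KUBELET_HOSTNAME_OVERRIDE=$(cat "$missing_file" 2>/dev/null) || :
echo "override: '${KUBELET_HOSTNAME_OVERRIDE}'"
```

The only behavioural difference in this sketch is the suppressed error line; the value assigned to the variable is the same in both cases.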

Comment 13 Alberto Gonzalez de Dios 2019-03-05 09:33:48 UTC

I've just created KCS 3958481 [1] regarding this issue and solution. I'll also submit a PR to fix this issue. 

[1] https://access.redhat.com/solutions/3958481

Comment 14 Scott Dodson 2019-03-11 18:38:34 UTC

https://github.com/openshift/openshift-ansible/pull/11337

Comment 15 Weihua Meng 2019-03-14 08:38:16 UTC

I cannot reproduce this issue.

Could you provide steps to reproduce it?

Thanks.

openshift-ansible-3.11.70-1.git.0.aa15bf2.el7

# cat /etc/sysconfig/atomic-openshift-node
DEBUG_LOGLEVEL=8

# cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE
cat: /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE: No such file or directory

# oc get daemonset.apps/sync -o yaml | grep KUBELET_HOSTNAME_OVERRIDE
            KUBELET_HOSTNAME_OVERRIDE=$(cat /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE) || :

# grep /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE /var/log/messages | grep "No such file" | wc -l
0

Comment 16 Alberto Gonzalez de Dios 2019-03-14 12:00:24 UTC

Hi Weihua,

I can no longer find the error in /var/log/messages, but I can still see it with oc logs:

[user@master-0 ~]$ oc get pods
NAME         READY     STATUS    RESTARTS   AGE
sync-dmq8n   1/1       Running   0          1m
sync-jkgjn   1/1       Running   0          1m
sync-vm4dm   1/1       Running   0          1m

[user@master-0 ~]$ oc logs sync-jkgjn
cat: /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE: No such file or directory
info: Configuration changed, restarting kubelet
info: Applying node labels node-role.kubernetes.io/master=true

After the fix, you should no longer get the error (I applied the fix with oc edit ds sync):
[user@master-0 ~]$ oc edit ds sync -n openshift-node
daemonset.extensions/sync edited
[user@master-0 ~]$ oc get pods -n openshift-node
NAME         READY     STATUS        RESTARTS   AGE
sync-jkgjn   1/1       Running       0          7m
sync-vm4dm   0/1       Terminating   0          7m
[user@master-0 ~]$ oc get pods -n openshift-node
NAME         READY     STATUS              RESTARTS   AGE
sync-b6whk   1/1       Running             0          11s
sync-gj4vs   0/1       ContainerCreating   0          0s
sync-xz2vj   1/1       Running             0          11s
[user@master-0 ~]$ oc logs sync-b6whk -n openshift-node 
info: Configuration changed, restarting kubelet
info: Applying node labels node-role.kubernetes.io/infra=true 
[user@master-0 ~]$
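
The manual check above can be sketched as a small filter that counts the offending line in whatever log text is piped into it. The function name count_override_errors and the sample_log variable are hypothetical helpers for this illustration only; the sample text is copied from the oc logs output shown in this comment.

```shell
#!/bin/sh
# Hypothetical helper: counts lines on stdin that report the missing
# override file. grep -c exits non-zero when the count is 0, so "|| :"
# keeps the helper usable under "set -e".
count_override_errors() {
    grep -c 'KUBELET_HOSTNAME_OVERRIDE: No such file or directory' || :
}

# Sample taken from the sync pod log before the fix.
sample_log='cat: /etc/sysconfig/KUBELET_HOSTNAME_OVERRIDE: No such file or directory
info: Configuration changed, restarting kubelet
info: Applying node labels node-role.kubernetes.io/master=true'

printf '%s\n' "$sample_log" | count_override_errors

# On a cluster, the same filter could be fed from a pod log, for example:
#   oc logs sync-jkgjn -n openshift-node | count_override_errors
```

A count of zero for every sync pod would match the expected state after the fix.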

Comment 17 Weihua Meng 2019-03-15 01:39:27 UTC

Thanks, Alberto.


Fixed.

openshift-ansible-3.11.95-1.git.0.d080cce.el7

Comment 19 Weihua Meng 2019-03-21 02:13:57 UTC

Fixed.

openshift-ansible-3.11.98-1.git.0.3cfa7c3.el7

# oc logs -n openshift-node sync-x2zz5
info: Configuration changed, restarting kubelet
info: Applying node labels role=node registry=enabled router=enabled 
node/ip-172-18-2-224.ec2.internal labeled
node/ip-172-18-2-224.ec2.internal annotated
node/ip-172-18-2-224.ec2.internal annotated
node/ip-172-18-2-224.ec2.internal annotated
node/ip-172-18-2-224.ec2.internal annotated

Comment 21 errata-xmlrpc 2019-04-11 05:38:23 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0636
