Bug 1989403 - Machine-config degraded for unknown reason [NEEDINFO]
Summary: Machine-config degraded for unknown reason
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Sinny Kumari
QA Contact: Rio Liu
URL:
Whiteboard:
Depends On: 1918440
Blocks:
Reported: 2021-08-03 07:12 UTC by Jaspreet Kaur
Modified: 2021-09-09 01:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-09 01:52:52 UTC
Target Upstream Version:
jerzhang: needinfo? (jkaur)


Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2720 0 None None None 2021-08-18 10:42:05 UTC
Red Hat Product Errata RHBA-2021:3395 0 None None None 2021-09-09 01:53:14 UTC

Description Jaspreet Kaur 2021-08-03 07:12:17 UTC
Description of problem: The user recently upgraded the cluster. Several issues that appeared initially were resolved, but then the machine-config-daemon restarted, and its logs show:

I0803 06:06:40.894442    3790 update.go:1270] /etc/systemd/system/multi-user.target.wants/node-valid-hostname.service already exists. Not making a new symlink
I0803 06:06:40.894466    3790 update.go:1357] Writing systemd unit "nodeip-configuration.service"
I0803 06:06:40.895754    3790 update.go:1290] /etc/systemd/system/multi-user.target.wants/nodeip-configuration.service was not present. No need to remove
I0803 06:06:40.895918    3790 update.go:1279] Enabled openvswitch.service
I0803 06:06:40.895939    3790 update.go:1357] Writing systemd unit "ovs-configuration.service"
I0803 06:06:40.897502    3790 update.go:1270] /etc/systemd/system/multi-user.target.wants/ovs-configuration.service already exists. Not making a new symlink
I0803 06:06:40.897524    3790 update.go:1323] Writing systemd unit dropin "10-ovs-vswitchd-restart.conf"
I0803 06:06:40.899093    3790 update.go:1323] Writing systemd unit dropin "10-ovsdb-restart.conf"
I0803 06:06:40.900679    3790 update.go:1279] Enabled ovsdb-server.service
I0803 06:06:40.900702    3790 update.go:1323] Writing systemd unit dropin "10-mco-default-env.conf"
I0803 06:06:40.902350    3790 update.go:1323] Writing systemd unit dropin "mco-disabled.conf"
I0803 06:06:40.904271    3790 update.go:1279] Enabled usbguard.service
I0803 06:06:40.904293    3790 update.go:1121] Deleting stale data
E0803 06:06:40.904426    3790 writer.go:135] Marking Degraded due to: exit status 1


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Upgrade cluster to 4.6
2. The upgrade fails for an unknown reason
3.

Actual results: Machine-config degraded for the worker pool for an unknown reason.


Expected results: The upgrade should complete smoothly, or the failure should be reported with a meaningful message.


Additional info:

Comment 3 Sinny Kumari 2021-08-17 14:50:44 UTC
From the MCD pod logs:
...
I0803 06:00:47.350972    3790 update.go:1676] Starting update from rendered-worker-7298552652d8f48f49c045e906ef5fa3 to rendered-worker-7298552652d8f48f49c045e906ef5fa3: &{osUpdate:false kargs:false fips:false passwd:false files:false units:false kernelType:false extensions:false}
...
I0803 06:04:00.049156    3790 update.go:1676] Running rpm-ostree [kargs --delete=audit_backlog_limit=8192 --delete=audit=1 --delete=nousb --delete=page_poison=1 --delete=pti=on --delete=vsyscall=none --append=audit_backlog_limit=8192 --append=audit=1 --append=nousb --append=page_poison=1 --append=pti=on --append=vsyscall=none]
I0803 06:04:00.285518    3790 update.go:375] Rolling back applied changes to OS due to error: exit status 1
I0803 06:04:00.285573    3790 rpm-ostree.go:261] Running captured: rpm-ostree cleanup -p
...

This looks like an instance of bug https://bugzilla.redhat.com/show_bug.cgi?id=1918440, where we unnecessarily update kernel arguments even though no changes were made.
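
To illustrate the no-op in the rpm-ostree call above, here is a minimal Go sketch of the general idea: compute the difference between the old and new kernel arguments and skip the rpm-ostree invocation when nothing changed. This is not the code from the MCO or from the linked pull request; the function name kargsDiff is hypothetical, and the sketch treats the arguments as a set, so it ignores ordering and duplicate entries.

```go
package main

import "fmt"

// kargsDiff is a hypothetical helper (not part of the MCO). It returns the
// --delete/--append flags needed to move from oldArgs to newArgs.
// Assumption: arguments are compared as a set; duplicates and order are ignored.
func kargsDiff(oldArgs, newArgs []string) []string {
	oldSet := make(map[string]bool, len(oldArgs))
	for _, a := range oldArgs {
		oldSet[a] = true
	}
	newSet := make(map[string]bool, len(newArgs))
	for _, a := range newArgs {
		newSet[a] = true
	}

	var flags []string
	// Delete only the arguments that are no longer wanted.
	for _, a := range oldArgs {
		if !newSet[a] {
			flags = append(flags, "--delete="+a)
		}
	}
	// Append only the arguments that are not already present.
	for _, a := range newArgs {
		if !oldSet[a] {
			flags = append(flags, "--append="+a)
		}
	}
	return flags
}

func main() {
	// The same argument list appears on both sides of the logged command.
	kargs := []string{
		"audit_backlog_limit=8192", "audit=1", "nousb",
		"page_poison=1", "pti=on", "vsyscall=none",
	}

	flags := kargsDiff(kargs, kargs)
	if len(flags) == 0 {
		fmt.Println("kernel arguments unchanged; skipping rpm-ostree kargs")
		return
	}
	fmt.Println("rpm-ostree kargs", flags)
}
```

With the argument list from the log, the old and new sets are identical, so the sketch produces no flags and would not call rpm-ostree at all.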

Comment 8 errata-xmlrpc 2021-09-09 01:52:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.44 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3395

