Bug 2075126 - validate and report error during kernelArguments mismatch
Summary: validate and report error during kernelArguments mismatch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.11.z
Assignee: Colin Walters
QA Contact: Rio Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-13 16:39 UTC by Nabeel Cocker
Modified: 2023-09-18 04:35 UTC
26 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-24 04:14:25 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 3105 0 None Merged daemon: Validate kernel arguments 2022-05-04 13:52:48 UTC
Red Hat Issue Tracker COS-1869 0 None None None 2022-11-07 18:42:42 UTC
Red Hat Product Errata RHSA-2022:8535 0 None None None 2022-11-24 04:14:30 UTC

Description Nabeel Cocker 2022-04-13 16:39:29 UTC
OCP 4.7.37

This is an upgrade from OCP4.7.24 to OCP 4.7.37

Details are in this case : https://access.redhat.com/support/cases/#/case/03193685


Did you catch this issue by running a Jenkins job? If yes, please list:
1. Jenkins job:

2. Profile:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Comment 6 Colin Walters 2022-04-13 17:50:26 UTC
MachineConfigs have a set of kernel arguments provided.  Today the MCO manages updates to kernel arguments by (whenever there's a change):

- remove all kargs present in old config
- add all kargs present in new config

(It could do a diff, but it's simpler to just hand it all off to rpm-ostree, which handles the idempotency aspect of --remove=foo --add=foo)
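The remove-then-add flow above can be sketched in shell (a hypothetical illustration, not the MCO's actual code; `OLD_KARGS` and `NEW_KARGS` are placeholders for the kernelArguments lists of the old and new rendered MachineConfigs):

```shell
#!/bin/sh
# Hypothetical sketch of the MCO's karg reconciliation, not the real implementation.
# OLD_KARGS / NEW_KARGS stand in for the kernelArguments of the two rendered configs.
OLD_KARGS="skew_tick=1"
NEW_KARGS="skew_tick=1 enforcing=0"

args=""
for k in $OLD_KARGS; do args="$args --delete=$k"; done   # drop every karg the old config added
for k in $NEW_KARGS; do args="$args --append=$k"; done   # add every karg the new config wants
# rpm-ostree itself handles the idempotency of --delete=foo --append=foo.
echo rpm-ostree kargs$args
```

Note that `--delete` on an argument that is not actually present on the node fails, which is exactly the mismatch this bug surfaces.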

Now somehow, the node is missing a kernel argument that the MCO thought should be present in the previous configuration.

There are a few possibilities.  One is that the kernel arguments were manually edited.  Another is that we have a "torn configuration" (xref https://github.com/openshift/machine-config-operator/issues/1190 )
where the MCO thought it applied a change but on reboot was still in the old config.  Although I think we should have hit a validation error before trying the update.

It *might* work to manually run:

```
rpm-ostree kargs --append=skew_tick=1
```

(So that a later MCO change would undo it)

If that doesn't work, I think a workaround would be to manually edit the kargs files:

```
$ unshare -m 
$ mount -o remount,rw /boot
$ vi /boot/loader/entries/$config
```

And inject skew_tick=1 there, then `systemctl restart rpm-ostreed`.

Comment 7 Jonathan Lebon 2022-04-13 18:19:42 UTC
(In reply to Colin Walters from comment #6)

Is another possibility here to relax the MCO check so that it doesn't barf if any kargs from the old config are missing? (I.e. use `--delete-if-present`)
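A relaxed invocation along those lines would swap `--delete` for `--delete-if-present`, so a karg already removed by hand no longer fails the transaction (a sketch; skew_tick=1 is just the example value from this case):

```shell
#!/bin/sh
# Relaxed variant of the reconciliation: --delete-if-present skips kargs that
# are already gone instead of erroring, so a manually edited node still converges.
cmd="rpm-ostree kargs --delete-if-present=skew_tick=1 --append=skew_tick=1"
echo "$cmd"
```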

Comment 8 Ben Nemec 2022-04-13 20:14:26 UTC
This is being looked at by the core MCO team since it's not specific to the baremetal platform. Changing the subcomponent to reflect that.

Comment 29 Colin Walters 2022-04-20 21:49:13 UTC
> **Why aren't we in state = Done?  This needs investigation.**

While it won't help at all immediately,
https://github.com/openshift/machine-config-operator/pull/3102
would help debug this in the future.

Comment 34 Colin Walters 2022-04-21 13:53:44 UTC
[There's no customer specific information in this sub-thread, switching to public comments where possible to aid future searches and just in the interest of default-to-public where possible]

> These issues smell like: https://bugzilla.redhat.com/show_bug.cgi?id=2024682

Yes, that looks exactly like *a* problem going on here.  Now, I do still think there's an MCO bug here, and as I mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=2075126#c29 I think it may be related to a potential race condition in the Done state flag.  I think the NTO flapping definitely makes it *much* more likely to provoke this bug though.

That said, there are also ostree-finalize-staged failures related to SELinux which may be the 3rd leg of this failure scenario, and I think it may be the confluence of all 3 of these issues that are impacting reliability.

Comment 52 Colin Walters 2022-04-27 22:14:32 UTC
Since we have information on this scattered around, I just want to explicitly note here that this issue appears to have been triggered by someone (or something) doing `chattr +i /etc/resolv.conf`.  That ultimately breaks ostree, but in a currently hard to detect way.  A code change already landed to help make seeing this *much* more visible: https://github.com/ostreedev/ostree/pull/2589 

We will be working to ensure that the MCO rolls up these types of failures and makes them much more visible.

In the meantime...the workaround is to remove the immutable bit, and the key fix is to find out what has been changing it and make it stop.  (The immutable bit on content in /etc clashes with how ostree maintains two copies of /etc; ultimately it's not going to work)
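Checking for and clearing the bit from a root shell on the node looks like this (a sketch of the workaround described above; an `i` in the lsattr flags field marks an immutable file):

```shell
# Show the attribute flags; an 'i' means the immutable bit is set.
lsattr /etc/resolv.conf
# Clear the bit. This removes the symptom only; whatever keeps setting it
# must still be found and stopped, or ostree will break again.
chattr -i /etc/resolv.conf
```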

Marking this as NEEDINFO pending further diagnosis of what is setting that bit.  

In the meantime, I do consider this case a very bad failure; this was unnecessarily hard to diagnose.  Work will continue to land to make this easier to debug in the future.

Comment 53 brian 2022-05-04 18:59:17 UTC
The immutable bit was only found to be set on one node in one cluster. The several other nodes that data has been collected from did not have the immutable bit set, and the workaround worked as expected. We shouldn't require any more data for this.

Comment 55 Colin Walters 2022-05-05 20:31:03 UTC
There's a *lot* going on in this overall case; multiple interacting bugs.

I'm going to take this specific bugzilla to track https://github.com/openshift/machine-config-operator/pull/3105
which is on track to ship in 4.11.

As I said in the PR I feel this is likely to surface other deployments that have had similar issues, and I'm wary of trying to backport that specific patch really far back right now.

I don't feel that we ever got to one single root cause. We've found the immutable bit in some cases on /etc/resolv.conf, and we saw some failures related to /etc/selinux on another node. Unfortunately, these unmanaged changes to /etc are going to trigger ostree failures; there are things we could do to improve this, but ultimately, to correctly manage upgrades, we can't support things like the immutable bit on files in /etc.

We will work on continuing to make these failures much (much) more obvious and easy to discover.

If we find other things to improve on the MCO specifically, let's file new bugs.

Comment 56 Colin Walters 2022-05-05 21:06:25 UTC
I tested this systemd unit:

```
$ cat /etc/systemd/system/save-ostree-logs.service
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'journalctl -u ostree-finalize-staged > /var/lib/save-ostree-logs/log'
StateDirectory=save-ostree-logs

[Install]
WantedBy=multi-user.target
```

That could be deployed via a MachineConfig or via ssh manually.

It's basically a way to defeat systemd's automatic log rotation for just that one service.

Again we're effectively doing this upstream soon in https://github.com/ostreedev/ostree/pull/2589
in a better way, but this will help us gather the data now.
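For the manual ssh route, rollout could look roughly like this (a sketch; the node name is illustrative and `core` is the usual RHCOS login, adjust as needed — a MachineConfig is the managed alternative):

```shell
# Copy the unit to a node and enable it so the next boot snapshots the
# ostree-finalize-staged journal before rotation can discard it.
scp save-ostree-logs.service core@<node>:/tmp/
ssh core@<node> 'sudo install -m 0644 /tmp/save-ostree-logs.service /etc/systemd/system/ \
  && sudo systemctl daemon-reload \
  && sudo systemctl enable save-ostree-logs.service'
```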

Comment 62 Sinny Kumari 2022-10-24 13:22:02 UTC
All the customer cases attached to this bug have been resolved.

As an outcome of this bug (mentioned in comment#55), we are tracking the debugging fix made in the MCO with PR https://github.com/openshift/machine-config-operator/pull/3105 . The title of this bug has been updated accordingly.

Since MCO PR#3105 has been merged in OCP 4.11, moving this bug to ON_QA.
QA can reproduce the bug as described in https://github.com/openshift/machine-config-operator/pull/3105#issuecomment-1106941040 and, to verify, the MCD pod log should report a karg mismatch error.

@brian and others, once this bug has been verified, it should be possible to backport PR https://github.com/openshift/machine-config-operator/pull/3105 . Let us know if backporting is needed and to what OCP version.

Comment 64 Rio Liu 2022-11-03 08:06:06 UTC
verified on 4.11.0-0.nightly-2022-11-02-234505

follow the instructions from https://github.com/openshift/machine-config-operator/pull/3105#issuecomment-1106941040

oc debug node/ip-10-0-133-99.ec2.internal -- chroot /host unshare -m /bin/sh -c 'mount -o remount,rw /boot && chattr +i /boot'
Temporary namespace openshift-debug-q7kqt is created for debugging node...
Starting pod/ip-10-0-133-99ec2internal-debug ...
To use host binaries, run `chroot /host`

Removing debug pod ...
Temporary namespace openshift-debug-q7kqt was removed.

> create mc to apply a karg

cat change-worker-kernelarg-selinux.yaml

```
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: change-worker-kernelarg-selinux
spec:
  config:
    ignition:
      version: 3.2.0
  kernelArguments:
    - enforcing=0
```

oc apply -f change-worker-kernelarg-selinux.yaml
machineconfig.machineconfiguration.openshift.io/change-worker-kernelarg-selinux created

the injected worker is degraded

anno ip-10-0-133-99.ec2.internal
cloud.network.openshift.io/egress-ipconfig: '[{"interface":"eni-0f91a1812aee1534c","ifaddr":{"ipv4":"10.0.128.0/18"},"capacity":{"ipv4":14,"ipv6":15}}]'
csi.volume.kubernetes.io/nodeid: '{"ebs.csi.aws.com":"i-0fe60dd45a25d6579"}'
machine.openshift.io/machine: openshift-machine-api/ci-ln-g02329b-76ef8-zcbjn-worker-us-east-1d-66zjr
machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
machineconfiguration.openshift.io/currentConfig: rendered-worker-fa2a83b7aa4374fc6d517a8174f0f770
machineconfiguration.openshift.io/desiredConfig: rendered-worker-d037cf8c86ad05ed5c29b874cd345dbc
machineconfiguration.openshift.io/desiredDrain: drain-rendered-worker-d037cf8c86ad05ed5c29b874cd345dbc
machineconfiguration.openshift.io/lastAppliedDrain: drain-rendered-worker-d037cf8c86ad05ed5c29b874cd345dbc
machineconfiguration.openshift.io/reason: 'unexpected on-disk state validating against
  rendered-worker-d037cf8c86ad05ed5c29b874cd345dbc: Missing expected kernel arguments:
  [enforcing=0]'
machineconfiguration.openshift.io/state: Degraded

sh-4.4# journalctl -b -1 -u ostree-finalize-staged
-- Logs begin at Thu 2022-11-03 07:21:14 UTC, end at Thu 2022-11-03 08:05:23 UTC. --
Nov 03 08:00:11 ip-10-0-133-99 systemd[1]: Started OSTree Finalize Staged Deployment.
Nov 03 08:00:32 ip-10-0-133-99 systemd[1]: Stopping OSTree Finalize Staged Deployment...
Nov 03 08:00:32 ip-10-0-133-99 ostree[29195]: Finalizing staged deployment
Nov 03 08:00:34 ip-10-0-133-99 ostree[29195]: Copying /etc changes: 14 modified, 0 removed, 193 added
Nov 03 08:00:34 ip-10-0-133-99 ostree[29195]: Copying /etc changes: 14 modified, 0 removed, 193 added
Nov 03 08:00:34 ip-10-0-133-99 ostree[29195]: error: mkdir(boot/loader.0): Operation not permitted
Nov 03 08:00:34 ip-10-0-133-99 systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited status=1
Nov 03 08:00:34 ip-10-0-133-99 systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
Nov 03 08:00:34 ip-10-0-133-99 systemd[1]: Stopped OSTree Finalize Staged Deployment.
Nov 03 08:00:34 ip-10-0-133-99 systemd[1]: ostree-finalize-staged.service: Consumed 368ms CPU time

Comment 65 Rio Liu 2022-11-03 08:40:08 UTC
mcp worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-fa2a83b7aa4374fc6d517a8174f0f770   False     True       True       3              2                   2                     1                      81m

Comment 67 Rio Liu 2022-11-03 10:00:15 UTC
Steps to recover the env

> unlock the read-only file system
make the /boot volume writable via: chroot /host unshare -m /bin/sh -c 'mount -o remount,rw /boot && chattr -i /boot'
> apply the missed karg manually
rpm-ostree kargs --append=enforcing=0
> reapply the config
touch a force file to reapply the current config
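Put together, the recovery from a root shell on the node looks roughly like this (a sketch; /run/machine-config-daemon-force is the force file the MCD checks in current releases — verify the path for your version):

```shell
# 1. Make /boot writable again and drop the immutable bit set for the test.
unshare -m /bin/sh -c 'mount -o remount,rw /boot && chattr -i /boot'
# 2. Re-add the karg the MCD reported as missing.
rpm-ostree kargs --append=enforcing=0
# 3. Ask the MCD to revalidate and reapply the current config (path assumed).
touch /run/machine-config-daemon-force
```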

Comment 74 errata-xmlrpc 2022-11-24 04:14:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.11.16 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8535

Comment 75 Red Hat Bugzilla 2023-09-18 04:35:28 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

