Bug 1953493 - worker pool went degraded due to no rpm-ostree on rhel worker during applying new mc
Summary: worker pool went degraded due to no rpm-ostree on rhel worker during applying...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Sinny Kumari
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On: 1953475
Blocks:
Reported: 2021-04-26 09:04 UTC by Sinny Kumari
Modified: 2021-06-15 19:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1952368
Environment:
Last Closed: 2021-06-15 19:30:17 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 2545 | 0 | None | open | Bug 1953493: daemon: return nil for unsupported operation on an OS | 2021-06-03 16:16:06 UTC
Red Hat Product Errata | RHBA-2021:2267 | 0 | None | None | None | 2021-06-15 19:30:28 UTC

Comment 3 Michael Nguyen 2021-06-07 19:35:54 UTC
Verified on 4.6.0-0.nightly-2021-06-07-054625


$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2021-06-07-054625   True        False         3h30m   Cluster version is 4.6.0-0.nightly-2021-06-07-054625
$ cat trifecta.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: worker-extensions-usbguard
spec:
  config:
    ignition:
      version: 3.1.0
  extensions:
    - usbguard
  kernelType: realtime
  kernelArguments:
  - 'z=10'
$ oc create -f trifecta.yaml 
machineconfig.machineconfiguration.openshift.io/worker-extensions-usbguard created
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
00-worker                                          fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-master-container-runtime                        fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-master-kubelet                                  fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-worker-container-runtime                        fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
01-worker-kubelet                                  fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
99-master-generated-registries                     fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
99-master-ssh                                                                                 3.1.0             3h57m
99-worker-generated-registries                     fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
99-worker-ssh                                                                                 3.1.0             3h57m
rendered-master-ceb65787dcbeeef1c6f9ea03fc92cc83   fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
rendered-worker-d5b114ebbaaa54d7165c9405d6735619   fc95c0aa903c4269eea4128d8a423747138ea4be   3.1.0             3h52m
worker-extensions-usbguard                                                                    3.1.0             3s
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-d5b114ebbaaa54d7165c9405d6735619   True      False      False      3              3                   3                     0                      3h54m
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-d5b114ebbaaa54d7165c9405d6735619   False     True       False      3              0                   0                     0                      3h54m
$ oc get nodes
NAME                                        STATUS                     ROLES    AGE     VERSION
ip-10-0-52-43.us-east-2.compute.internal    Ready,SchedulingDisabled   worker   3h47m   v1.19.0+c3e2e69
ip-10-0-53-65.us-east-2.compute.internal    Ready                      master   3h56m   v1.19.0+c3e2e69
ip-10-0-58-166.us-east-2.compute.internal   Ready                      worker   13m     v1.19.0+c3e2e69
ip-10-0-63-175.us-east-2.compute.internal   Ready                      master   3h55m   v1.19.0+c3e2e69
ip-10-0-76-20.us-east-2.compute.internal    Ready                      worker   3h45m   v1.19.0+c3e2e69
ip-10-0-79-202.us-east-2.compute.internal   Ready                      master   3h55m   v1.19.0+c3e2e69
$ watch oc get nodes
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-1f5748ea8c6414f92bf9dd7f3942eebb   True      False      False      3              3                   3                     0                      4h9m
$ oc debug node/ip-10-0-52-43.us-east-2.compute.internal
Starting pod/ip-10-0-52-43us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm -qa | grep kernel
kernel-rt-modules-4.18.0-193.56.1.rt13.106.el8_2.x86_64
kernel-rt-modules-extra-4.18.0-193.56.1.rt13.106.el8_2.x86_64
kernel-rt-core-4.18.0-193.56.1.rt13.106.el8_2.x86_64
kernel-rt-kvm-4.18.0-193.56.1.rt13.106.el8_2.x86_64
sh-4.4# uname -a
Linux ip-10-0-52-43 4.18.0-193.56.1.rt13.106.el8_2.x86_64 #1 SMP PREEMPT RT Wed May 12 16:10:12 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
sh-4.4# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-a8217562a29572884d0ec056426d6e19ab6f4c0c2b1d339f320c0c3878e640c0/vmlinuz-4.18.0-193.56.1.rt13.106.el8_2.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/a8217562a29572884d0ec056426d6e19ab6f4c0c2b1d339f320c0c3878e640c0/0 ignition.platform.id=aws z=10
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc debug node/ip-10-0-58-166.us-east-2.compute.internal
Starting pod/ip-10-0-58-166us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.2# rpm -qa | grep kernel
kernel-tools-3.10.0-1127.el7.x86_64
kernel-tools-libs-3.10.0-1127.el7.x86_64
kernel-3.10.0-1160.25.1.el7.x86_64
kernel-3.10.0-1127.el7.x86_64
sh-4.2# uname -a
Linux ip-10-0-58-166.us-east-2.compute.internal 3.10.0-1160.25.1.el7.x86_64 #1 SMP Tue Apr 13 18:55:45 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
sh-4.2# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.10.0-1160.25.1.el7.x86_64 root=UUID=5a000634-a1fc-467d-8ef4-5fcf5dbc6033 ro console=ttyS0,115200n8 console=tty0 net.ifnames=0 rd.blacklist=nouveau nvme_core.io_timeout=4294967295 crashkernel=auto LANG=en_US.UTF-8
sh-4.2# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
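
The manual checks in the transcript above (reading the UPDATED/UPDATING/DEGRADED columns of `oc get mcp/worker`, and looking for `z=10` in `/proc/cmdline`) could be scripted roughly as follows. This is only a sketch: `pool_settled` and `has_karg` are hypothetical helper names, not part of `oc` or the Machine Config Operator, and the column positions are assumed to match the `oc get mcp` output shown in this comment.

```shell
#!/bin/sh
# Hypothetical helpers for repeating the verification; the names are
# illustrative and do not come from oc or the MCO.

# pool_settled ROW
# ROW is one data row of `oc get mcp/<pool>` output, with columns in the
# order shown in the transcript: NAME CONFIG UPDATED UPDATING DEGRADED ...
# Succeeds only when UPDATED=True, UPDATING=False and DEGRADED=False.
pool_settled() {
    # Intentional word splitting: the row is whitespace-separated.
    set -- $1
    [ "$3" = "True" ] && [ "$4" = "False" ] && [ "$5" = "False" ]
}

# has_karg CMDLINE ARG
# Succeeds when ARG appears as a whole space-separated token in CMDLINE,
# so "z=10" does not match "z=100".
has_karg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a node (after `chroot /host`), a call such as `has_karg "$(cat /proc/cmdline)" z=10` would be one way to use it; the command substitution strips the trailing newline. In the transcript, the RHCOS node's command line contains `z=10`, while the RHEL 7 node's does not and the pool still reports DEGRADED=False.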

Comment 5 errata-xmlrpc 2021-06-15 19:30:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.34 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2267

